Episode 180

October 27, 2025

00:22:11

180: Leveraging Global Genetics Resources for Equitable Polygenic Prediction

Hosted by

Gustavo B Barra
Base by Base

Show Notes

Episode 180: Leveraging Global Genetics Resources for Equitable Polygenic Prediction

In this episode of PaperCast Base by Base, we explore how multi-ancestry genome-wide association study resources and modern polygenic score methodologies can improve prediction accuracy across African, East Asian, and European populations, with a focus on practical, computationally efficient strategies that work even when individual-level data are unavailable.

Study Highlights:
This article systematically benchmarks leading single-source and multi-source polygenic score methods across 10 complex traits using GWAS summary statistics from the Ugandan Genome Resource, Biobank Japan, the UK Biobank, and the Million Veteran Program. The authors show that combining ancestry-aligned and European GWAS improves prediction in non-European targets and that independently optimized multi-source approaches often outperform jointly optimized methods while being far more computationally efficient. They introduce a generalizable use of the LEOPARD framework to estimate optimal linear combinations of population-specific scores using only summary statistics, achieving performance comparable to individual-level tuning in many settings. All methods are implemented in the GenoPred pipeline, providing an accessible, reference-standardized workflow for equitable polygenic prediction across diverse populations.
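The accuracy gains described above and quoted later in the episode are relative improvements. As a rough, hypothetical illustration only (assuming accuracy is summarized as the correlation between score and observed trait; the paper may report a related metric, and the example values below are made up, not results from the study), such a comparison could be computed like this:

```python
# Hypothetical helpers for comparing PGS methods across target populations.
import numpy as np

def prediction_r(pgs: np.ndarray, phenotype: np.ndarray) -> float:
    """Pearson correlation between a polygenic score and the observed trait."""
    return float(np.corrcoef(pgs, phenotype)[0, 1])

def relative_improvement(r_multi: float, r_single: float) -> float:
    """Percent improvement of a multi-source over a single-source PGS."""
    return 100.0 * (r_multi - r_single) / r_single

# Toy usage with invented correlations, not values from the paper:
print(relative_improvement(r_multi=0.28, r_single=0.23))  # ~21.7% relative gain
```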

Conclusion:
Multi-source, summary-statistics–friendly approaches implemented in GenoPred offer a practical path to more accurate and equitable polygenic prediction, particularly when leveraging diverse GWAS resources and efficient tuning frameworks like LEOPARD.

Reference:
Pain O. Leveraging global genetics resources to enhance polygenic prediction across ancestrally diverse populations. Human Genetics and Genomics Advances. 2025;6:100482. https://doi.org/10.1016/j.xhgg.2025.100482

License:
This episode is based on an open-access article published under the Creative Commons Attribution 4.0 International License (CC BY 4.0) – https://creativecommons.org/licenses/by/4.0/

Support:
If you'd like to support Base by Base, you can make a one-time or monthly donation here: https://basebybase.castos.com/

Chapters

  • (00:00:14) - Placing genetic science to better health
  • (00:02:34) - PGS: The science of genetics
  • (00:06:31) - The comparative methods of ancestry science
  • (00:07:43) - Sumstat Tune: The genomic precision benchmark
  • (00:10:06) - LDpred2: Multi-Source Analysis
  • (00:11:30) - EUR vs AFR GWAS: The Size Paradox
  • (00:13:03) - The Best Multi-Source PGS
  • (00:17:09) - The Fight for Equitable Genomic Prediction
  • (00:21:14) - Personalized medicine in the cloud

Episode Transcript

[00:00:14] Speaker A: Welcome to Base by Base, the papercast that brings genomics to you wherever you are. I want to start today with a powerful idea. Personalized medicine. We often talk about using an individual's DNA to predict future disease risk for things like heart conditions or high cholesterol using tools called polygenic scores, or PGSs. [00:00:35] Speaker B: They're essentially like genetic crystal balls trying to look into the future of someone's health based on their genes. [00:00:41] Speaker A: Exactly. But here's the problem right now. That crystal ball, it's... well, it's clearest for people of European ancestry. [00:00:49] Speaker B: Yeah. For pretty much everyone else, it's often blurry or sometimes, frankly, unreliable. [00:00:54] Speaker A: The core issue here, it's systemic, isn't it? The vast majority of genetic information used to build these really powerful PGS tools comes from individuals of European ancestry. EUR, as they're often called. [00:01:05] Speaker B: Right. And the consequences, when we try to apply these predictors to non-EUR groups, like populations with African or East Asian ancestry, the accuracy just drops, sometimes substantially. [00:01:15] Speaker A: Which can render them almost useless in a clinical setting. For those groups, it's a huge disparity, a massive roadblock if we're aiming for genuine global health equity. [00:01:25] Speaker B: Absolutely. It really undermines the personalized aspect of personalized medicine if it only works well for one group. [00:01:32] Speaker A: So, okay, let's unpack this. Can we actually harness the, you know, the rapidly growing global genetic data sets, things emerging from places like Uganda or Japan? Can we use that data to fix this predictive gap? [00:01:48] Speaker B: That's the key question. And maybe just as importantly, can we do it efficiently? Because access to certain types of data is a real challenge. [00:01:55] Speaker A: Right. So today we're diving into some really ingenious new methods that are trying to close this ancestral gap in prediction accuracy, maybe without needing all that restricted, hard-to-access data. [00:02:05] Speaker B: And on that note, today we should really celebrate the crucial work of Oliver Pain and his team. They're based at the Maurice Wohl Clinical Neuroscience Institute at King's College London. Their research is, I think, fundamentally advancing our understanding of how we can leverage these global genetics resources to really enhance polygenic prediction across ancestrally diverse populations. It's vital work. [00:02:28] Speaker A: Definitely vital. Okay, so before we get into their specific innovative approach, let's set the stage a bit more. You mentioned PGSs, these genetic crystal balls. [00:02:37] Speaker B: Yeah. [00:02:37] Speaker A: But let's talk utility. What exactly are they and why do we actually rely on them so much? [00:02:42] Speaker B: Sure. So PGSs basically aggregate the tiny effects of often thousands, maybe even millions of genetic variants all across the genome. Okay. And they use this to estimate an individual's overall genetic predisposition for complex traits like height or diseases like schizophrenia or heart disease. Think of it like a massive weighted sum where each variant contributes a little bit. [00:03:04] Speaker A: Right, a weighted sum. And this genetic risk stratification. What's the point? [00:03:08] Speaker B: Well, it's potentially vital for proactive prevention.
You know, identifying people who are at high risk before a disease actually strikes. [00:03:15] Speaker A: Ah, okay. Early warning. [00:03:16] Speaker B: Exactly. And also for tailoring targeted treatments. If you know someone has a high genetic risk, maybe you monitor them more closely or suggest different lifestyle changes. [00:03:27] Speaker A: That makes sense. The promise is huge. But as we said, it hits this wall, this central problem. The overwhelming overrepresentation of EUR ancestry in the foundational studies. [00:03:40] Speaker B: The GWAS. Right. Genome-wide association studies. [00:03:43] Speaker A: That's right. GWASs are the studies we use to figure out the effect size, or the weight as I called it, for each specific genetic variant included in the score. [00:03:52] Speaker B: So if the input data is skewed. [00:03:54] Speaker A: Towards one ancestry, then the resulting prediction model is inevitably skewed too. If you build a score mainly using data from EUR populations and then try to apply it to, say, African, or AFR, populations or East Asian, EAS, populations, the predictive power just drops significantly. [00:04:10] Speaker B: And we're not talking about a small statistical quirk here. [00:04:12] Speaker A: No, absolutely not. It means we are genuinely reducing the clinical usefulness of these scores for large parts of the world's population. And that actively increases global health inequality. It's a serious issue. Okay. So to fix this, researchers have been developing different kinds of PGS methods. You mentioned two main categories. [00:04:30] Speaker B: Yeah, broadly speaking. The first is what we call single-source methods. These are, well, relatively straightforward conceptually. They derive all the variant weights from just one big GWAS. So you might use a method like, say, PRS-CS run purely on data from the UK Biobank. One source population. [00:04:51] Speaker A: Got it. Simple enough. And the second category, that's where things. [00:04:55] Speaker B: get more interesting and potentially more equitable. These are multi-source methods. [00:04:59] Speaker A: Multi-source, okay. [00:05:00] Speaker B: As the name suggests, these try to incorporate information from multiple GWASs, often from different populations. The goal is to build a more robust, more globally relevant prediction score. [00:05:11] Speaker A: That sounds like the way forward, right? Combine the data. [00:05:13] Speaker B: It does. But this is where a major technical hurdle pops up. To get the best results from these multi-source PGSs, you often need to do something called tuning. [00:05:22] Speaker A: Tuning? Like tuning an instrument? [00:05:24] Speaker B: Sort of, yeah. Tuning here involves linearly combining those different population-specific scores. So you might have one score derived from EUR data, another from AFR data. Tuning means finding the mathematically optimal recipe for mixing them together for the specific target population you're interested in. [00:05:43] Speaker A: Okay, finding the best blend. Why is that a hurdle? [00:05:46] Speaker B: Because traditionally, finding that optimal recipe, that best blend, requires access to individual-level genotype and phenotype data from the target population. [00:05:56] Speaker A: So if you want to tune a score for, say, an African population, you need the actual genetic and health records for a sample of individuals from that population. [00:06:05] Speaker B: Precisely. And that kind of individual-level data is often just not available.
It might be restricted due to privacy regulations, ethical concerns, or sometimes it just doesn't exist in a readily usable format. [00:06:16] Speaker A: Right, so that creates a huge practical barrier to actually implementing these potentially better multi-source scores around the world. [00:06:23] Speaker B: Exactly. The traditional multi-source methods are powerful in theory, but they're often crippled by these very real data access issues in practice. [00:06:31] Speaker A: Okay, so that brings us to the study we're focusing on. The mission then was pretty clear. Let's systematically compare all the current major methods, both single-source and multi-source, across different ancestries. [00:06:43] Speaker B: Yeah, they looked at AFR, EAS and EUR ancestries and they did this for 10 different complex traits, things like BMI, height, cholesterol levels, stuff like that. [00:06:53] Speaker A: The goal being to see which method is truly the best and maybe crucially the most scalable, the most practical, given those data access problems. [00:07:00] Speaker B: Precisely. They set up a really rigorous comparison. For their African ancestry data, they used GWAS summary statistics. That's the key. Summary statistics, not individual data, from the Ugandan Genome Resource, the UGR. [00:07:13] Speaker A: Okay. [00:07:13] Speaker B: UGR for AFR. For East Asian ancestry, EAS, they used Biobank Japan, or BBJ. [00:07:19] Speaker A: BBJ for EAS. [00:07:21] Speaker B: Got it. And for the sort of standard European ancestry comparison, they used the UK Biobank, UKB. [00:07:26] Speaker A: UKB for EUR. [00:07:27] Speaker B: And they also did a clever sort of sensitivity analysis using data from the huge Million Veteran Program, or MVP, in the US. This let them test their methods against a much, much larger AFR dataset as well to see how sample size played a role. [00:07:43] Speaker A: Smart. So what methods did they actually throw into this comparison? [00:07:47] Speaker B: It was quite comprehensive. They tested a total of 12 methods. Eight were these classic single-source approaches, including some well-established ones like LDpred2 and SBayesRC. [00:07:57] Speaker A: Okay, the standard single-source tools. [00:07:59] Speaker B: And the other four were these cutting-edge multi-source methods like PRS-CSx and X-Wing, which are designed to combine information across ancestries. [00:08:07] Speaker A: Right, but you mentioned innovation, something that tackles that tuning problem. [00:08:11] Speaker B: Yes, exactly. This is where their key innovation lies. The real aha moment that potentially solves that tuning problem we discussed. [00:08:19] Speaker A: Okay, tell me more. [00:08:20] Speaker B: They recognized that some of these computationally intensive multi-source methods, specifically one called X-Wing, actually contain within them a really clever component called LEOPARD. [00:08:29] Speaker A: LEOPARD, like the animal? [00:08:31] Speaker B: Haha, yeah, I think so. LEOPARD is designed to estimate optimal tuning parameters as part of that bigger X-Wing process. Oh, okay. So what this team did was kind of brilliant. They decided to essentially pull LEOPARD out of that complex, computationally heavy X-Wing pipeline and apply it to a much simpler approach, just linearly combining independently optimized single-source PGSs. So you make the best possible AFR score, the best EAS score, the best EUR score separately first. [00:09:03] Speaker A: Okay, make the parts separately.
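To make the two-step idea the hosts are describing concrete, here is a minimal, hypothetical Python sketch (not code from the paper or the GenoPred pipeline, and all names and values are illustrative): each ancestry-specific PGS is a weighted sum of genotype dosages, and "tuning" the multi-source score just means estimating the linear mixing weights. The sketch does this the traditional way, with individual-level tuning data; LEOPARD and Sumstat Tune, discussed next, estimate comparable weights from summary statistics alone.

```python
# Minimal, illustrative sketch: build ancestry-specific PGSs as weighted sums of
# genotype dosages, then "tune" a linear combination against a tuning sample.
import numpy as np

rng = np.random.default_rng(0)
n_people, n_variants = 500, 1_000

# 0/1/2 allele counts; in practice these come from target-sample genotype data.
genotypes = rng.integers(0, 3, size=(n_people, n_variants)).astype(float)

# Per-variant weights, e.g. as produced by LDpred2 or SBayesRC from a EUR and an AFR GWAS.
weights_eur = rng.normal(0.0, 0.01, n_variants)
weights_afr = rng.normal(0.0, 0.01, n_variants)

# Each PGS is just a weighted sum of variant dosages.
pgs_eur = genotypes @ weights_eur
pgs_afr = genotypes @ weights_afr

# Toy phenotype for the tuning sample (made up for illustration only).
phenotype = 0.3 * pgs_eur + 0.7 * pgs_afr + rng.normal(0.0, 1.0, n_people)

# Tuning = estimating the mixing weights. Here we use individual-level tuning data
# (the traditional requirement); LEOPARD / Sumstat Tune estimates comparable weights
# from summary statistics alone, which is not shown here.
scores = np.column_stack([pgs_eur - pgs_eur.mean(), pgs_afr - pgs_afr.mean()])
target = phenotype - phenotype.mean()
(w_eur, w_afr), *_ = np.linalg.lstsq(scores, target, rcond=None)

pgs_multi = w_eur * pgs_eur + w_afr * pgs_afr
print(f"estimated mixing weights: EUR={w_eur:.2f}, AFR={w_afr:.2f}")
```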
[00:09:05] Speaker B: Then you use LEOPARD paired with an efficient calculation method called QuickPRS to figure out the best way to mix those already optimized scores together. And the key, the key is that this novel combination, which they called Sumstat Tune, achieves that optimal mixing, that tuning, using only summary statistics. [00:09:22] Speaker A: Ah, no individual-level data needed for the tuning step. [00:09:25] Speaker B: Exactly. It completely bypasses the need for that restricted individual-level data. You just need the summary results from the different GWASs. [00:09:32] Speaker A: Wow. Okay. That is the technical breakthrough in practice then. Sumstat Tune lets researchers blend the best existing scores from different ancestry sources without needing a single patient's private data from the target group. As long as they have the summary stats. [00:09:47] Speaker B: Precisely. It's a massive step towards, you know, democratizing genomic prediction, making these advanced methods usable by more people, more labs. [00:09:56] Speaker A: And they implemented this whole system using something called the GenoPred pipeline. [00:10:00] Speaker B: That's right, an open-source pipeline, which makes it even more accessible for other researchers to use and build upon. [00:10:06] Speaker A: Okay, this is where it gets really interesting. The results, what did combining sources actually reveal about prediction accuracy? Especially, you know, for the historically underserved populations, AFR and EAS? [00:10:19] Speaker B: The results were, I'd say, pretty definitive on one point. Multi-source approaches are basically non-negotiable if you want equitable prediction. [00:10:27] Speaker A: Non-negotiable. Wow. [00:10:28] Speaker B: Yeah. They significantly improved accuracy over any single-source method when applied to the non-EUR target populations. We're talking about major percentage gains here. [00:10:37] Speaker A: Can you give an example how much better? [00:10:38] Speaker B: Sure. So for instance, taking a good single-source method like LDpred2 and making it multi-source, so LDpred2 multi, using data from multiple ancestries, improved prediction accuracy by about 21.8% in the African target populations. [00:10:54] Speaker A: Wow. Over 20% better. [00:10:55] Speaker B: And by 14.0% in the East Asian populations. That's compared to just using the standard single-source LDpred2 method trained only on the non-EUR population data itself. [00:11:06] Speaker A: Think about that. A nearly 22% gain in accuracy in African populations. That could translate directly into preventing misclassification for thousands, maybe millions of patients down the line. [00:11:17] Speaker B: Absolutely. It's a substantial improvement in predictive power. [00:11:20] Speaker A: But you mentioned the Million Veteran Program data. Did that reveal anything else? There's always a but, isn't there? [00:11:26] Speaker B: Well, there's a nuance, definitely. That 21.8% improvement is huge. But we have to talk about what they called the size paradox. [00:11:32] Speaker A: The size paradox. Okay, what's that? [00:11:34] Speaker B: It's this finding in the AFR target, specifically when they were using the GWAS summary stats from the Ugandan Genome Resource, which is, relatively speaking, a smaller study. [00:11:44] Speaker A: Compared to the UK Biobank. [00:11:45] Speaker B: Exactly. In that scenario, the massive EUR-trained PGSs still often outperform the ancestry-aligned AFR-trained PGSs.
Even the single-source ones built just on UGR data. [00:11:58] Speaker A: Wait, so the score built on European data worked better for the African target group than the score built on African data from Uganda. How does that work? [00:12:06] Speaker B: It boils down to the sheer volume of data. Statistical power. [00:12:10] Speaker A: Ah, sample size. [00:12:11] Speaker B: Yep. The UK Biobank has data on hundreds of thousands, moving towards millions, of individuals. The UGR, while incredibly valuable and ancestrally relevant, is much smaller. [00:12:22] Speaker A: So the raw statistical power from the huge EUR study could kind of overwhelm the benefit of using the more appropriate but smaller AFR study data. [00:12:31] Speaker B: That seems to be what's happening in some cases, especially for traits where the genetic architecture might be more shared across populations. It underscores a hard truth for researchers. While matching ancestry is theoretically better, if your ancestry-matched GWAS is small, the sheer statistical might of a large, even if ancestrally different, GWAS can sometimes still win out, at least for now. [00:12:50] Speaker A: So data size still matters a lot. Immensely. [00:12:54] Speaker B: Which really highlights the need for larger GWASs in diverse populations too. [00:12:59] Speaker A: Okay, so multi-source is better overall, but sample size is critical. Now, within those multi-source methods they tested, the complex ones like X-Wing, and the simpler approach of combining independent scores, which one actually delivered the best results in practice? [00:13:14] Speaker B: And here's maybe the biggest surprise of the study. The simpler approach. [00:13:18] Speaker A: Really? The linear combinations of the independently optimized single-source scores. [00:13:23] Speaker B: Yeah, those consistently outperform the current jointly optimized methods. The ones like PRS-CSx and X-Wing that try to model everything together from the start. Huh. [00:13:33] Speaker A: So the best strategy today isn't necessarily to try and mash all the raw data together in one go with these super complex models? [00:13:40] Speaker B: Apparently not. Based on this evidence, it seems it's better to separately derive the highest quality population-specific scores you can first: one optimized for AFR, one for EAS, one for EUR, using the best single-source methods available. [00:13:51] Speaker A: Like SBayesRC or LDpred2. [00:13:54] Speaker B: Exactly. And then use their novel tuning method, Sumstat Tune, that LEOPARD-QuickPRS combo, to blend those high-quality components together optimally using just the summary statistics. [00:14:04] Speaker A: And that's Sumstat Tune's solution, the one that avoids needing individual data. How well did it actually work? Compared to the gold standard of tuning. [00:14:12] Speaker B: With individual data, it was a massive practical success. It performed almost identically, especially when they had access to larger GWAS data, like when they used the Million Veteran Program AFR data for testing. It really proved that you don't necessarily need that restricted individual-level data to get top-tier tuning results for these combined scores. [00:14:31] Speaker A: That's huge for practicality. [00:14:33] Speaker B: It really is. And another key finding that jumped out related to making those good single-source components, the method SBayesRC really stood out. [00:14:41] Speaker A: Oh yeah? How so?
[00:14:43] Speaker B: Particularly when it was applied to the large GWAS datasets, the EUR data from UKB, the EAS data from BBJ and the larger AFR data from MVP, it provided very high accuracy. And crucially, it doesn't require any individual-level tuning data itself to run effectively. [00:15:00] Speaker A: So SBayesRC is a strong performer on its own with big datasets, even before you think about combining scores. [00:15:06] Speaker B: Yes, it seems to be a very robust and accurate choice if you have large summary statistics to work with. [00:15:13] Speaker A: Okay, let's pause and analyze the "so what" here. What does all this mean? These findings seem to provide a sort of validated general framework for building multi-source PGSs, right? [00:15:24] Speaker B: I think so. The mandate seems clear. If we want prediction to be equitable across ancestries, multi-source approaches really have to be the default going forward. We can't just rely on single-ancestry scores anymore. [00:15:37] Speaker A: And the reason those simpler, independently optimized methods are winning right now, it's not just about accuracy, is it? You hinted at efficiency earlier. [00:15:45] Speaker B: Ah yes, efficiency. This is perhaps the biggest hidden breakthrough, or maybe the most impactful finding for practical application, in this entire study. [00:15:54] Speaker A: How much more efficient are we talking? [00:15:55] Speaker B: We are talking substantially more computationally efficient. It's a night-and-day difference. That's their real competitive advantage in the real world, I think. [00:16:02] Speaker A: Okay, give me the numbers. What's the difference in infrastructure required? [00:16:06] Speaker B: Okay, get this. A jointly optimized method like the original X-Wing approach required about 34 hours, 34.1 hours to be precise, to process just one GWAS dataset on their system. [00:16:20] Speaker A: Okay. Over a day of computation per dataset. [00:16:23] Speaker B: Now compare that to their independent approach using LEOPARD plus QuickPRS for the tuning, Sumstat Tune. That took only 14 minutes. [00:16:30] Speaker A: 14 minutes, down from 34 hours. [00:16:32] Speaker B: Yep. From over a day down to basically a coffee break. [00:16:36] Speaker A: That is... that's astonishing. That's transformative, isn't it? [00:16:39] Speaker B: It completely flips the script on practicality. It means that labs with maybe less computational horsepower, perhaps located in the very regions that most need these equitable PGSs, can now actually access and apply these highly accurate multi-source methods. [00:16:53] Speaker A: Right. Without needing to invest in supercomputers or massive cloud computing budgets. [00:16:57] Speaker B: Exactly. This efficiency makes the independent approach, build the best components separately, then blend them smartly with Sumstat Tune, incredibly practical and, frankly, infinitely more scalable for global research efforts. [00:17:09] Speaker A: And this computational advantage, does it suggest something about the state of the methods themselves? [00:17:13] Speaker B: I think it does. It suggests that the current advances in the single-source methods, tools like SBayesRC and the latest versions of LDpred2, have become so sophisticated and effective on. [00:17:24] Speaker A: their own, that simply combining these powerful individually optimized scores linearly is now actually the strongest and fastest strategy. [00:17:33] Speaker B: That seems to be the case. Yes.
At least for now, it's beating the much more time-consuming joint optimization techniques in both accuracy and speed. [00:17:41] Speaker A: Okay, so based on all this hard evidence, can we actually give researchers some clear, practical recommendations? Now, if someone wants to build the best, most equitable PGS possible today, what should they do? [00:17:53] Speaker B: I think we can offer some pretty concrete advice based on these findings. First, if you are working in, say, AFR or EAS populations, and you happen to be in the fortunate position of having that coveted individual-level tuning data. [00:18:05] Speaker A: Available, which is rare but possible. [00:18:07] Speaker B: Right? In that case, the study suggests you should probably run both SBayesRC and LDpred2 to create your ancestry-specific scores and then just empirically select whichever one performs best on your tuning data. Cover your bases. [00:18:18] Speaker A: Okay, test both if you have the tuning data. But what about the more common scenario? [00:18:24] Speaker B: Right? Let's be realistic. If you only have the publicly available summary statistics, which is the reality for most researchers globally, then SBayesRC alone looks like a particularly strong, accurate and robust choice, especially if you're working with large GWAS datasets. It performs well without needing that extra tuning step itself. [00:18:43] Speaker A: So SBayesRC is a good default if you only have summary stats from a large study. And what about combining scores from different ancestries? [00:18:50] Speaker B: And this is the crucial part for equity. If you need to combine scores from different ancestries to maximize that equity and accuracy in your target population, definitely use the LEOPARD-QuickPRS approach. Use Sumstat Tune, because it delivers high efficiency, remember, 14 minutes versus 34 hours, and gives accuracy that's comparable to the methods requiring that restricted individual-level data. That's the key to making multi-source scores actually viable and usable worldwide right now. [00:19:18] Speaker A: Makes sense. Now we should probably acknowledge limitations, right? No study is perfect. [00:19:23] Speaker B: Absolutely. The authors are clear about this too. They focused primarily on a specific subset of well-behaved variants known as HapMap3 variants. And they didn't fully evaluate specific approaches designed for admixed individuals, people with recent ancestry from multiple continents. That's a growing and important area that needs more work. [00:19:41] Speaker A: Right? Admixed populations are a big challenge. [00:19:44] Speaker B: They are. But even with those limitations, the overall takeaway message feels really solid. While these computational fixes like Sumstat Tune are fantastic, truly game-changing for bridging the immediate gap using the data we. [00:19:57] Speaker A: Have, this is a bigger-picture issue. [00:19:58] Speaker B: Yes. The ultimate, fundamental disparity in accuracy across ancestries can only truly be closed by one thing, which is significantly expanding and improving the diversity of the primary, high-quality GWAS datasets globally. We just need more and better data from non-European populations. The better the raw ingredients we feed into these methods, the better the final scores will inevitably become. [00:20:24] Speaker A: So the computational methods are crucial tools, but they can't fully substitute for fixing the foundational data imbalance. [00:20:30] Speaker B: Exactly.
They are necessary but not sufficient on their own in the long run. [00:20:35] Speaker A: Okay, let's try to distill the central insight here then. What's the main take-home message from this deep dive? [00:20:40] Speaker B: I think it's that the quest for equitable genomic prediction is currently making huge strides, perhaps surprisingly, through efficiency and smart computational shortcuts. [00:20:51] Speaker A: Right. Combining these independently optimized ancestry-specific PGSs seems to provide the most accurate and computationally practical path forward today. [00:21:00] Speaker B: Yes, and the innovative application of the LEOPARD method within Sumstat Tune is really the key that unlocks the ability to successfully tune these powerful combined scores using only publicly available summary statistics. That's the practical enabler. [00:21:14] Speaker A: Okay, so here's a thought to leave our listeners with. What does this breakthrough in computational efficiency, going from days to minutes for key steps, what does that truly mean for the scale and speed at which we might finally be able to deliver genuinely personalized and equitable medicine to global populations? [00:21:30] Speaker B: That's the million-dollar question, isn't it? It certainly accelerates the possibility. This episode was based on an open-access article under the CC BY 4.0 license. You can find a direct link to the paper and the license in our episode description. If you enjoyed this, follow or subscribe in your podcast app and leave a five-star rating. If you'd like to support our work, use the donation link in the description. Thanks for listening and join us next time as we explore more science, base by base.
