Episode Transcript
[00:00:20] Speaker A: Welcome to Base by Base, the papercast that brings genomics to you wherever you are. Thanks for listening and don't forget to follow and rate us in your podcast app.
So imagine for a second that you decide to take one of those, you know, highly detailed, comprehensive genetic tests.
[00:00:35] Speaker B: Right, the ones that look at millions of genetic markers.
[00:00:38] Speaker A: Exactly. So you get the results back, and the report boldly predicts, based on all those common genetic markers, that you should be exceptionally tall. But you read this completely perplexed because, well, in reality, you are quite short.
[00:00:51] Speaker B: Yeah. Or consider a scenario that carries a lot more weight. Say the genetic test predicts you have an extremely high risk for severe heart disease.
[00:00:59] Speaker A: Oh, wow. Yeah.
[00:01:00] Speaker B: But year after year, your doctor confirms your arteries are just perfectly clear.
[00:01:04] Speaker A: So the big question is, what happens when your actual physical traits completely defy your genetic destiny?
[00:01:11] Speaker B: It's fascinating, really. I mean, we spend a tremendous amount of computational power building these massive predictive models in modern genomics.
Yet the extreme outliers, the individuals who just simply refuse to fit the mold, they're often brushed aside as statistical noise.
[00:01:28] Speaker A: Which is a shame.
[00:01:29] Speaker B: It really is. Treating them as anomalies is a huge missed opportunity because their defiance of genetic expectations is actually one of the clearest, most powerful biological signals we have for understanding how human biology truly works.
[00:01:44] Speaker A: Right. Think of your overall genetic background like a massive weather forecast. You have millions of common genetic variants coming together, creating a general prediction based on broad patterns.
[00:01:53] Speaker B: Like a high chance of rain or a heat wave.
[00:01:55] Speaker A: Exactly. Maybe the forecast indicates a high risk for a certain disease. Yeah, but suddenly, what happens when a highly localized, incredibly rare genetic lightning strike just completely overrides that entire forecast?
[00:02:07] Speaker B: That lightning strike is everything. It is.
[00:02:10] Speaker A: So today, our mission in this deep dive is to explore the science of phenotypic misalignment.
We are going to look at what causes people to deviate from their genetically expected traits.
[00:02:22] Speaker B: And crucially, how studying these genetic rebels can unlock entirely new treatments for rare diseases.
[00:02:28] Speaker A: Today, we celebrate the work of Nikolas Baya, Duncan Palmer, and their co-authors from the University of Oxford, the Broad Institute, and their associated institutions.
[00:02:37] Speaker B: Yeah, they have really advanced our understanding of complex disease architecture here. They created this powerful, systematic framework to study the interplay between our common genetic background and those ultra-rare, high-impact variations.
[00:02:51] Speaker A: Okay, let's unpack this, because to really appreciate the magnitude of what this research team accomplished, we need to understand the baseline they were working from, right? And I guess that starts with understanding how we currently predict genetic destiny using polygenic scores, or PGS.
[00:03:06] Speaker B: Right. So a polygenic score is essentially a mathematical way to quantify your genetic risk for a disease or a trait. And it's based entirely on common genetic variants.
[00:03:14] Speaker A: And we all carry millions of these. Right?
[00:03:16] Speaker B: We do. And individually, a single common variant has almost no impact. Like, it might increase your risk of heart disease by just a fraction of a percent.
[00:03:25] Speaker A: Barely noticeable on its own.
[00:03:26] Speaker B: Exactly. But when you aggregate millions of them together into a single score, well, they function as a highly effective tool for stratifying a population's baseline disease risks.
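The aggregation described here can be sketched as a weighted sum. This is a minimal illustration, assuming each variant contributes its allele count (0, 1, or 2 copies) multiplied by a small per-allele effect weight. The function name, the weights, and the five-variant example are hypothetical and are not taken from the paper.

```python
def polygenic_score(dosages, weights):
    """Return the weighted sum of allele dosages.

    dosages: number of copies (0, 1, or 2) of each common variant.
    weights: assumed per-allele effect size for each variant.
    """
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))


# Hypothetical example: five common variants, each with a tiny effect.
example_weights = [0.01, -0.02, 0.015, 0.005, -0.01]
example_person = [2, 0, 1, 1, 2]
example_score = polygenic_score(example_person, example_weights)
```

A real score sums over far more variants, but the principle is the same: no single term matters much, and the total is what stratifies people.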
[00:03:36] Speaker A: I like to think of this in terms of a personal financial budget.
Your common variants, so your polygenic score, represent your steady, predictable baseline.
[00:03:45] Speaker B: Yeah. Your fixed monthly income.
[00:03:47] Speaker A: Right. Your income and your standard recurring expenses. It provides a very reliable forecast of your financial health over time. But the blind spot of a polygenic score is that it entirely ignores the rare variants.
[00:03:59] Speaker B: And that blind spot is exactly what this new framework addresses. Because polygenic scores are built strictly on common variants, they just cannot account for those rare, high impact mutations.
[00:04:09] Speaker A: Which brings us to the liability threshold model, right?
[00:04:12] Speaker B: Yes. A really foundational concept in genetics.
Developing a complex disease is not just a simple light switch that flips on or off. Instead, it's a continuous scale of accumulating liability.
[00:04:23] Speaker A: So what makes up that liability?
[00:04:25] Speaker B: It's a combination of things: your baseline common genetic variants, any rare genetic variants you might happen to carry, and, of course, your environmental exposures.
[00:04:35] Speaker A: So picture a cup sitting under a dripping faucet.
[00:04:37] Speaker B: Right. The water level rising in the cup is your accumulating liability. You only actually develop the disease when that total liability crosses a specific threshold,
[00:04:47] Speaker A: basically causing the cup to finally overflow.
[00:04:49] Speaker B: Exactly.
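The cup-and-faucet picture can be written down as a toy model. This is a sketch only, assuming liability is a simple sum of three contributions on an arbitrary scale and that disease appears once the sum passes a fixed threshold. The function names, the threshold of 2.0, and the example numbers are all illustrative assumptions.

```python
def total_liability(common, rare, environment):
    """Add up the three liability contributions (arbitrary units)."""
    return common + rare + environment


def develops_disease(common, rare, environment, threshold=2.0):
    """Disease appears only when total liability crosses the threshold."""
    return total_liability(common, rare, environment) > threshold


# Hypothetical person 1: low common-variant risk, but a damaging rare variant.
low_pgs_rare_hit = develops_disease(common=-1.0, rare=3.5, environment=0.0)

# Hypothetical person 2: high common-variant risk offset by a protective rare variant.
high_pgs_protected = develops_disease(common=1.8, rare=-1.5, environment=0.5)
```

In this toy setup the first person crosses the threshold despite a favorable common-variant background, and the second stays below it despite an unfavorable one.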
[00:04:50] Speaker A: So returning to our budget analogy, the liability threshold model makes perfect sense. Your polygenic score is your stable baseline budget. But a rare, highly damaging genetic variant acts like a devastating, unexpected medical bill that completely ruins your overall financial
[00:05:05] Speaker B: status in an instant, regardless of how stable your monthly budget was.
[00:05:09] Speaker A: Right. Or on the flip side, a rare protective variant acts like a sudden massive lottery win. It shields you from financial ruin, even if your baseline income is just terrible.
[00:05:19] Speaker B: That's a perfect way to put it. So the central challenge then, is figuring out how to actually find those rare genetic lottery wins and unexpected bills in a massive sea of data.
[00:05:31] Speaker A: And preventing disease relies on accurate modeling.
[00:05:34] Speaker B: It does. If we can understand why diagnosed individuals sometimes have an unexpectedly low common variant risk, we can uncover those hidden monogenic, meaning, single gene drivers of disease.
[00:05:46] Speaker A: So to do this, the research team must have needed just a staggering amount of data.
[00:05:49] Speaker B: Oh, absolutely. They used the UK Biobank, focusing on over 400,000 individuals with European genetic ancestry.
[00:05:57] Speaker A: Wow, 400,000?
[00:05:58] Speaker B: Yeah. And they didn't just look at standard markers. They analyzed over 25 million specific variants derived from 450,000 exome sequences.
[00:06:07] Speaker A: For a little context, while whole genome sequencing looks at all your DNA, exome sequencing specifically isolates the regions of the DNA that actually code for proteins.
[00:06:17] Speaker B: Right. It's focusing purely on the functional machinery of the body.
[00:06:19] Speaker A: Which is exactly where those rare high impact mutations usually hide out. So what traits were they actually looking at with all this data?
[00:06:26] Speaker B: They targeted two specific types. First, they looked at seven continuous traits. These are things that exist on a
[00:06:32] Speaker A: spectrum like height and weight.
[00:06:33] Speaker B: Yeah. Standing height, BMI, bone mineral density, and LDL cholesterol. And second, they looked at three dichotomous traits, or case-control traits.
[00:06:43] Speaker A: Binary conditions, essentially. You either have them or you don't.
[00:06:46] Speaker B: Exactly. Type 2 diabetes, coronary artery disease, and osteoporosis.
[00:06:51] Speaker A: But to find these outliers, I mean, they couldn't just look at absolute numbers. Right. They had to compare the genetic forecast against the real world.
[00:06:58] Speaker B: Right, the misalignment classification. This is the critical step.
[00:07:01] Speaker A: How does that actually work?
[00:07:02] Speaker B: Well, first they adjust the observed real world traits for basic covariates like age and sex.
[00:07:08] Speaker A: So an 80 year old isn't unfairly compared to a 20 year old.
[00:07:11] Speaker B: Precisely. Then they compare that adjusted real-world number to the genetically expected phenotype, which is the polygenic score.
[00:07:18] Speaker A: Okay, so matching expectations versus defying them.
[00:07:21] Speaker B: Right. They grouped people into aligned, meaning their real traits matched their genetics, and misaligned, meaning they hit those high or low extremes.
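The classification step can be sketched roughly as follows. The episode does not give the exact adjustment method or cutoffs, so this sketch assumes that both the covariate-adjusted trait and the polygenic score have already been put on a standardized (z-score) scale, and that a gap of more than 2.0 between them counts as misaligned. The function names and the cutoff are hypothetical.

```python
from statistics import mean, pstdev


def zscores(values):
    """Standardize a list of numbers to mean 0 and unit (population) SD."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


def classify(observed_z, pgs_z, cutoff=2.0):
    """Compare the adjusted observed trait with the polygenic expectation.

    observed_z: trait value after covariate adjustment, as a z-score.
    pgs_z: polygenic score as a z-score (the genetically expected value).
    cutoff: assumed gap beyond which a person counts as misaligned.
    """
    gap = observed_z - pgs_z
    if gap > cutoff:
        return "higher than expected"
    if gap < -cutoff:
        return "lower than expected"
    return "aligned"


# Hypothetical person predicted to be tall (PGS z = 1.0) but measured short (z = -1.5).
example_label = classify(observed_z=-1.5, pgs_z=1.0)
```

The design choice worth noting is that misalignment is defined by the gap between observation and expectation, not by how extreme the trait is on its own.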
[00:07:30] Speaker A: And once they isolated these genetic rebels, the misaligned group, they ran them through a really rigorous testing funnel to hunt for rare variants, Right?
[00:07:39] Speaker B: Yes, a three-stage funnel. Stage one focused on canonical genes.
These are the usual suspects, genes we already know are linked to monogenic disorders. Stage two broadened the search using the GEL PanelApp genes. That's a larger, expert-curated list of diagnostic-grade genes for rare disorders.
[00:07:59] Speaker A: Okay, and then the final stage, stage
[00:08:01] Speaker B: three, was a completely unbiased exome wide scan. They just searched across the entire exome for entirely new discoveries.
[00:08:09] Speaker A: But across all three of those stages, they had to define what damaging actually means, right? Because we all carry thousands of mutations.
[00:08:16] Speaker B: Yeah, but most of them are biologically silent. They don't do anything. So the team used strict prediction algorithms to look for two specific types of damage.
[00:08:24] Speaker A: What was the first one?
[00:08:25] Speaker B: Predicted loss-of-function variants, or pLoFs. This essentially breaks the gene entirely. You can think of it like tearing entire pages out of an instruction manual.
[00:08:34] Speaker A: So the protein just doesn't get made at all.
[00:08:36] Speaker B: Exactly. And the second type was damaging missense variants. These alter the protein structure, so it might not destroy it completely, but it severely impairs its function.
[00:08:47] Speaker A: Like a bad typo in the instruction manual.
[00:08:49] Speaker B: Yes, exactly.
[00:08:50] Speaker A: Okay, so what does this all mean for the data? If someone deviates from their expected cholesterol, couldn't it just be because they are taking a statin or eating poorly, rather than some rare genetic mutation?
[00:09:04] Speaker B: That is a great point.
[00:09:05] Speaker A: I mean, the environment seems like the easiest explanation for misalignment.
[00:09:09] Speaker B: It absolutely is. And the researchers knew this. They actually checked for this very thing. They found that non genetic factors like socioeconomic status, smoking, and yes, taking cholesterol lowering medications did heavily associate with misalignment.
[00:09:22] Speaker A: So there was a ton of noise in the data?
[00:09:24] Speaker B: A tremendous amount. But that makes the genetic discoveries they did make even more impressive.
[00:09:28] Speaker A: Oh, I see.
[00:09:29] Speaker B: Yeah, the environmental noise made the true genetic signal incredibly hard to find.
So the fact that they found these rare variants cutting straight through the noise of statins and lifestyle choices, it proves just how robust the genetic signal really is.
[00:09:43] Speaker A: The rare variant really is that blaring air horn drowning out the entire symphony.
[00:09:47] Speaker B: Exactly.
[00:09:48] Speaker A: So let's look at the actual findings, starting with the continuous traits and those canonical genes.
The usual suspects.
[00:09:56] Speaker B: The statistics here are just staggering. Let's take height. People who were significantly shorter than their common genetics predicted were highly enriched for those predicted loss of function variants in genes like ACAN, IGF1 and SHOX.
[00:10:09] Speaker A: ACAN specifically stood out, right?
[00:10:11] Speaker B: It did. The odds ratio for the ACAN gene was a massive 367.
[00:10:16] Speaker A: An odds ratio of 367. I mean, in human biology, researchers are usually thrilled to find a variant with an odds ratio of like 1.2.
[00:10:23] Speaker B: Oh, definitely.
[00:10:23] Speaker A: An odds ratio of 367 means that if you possess a broken variant in this gene, your odds of being exceptionally short are 367 times higher than those of someone without it.
[00:10:34] Speaker B: It's an absolute biological sledgehammer. And it makes sense biologically. The ACAN gene provides instructions for cartilage structure. If you break it, your bone scaffolding is compromised, halting vertical growth, no matter what your other genes say.
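For readers who want to see how an odds ratio like the one quoted above is computed, here is a minimal sketch from a two-by-two table of carriers versus non-carriers and misaligned versus aligned people. The counts are invented for illustration and do not reproduce the paper's 367. The optional `correction` argument is an assumption added for this sketch: it applies the Haldane-Anscombe adjustment (adding 0.5 to every cell), which is one common way to handle tables containing a zero.

```python
def odds_ratio(carriers_misaligned, noncarriers_misaligned,
               carriers_aligned, noncarriers_aligned, correction=0.0):
    """Odds of being misaligned among carriers divided by the odds among non-carriers.

    correction: value added to every cell (0.5 gives the Haldane-Anscombe
    adjustment, useful when a cell count is zero).
    """
    a = carriers_misaligned + correction
    b = noncarriers_misaligned + correction
    c = carriers_aligned + correction
    d = noncarriers_aligned + correction
    return (a * d) / (b * c)


# Invented counts: 10 of 20 carriers are misaligned, versus 1,000 of 100,000 non-carriers.
example_or = odds_ratio(10, 1000, 10, 99000)
```

Note that an odds ratio compares odds, not probabilities. The two are close only when the outcome is rare in both groups.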
[00:10:49] Speaker A: And what about people who are taller than expected?
[00:10:52] Speaker B: They were enriched for damaging missense variants in the FBN1 gene, which is linked to connective tissue elasticity.
[00:10:58] Speaker C: Wow.
[00:10:59] Speaker A: And what did they find regarding cholesterol?
[00:11:01] Speaker B: Well, people with lower than expected LDL cholesterol had a combined burden of broken variants in the PCSK9 and APOB genes,
[00:11:09] Speaker A: which are deeply involved in how the body clears cholesterol from the blood. Right?
[00:11:13] Speaker B: Exactly. And they also looked at bone mineral density, or BMD. This is a trait that had never really been studied this way before.
[00:11:20] Speaker A: Oh, really? What did they find there?
[00:11:21] Speaker B: People with lower than expected BMD were highly enriched for mutations in the COPB2 and GORAB genes.
[00:11:28] Speaker A: And those are linked to a rare form of osteoporosis, aren't they?
[00:11:31] Speaker B: Yes. So by just looking for people whose bones were weaker than predicted, they successfully flushed out the individuals carrying these rare disease mutations.
[00:11:40] Speaker A: Here's where it gets really interesting.
How does this framework apply to complex diseases, you know, conditions you either have or don't have, like type 2 diabetes?
[00:11:49] Speaker B: The case control results are fascinating. They completely prove the liability threshold model we talked about earlier.
[00:11:55] Speaker A: How so?
[00:11:56] Speaker B: Let's look at type 2 diabetes. The researchers found patients who definitely have diabetes, but who also possess a very low polygenic risk score.
[00:12:06] Speaker A: So based on their common genes, their liability cup should be nearly empty. They shouldn't be sick.
[00:12:11] Speaker B: Right. But they were significantly enriched for rare, highly pathogenic variants in MODY genes, specifically HNF1A and HNF4A. MODY stands for maturity
[00:12:23] Speaker A: onset diabetes of the young, right?
[00:12:25] Speaker B: That's right. These genes control the beta cells in the pancreas that secrete insulin. So if you get a severe mutation there, your baseline genetic risk doesn't matter. Your rare variant just overwhelmed your good common genetics.
[00:12:37] Speaker A: That is the ultimate unexpected medical bill. Ruining the budget.
[00:12:40] Speaker B: Exactly.
[00:12:41] Speaker A: But what about the other side? The unexpected lottery wins. Did they find people whose baseline budget was terrible, but they were perfectly healthy?
[00:12:48] Speaker B: They did. In coronary artery disease, or CAD, they found healthy control subjects, people with perfectly clear arteries, who possessed a dangerously high polygenic risk score.
[00:12:58] Speaker A: So they absolutely should have had heart disease.
[00:13:00] Speaker B: Yes, but they were enriched for rare protective variants in the ANGPTL3 gene.
[00:13:06] Speaker A: So essentially, their rare variants acted as a shield, protecting them from their own bad common genetics.
[00:13:11] Speaker B: Exactly. It's an incredible protective mechanism.
[00:13:14] Speaker A: And then they moved on to the third stage, Right? The unbiased exome wide scan.
The total unknowns.
[00:13:20] Speaker B: Yes. The scan identified 74 significant genes across the genome.
[00:13:26] Speaker A: What stood out there?
[00:13:27] Speaker B: Well, for BMI, they found that lower than expected BMI was tightly linked to the NPL gene.
[00:13:33] Speaker A: And NPL is super interesting because its biological function spans across species, right?
[00:13:37] Speaker B: Yes. The Paper notes that a deficiency in this specific gene causes muscle loss, not just in humans, but in zebrafish and mice as well, which is wild.
[00:13:46] Speaker A: Seeing that mechanism conserved across such vastly different species gives us so much confidence that this is a true biological function.
[00:13:53] Speaker B: Absolutely. They also linked lower than expected BMI to the ACSL6 gene, which is involved in lipid synthesis.
[00:14:01] Speaker A: Okay, and what about the other traits in the exome scan?
[00:14:04] Speaker B: They looked at age at menopause, too. Higher than expected age at menopause was linked to damaging missense variants in the CAMK1 gene.
[00:14:12] Speaker A: Okay.
[00:14:12] Speaker B: Yeah. It offers some really tantalizing clues about estrogen dependent gene expression and the reproductive system.
[00:14:19] Speaker A: So, connecting all of this to the real world, we have this massive data set. We've identified the outliers, we found these rare variants. What are the actual implications for clinical practice?
[00:14:28] Speaker B: The most immediate application is triage. This misalignment framework proves that polygenic risk scores aren't just for predicting common diseases. Right. They can actually be used as a powerful triage tool to identify which patients should be prioritized for expensive rare variant genetic screening.
[00:14:45] Speaker A: Because you can't just sequence everyone's exome. It's too expensive.
[00:14:48] Speaker B: Exactly. But if a patient is sick and their polygenic score says they shouldn't be, that misalignment is a massive red flag pointing toward a rare monogenic driver.
[00:14:56] Speaker A: And beyond triaging patients, I imagine this acts as a sort of treasure map for drug development.
[00:15:01] Speaker B: Oh, absolutely. It fundamentally changes target discovery. Think about the ACSL6 gene we just
[00:15:07] Speaker A: mentioned, the one linked to lower BMI.
[00:15:09] Speaker B: Yes.
Damaging that gene results in a lower than expected BMI because it impairs lipid synthesis. So theoretically, pharmaceutical companies could design an antibody to specifically target and inhibit ACSL6,
[00:15:23] Speaker A: basically using a drug to artificially recreate the exact protective effect of that rare genetic mutation to treat severe obesity.
[00:15:32] Speaker B: Precisely.
[00:15:33] Speaker A: That is incredible. But we are talking about finding needles in a genetic haystack here.
What are the limitations? What are the blind spots of this methodology?
[00:15:41] Speaker B: There are a few important caveats. The biggest one is simply statistical power. Because they're looking for ultra-rare variants in a tiny, restricted group of misaligned people, the raw numbers are minuscule.
[00:15:52] Speaker A: How small are we talking?
[00:15:54] Speaker B: In many of the tests, literally zero people in the misaligned group happened to carry the variant. It makes it mathematically very hard to prove associations.
[00:16:02] Speaker A: Yeah, that double rarity leads to incredibly small sample sizes.
[00:16:05] Speaker B: Another limitation is that this is largely computational.
[00:16:08] Speaker A: Right. Algorithms rather than doctors.
[00:16:10] Speaker B: Exactly. The variants were categorized as damaging by computer prediction algorithms, not by manual clinical review of the patients. And while the algorithms are good, they aren't perfect.
[00:16:21] Speaker A: And finally, there has to be a limitation regarding global diversity, right? Because polygenic scores require so much baseline data.
[00:16:28] Speaker B: Yes, a major limitation. This study was restricted exclusively to individuals with European genetic ancestry because the background
[00:16:36] Speaker A: linkage patterns of DNA differ between populations, right?
[00:16:40] Speaker B: Creating accurate polygenic scores requires massive ancestry specific data pools, and those are currently lacking for non European populations. So until we have diverse global data, we can't easily apply this misalignment classification to everyone.
[00:16:55] Speaker A: Well, if we distill this entire study down, what is the central insight here?
[00:17:00] Speaker B: Ultimately, the core insight is that phenotypic misalignment, when a patient's real world traits defy their common variant genetic predictions, is a powerful biological signal.
[00:17:10] Speaker A: We can't just ignore the outliers anymore.
[00:17:12] Speaker B: No, we have to investigate them. By studying these genetic rebels, we can validate the complex liability of disease, improve diagnostic screening for rare monogenic disorders, and discover entirely new therapeutic targets.
[00:17:26] Speaker A: What does this mean for the future of personalized medicine? When your doctor looks at your chart, will they treat the genes you have or the genetic expectations you've managed to defy?
This episode was based on an Open Access article under the CC BY 4.0 license. You can find a direct link to the paper and the license in our episode description. If you enjoyed this, follow or subscribe in your podcast app and leave a five-star rating. If you'd like to support our work, use the donation link in the description. Now stay with us for an original track created especially for this episode and inspired by the article you've just heard about. Thanks for listening and join us next time as we explore more science, Base by Base.
[00:18:24] Speaker C: A score that said I shouldn't sink, but life showed up with different math. A hidden turn, a sideways path. Some numbers hum, some numbers shout, but there's a ghost they leave out. A single letter out of place that changes speed, that shifts the race. I'm an outlier in the light. Common winds, rare sparks ignite. When the forecast doesn't fit the sky, look closer, there's a reason why. Hey! Too short, too tall, bone turning thin, a quiet break beneath the skin. A switch that dims, a switch that drives, two kinds of code inside our lives. So when the curve won't hold me tight, don't call it noise, don't call it slight. Screen the shadows, name the flame, find the gene that bends the game. I'm an outlier in the light. Common winds, rare sparks ignite. And if the forecast misses true, the rare can move the common too. Look closer, there's a reason why.