Episode 153

September 30, 2025

00:15:13

153: Skeletal muscle eQTL meta-analysis implicates genes in the genetic architecture of muscular and cardiometabolic traits

Hosted by

Gustavo B Barra
Base by Base


Show Notes

Episode 153: Skeletal muscle eQTL meta-analysis implicates genes in the genetic architecture of muscular and cardiometabolic traits

In this episode of PaperCast Base by Base, we explore a large skeletal muscle eQTL meta-analysis that integrates GTEx and FUSION data to pinpoint regulatory variants and genes underlying muscular and cardiometabolic traits.

Study Highlights:
Combining RNA-seq and whole-genome data from 1,002 individuals across two cohorts, the authors identified 18,818 conditionally distinct eQTL signals affecting 12,283 genes, with 35% of genes harboring multiple signals. Colocalization with 26 GWAS datasets yielded 2,252 signal pairs and nominated 1,342 candidate genes, and strikingly 22% of the colocalizations involved non‑primary eQTL signals while many mapped far from the nearest transcription start site. A focused multi‑tissue analysis for type 2 diabetes linked 309 of 862 tested signals to 551 genes across skeletal muscle, adipose, liver, and islet, representing 36% of T2D signals and exceeding the yield of any single tissue. The study also functionally validated a T2D‑linked variant at the INHBB locus, where the risk allele increased enhancer activity and aligned with higher gene expression in both muscle and adipose models.
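The headline figures above can be sanity-checked with a few lines of arithmetic. This is a minimal sketch that only re-derives ratios from the numbers quoted in these show notes; the variable names are our own, and nothing here is recomputed from the paper's underlying data.

```python
# Figures as quoted in the show notes (not re-derived from the paper's data).
eqtl_signals = 18_818        # conditionally distinct eQTL signals
egenes = 12_283              # genes with at least one eQTL signal
coloc_pairs = 2_252          # colocalized GWAS-eQTL signal pairs
candidate_genes = 1_342      # candidate genes nominated by colocalization
t2d_signals_tested = 862     # T2D GWAS signals tested across four tissues
t2d_signals_linked = 309     # T2D signals linked to at least one gene

# Average number of distinct eQTL signals per gene.
signals_per_gene = eqtl_signals / egenes

# Average number of colocalized signal pairs per nominated gene.
pairs_per_gene = coloc_pairs / candidate_genes

# Share of tested T2D signals with a candidate gene (quoted as 36%).
t2d_fraction = t2d_signals_linked / t2d_signals_tested

print(f"eQTL signals per gene:      {signals_per_gene:.2f}")
print(f"coloc pairs per gene:       {pairs_per_gene:.2f}")
print(f"T2D signals with a gene:    {t2d_fraction:.1%}")
```

The last ratio comes out just under 36%, which matches the rounded figure quoted above.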

Conclusion:
This work delivers a well‑powered skeletal muscle eQTL resource and shows how multi‑signal, multi‑tissue integration clarifies the molecular mechanisms and candidate targets underlying cardiometabolic disease.

Reference:
Wilson EP, Broadaway KA, Parsons VA, Vadlamudi S, Narisu N, Brotman SM, Currin KW, Stringham HM, Erdos MR, Welch R, Holtzman JK, Lakka TA, Laakso M, Tuomilehto J, Boehnke M, Koistinen HA, Collins FS, Parker SCJ, Scott LJ, Mohlke KL. Skeletal muscle eQTL meta-analysis implicates genes in the genetic architecture of muscular and cardiometabolic traits. The American Journal of Human Genetics. 2025;112:1–15. https://doi.org/10.1016/j.ajhg.2025.09.003

License:
This episode is based on an open-access article published under the Creative Commons Attribution 4.0 International License (CC BY 4.0) – https://creativecommons.org/licenses/by/4.0/

Support:
If you'd like to support Base by Base, you can make a one-time or monthly donation here: https://basebybase.castos.com/

Chapters

  • (00:00:14) - Deep Dive into the genetic dark matter of diabetes
  • (00:03:15) - The muscle map of diabetes
  • (00:06:53) - eQTL and disease risk: combining the studies
  • (00:07:32) - The Near Gene Trap
  • (00:09:26) - What about those other dimmer switches you mentioned? The non-primary signals
  • (00:10:05) - Exploring T2D in muscle, fat, liver, and islets
  • (00:13:14) - Beyond the nearest gene heuristic
  • (00:13:54) - GWAS and the causal genes of other complex diseases

Episode Transcript

[00:00:14] Speaker A: Welcome to Base by Base, the papercast that brings genomics to you wherever you are. So today we're diving into something that's, well, it's a huge puzzle in genomics: trying to make sense of the genetic dark matter.
[00:00:27] Speaker B: Right.
[00:00:27] Speaker A: You know, you're looking at a complex disease, let's say type 2 diabetes, T2D for short. You do a massive genome-wide association study, a GWAS, and bam, you get thousands of spots in the genome linked to the disease. It feels like you've struck gold, you've got the map.
[00:00:45] Speaker B: Yeah, it seems like it. But here's the kicker. Over 90%, maybe even more, of those spots, those risk variants, they're not in the genes that code for proteins.
[00:00:55] Speaker A: Exactly. They're in the regulatory bits, the non-coding DNA. So they're not, like, breaking the protein, they're messing with the volume knob for the gene. They're turning it up or down.
[00:01:05] Speaker B: Precisely. You know the location of the risk, but you often have no clue which gene it's actually tweaking or how. Yeah, that's the gap, right, between just finding an association and knowing the actual cause.
[00:01:14] Speaker A: And for years the easy way out was just, well, assume it controls the gene right next door.
[00:01:18] Speaker B: The nearest neighbor approach.
[00:01:19] Speaker A: Yeah, but if that assumption's wrong, and we'll see just how wrong it can be, then maybe the drug targets we're chasing are built on, well, shaky ground.
[00:01:28] Speaker B: Absolutely. So the real challenge is making that connection, linking the non-coding variant to its actual target gene, especially when that gene might be surprisingly far away on the chromosome. Miles away, genomically speaking. And that's what this deep dive is all about. The paper presents this incredibly detailed, high-resolution map for skeletal muscle, which is just crucial for metabolic health. It's designed to tackle exactly that problem.
The distance issue.
[00:01:55] Speaker A: Right.
[00:01:56] Speaker B: But before we jump into the nitty-gritty, we really should give credit where it's due. The scale of this work is impressive, definitely. So yeah, today we celebrate the work of Emma P. Wilson, K. Alaine Broadaway, Victoria Parsons, and a whole big team led by Karen L. Mohlke. They're from places like UNC Chapel Hill, the National Human Genome Research Institute, and the University of Michigan. Really top-notch institutions. Their focus: unraveling the genetic architecture of muscle traits and cardiometabolic diseases.
[00:02:24] Speaker A: Fantastic work. Okay, so for listeners maybe newer to this, let's quickly break down the main tools again. GWAS finds the rough neighborhood of the risk.
[00:02:32] Speaker B: Right, the general postcode.
[00:02:34] Speaker A: But we need something more precise to see which house, which gene, is involved and how its lights are being switched on or off. That's where eQTLs come in, right?
[00:02:41] Speaker B: Yeah.
[00:02:41] Speaker A: Expression quantitative trait loci.
[00:02:43] Speaker B: Exactly. An eQTL is basically a genetic variant, a difference in DNA sequence, that lines up with a measurable change in how much a specific gene is expressed, how active it is. Okay, so when you find that the same variant seems to influence both a clinical trait, like your risk for T2D, and the expression level of a particular gene, that's what we call colocalization.
[00:03:05] Speaker A: Gotcha.
[00:03:06] Speaker B: And that colocalization, that overlap, it's like a strong hint. It really points towards that gene being potentially causal for the trait. It nominates it as a prime suspect.
[00:03:14] Speaker A: Makes sense. Now, you mentioned complex traits like T2D involve lots of different tissues. Pancreas, liver, fat. So why the intense focus on skeletal muscle here?
[00:03:24] Speaker B: That's a great question.
Skeletal muscle, it's not just for, you know, flexing your biceps. It's a metabolic engine.
[00:03:31] Speaker A: Right.
[00:03:31] Speaker B: It handles a huge chunk of the glucose uptake from your blood, especially after you eat. And it's key for using fats. If your muscles aren't listening to insulin properly...
[00:03:41] Speaker A: Insulin resistance.
[00:03:42] Speaker B: Exactly. Insulin resistance in muscle. Then your whole system gets out of whack. It's a direct road to problems like obesity, diabetes, even cardiovascular disease. Muscle's response to insulin is probably the single biggest factor determining your overall insulin sensitivity. It's central.
[00:04:00] Speaker A: Okay, central, but not the only player, as you said. The multi-tissue challenge.
[00:04:03] Speaker B: Absolutely not the only player. T2D is complicated, right? It involves the pancreatic islets maybe not making enough insulin, the liver putting out too much glucose, fat tissue acting up.
[00:04:13] Speaker A: The whole system failure.
[00:04:14] Speaker B: It really is. And that's why you need to look at these tissues together. Mapping eQTLs in muscle is powerful, but integrating that with data from adipose tissue, liver, and islets, that gives you the systemic view, the full picture, not just one piece of the puzzle.
[00:04:27] Speaker A: Okay, so how did they build this incredibly detailed muscle map? What was the approach?
[00:04:33] Speaker B: Well, they started by doing a massive meta-analysis. They basically pooled together skeletal muscle eQTL data from two really important independent studies.
[00:04:42] Speaker A: Ah, combining forces.
[00:04:44] Speaker B: Yep. They used data from the FUSION study and the GTEx project. Altogether, that gave them genetic and gene expression data from 1,002 individuals.
[00:04:55] Speaker A: Wow. Over a thousand people. From muscle tissue specifically. That's substantial.
[00:04:59] Speaker B: It really is. And they crunched the numbers on...
What was it? Over 24,000 genes and nearly 8 million genetic variants. Huge scale.
[00:05:07] Speaker A: But it wasn't just about the scale, was it? You mentioned something about how they analyzed it being innovative.
[00:05:11] Speaker B: Yes, exactly. The key tool they used is called APEX. It's a statistical package. And what's special about APEX is that it uses the individual-level genetic data, not just summary stats. This allows for something called conditional analysis.
[00:05:23] Speaker A: Conditional analysis. Sounds complex. Why is that so crucial? What does it let you find that you'd otherwise miss?
[00:05:29] Speaker B: Okay, let's use an analogy. Think of a gene's expression level, how active it is, like the brightness of a light bulb controlled by dimmer switches. Most simpler eQTL studies find the main, most obvious dimmer switch, the one that has the biggest effect on brightness. That's the primary signal.
[00:05:48] Speaker A: Okay, the master switch.
[00:05:50] Speaker B: Kind of. But what if there are other independent dimmer switches wired to that same light bulb? Maybe smaller ones, having subtler effects, but still controlling the light.
[00:05:59] Speaker A: Ah, I see.
[00:06:00] Speaker B: Conditional analysis lets you find those additional independent signals. For a single gene, it can identify multiple distinct genetic variants, each acting like its own separate switch, influencing that gene's expression. It teases apart these different regulatory layers.
[00:06:15] Speaker A: So it's not just finding the loudest signal, it's finding all the different controls.
[00:06:19] Speaker B: Precisely. All the different volume knobs, going back to that analogy. Once they had this really rich map of all these distinct eQTL signals in muscle, they then systematically compared them, colocalized them, with GWAS results for 26 different muscular and cardiometabolic traits. And then the final step.
For one really interesting signal, they did the functional validation. They used lab tests, reporter assays in actual human muscle cells, myoblasts, and fat cells, adipocytes, to check if the variant really did change how the gene was controlled.
[00:06:51] Speaker A: Closing the loop. That's great. So what were the big headline numbers from this? Did combining the studies and using this conditional analysis really pay off?
[00:07:00] Speaker B: Oh, absolutely. The numbers are pretty staggering. They pinpointed 18,818 of these conditionally distinct signals.
[00:07:07] Speaker A: 18,000?
[00:07:08] Speaker B: Yeah. Spread across more than 12,000 genes. And combining FUSION and GTEx was key. They found something like 20 to 35% more signals than either study could find on its own. Wow. And the colocalization part, the one that linked these eQTLs to disease risk? They ended up nominating 1,342 potential candidate genes through over 2,200 GWAS-eQTL connections. It's a huge new resource.
[00:07:32] Speaker A: Okay, but here's the finding that really jumped out at me. The whole nearest-gene idea. How often was that actually correct?
[00:07:37] Speaker B: Yeah, this is the big one. The confirmation of the nearest-gene trap. Brace yourself. They found that only 37% of the time did the disease signal actually colocalize with the eQTL for the physically closest protein-coding gene.
[00:07:52] Speaker A: 37%. Wait, so that means nearly two-thirds of the time, 63%, if you just assumed the nearest gene was the culprit, which was standard practice, you were wrong.
[00:08:03] Speaker B: You were likely barking up the wrong tree. Yeah, 63% of the time. It fundamentally changes how we need to interpret GWAS results.
[00:08:11] Speaker A: That's huge.
[00:08:12] Speaker B: And it gets even more striking. They found that almost half, 44%, of these regulatory links involve genes whose transcription start site, the TSS, was more than 50,000 base pairs away from the GWAS signal.
That's a long way, genomically speaking, 50 kilobases.
[00:08:26] Speaker A: Can you give a concrete example? It helps to picture how misleading proximity can be.
[00:08:30] Speaker B: Sure. There was a specific T2D risk signal. The actual variants were sitting near two genes, CARMIL3 and CPNE6. They were right there, the closest ones. Obvious candidates, you'd think.
[00:08:40] Speaker A: Okay, the prime suspects, based on location.
[00:08:43] Speaker B: Exactly. But when they did the conditional eQTL analysis and the colocalization, that GWAS signal didn't link to CARMIL3 or CPNE6 at all. It exclusively colocalized with the muscle eQTL for a different gene, PCK2.
[00:08:58] Speaker A: PCK2. And where was that?
[00:08:59] Speaker B: It was the fourth closest gene, about 36 kilobases away. And the analysis showed the T2D risk variant was strongly tied to increased expression of PCK2, specifically in muscle.
[00:09:10] Speaker A: Wow. So the address led you to one street, but the actual effect was happening several blocks over.
[00:09:15] Speaker B: A perfect analogy. It gives you the real candidate, PCK2, and the likely mechanism, increased expression, completely overturning the simple proximity guess.
[00:09:24] Speaker A: Okay, so proximity is often a distraction. What about those other dimmer switches you mentioned? The non-primary signals found by conditional analysis? Did they turn out to be important for linking to diseases?
[00:09:34] Speaker B: Critically important. They found that 22%, almost a quarter, of the validated GWAS-eQTL links involved one of these secondary or tertiary eQTL signals, not the main, strongest one for that gene.
[00:09:46] Speaker A: So nearly one in four connections would have been missed if they'd only looked for the biggest effect.
[00:09:51] Speaker B: Precisely. It tells you that the regulatory variant that's most important for disease risk isn't always the one with the biggest overall impact on gene expression.
Sometimes it's these subtler independent controls that matter most for the disease link.
[00:10:04] Speaker A: That's a really crucial insight. Okay, let's zoom back out to the T2D context and the multi-tissue angle. Did looking across muscle, fat, liver, and islets pay off for understanding T2D?
[00:10:17] Speaker B: Massively. By applying this colocalization approach across all four tissues for T2D GWAS signals, they managed to link 309 distinct T2D risk signals to a total of 551 candidate genes.
[00:10:29] Speaker A: 551 potential genes involved. And how did that compare to looking at just one tissue?
[00:10:34] Speaker B: It was way better. They found over 50% more T2D signals with candidate genes than they would have using any single tissue on its own. It really underscores that for complex diseases like T2D, you absolutely need that integrated multi-tissue view.
[00:10:46] Speaker A: Makes total sense. And did the deep muscle map contribute unique findings within that, things missed by other tissues?
[00:10:52] Speaker B: It did. They found 95 candidate genes for T2D that were only identified through eQTLs in skeletal muscle.
[00:11:00] Speaker A: 95 unique muscle links?
[00:11:02] Speaker B: Yeah, genes like VEGFB and RNF10. These are now flagged as high-priority candidates specifically for muscle's role in T2D. Candidates that were basically hidden before this high-resolution muscle map existed.
[00:11:14] Speaker A: And you mentioned they followed up on one signal with lab work. The validation story.
[00:11:18] Speaker B: Right, the INHBB story. There's this T2D risk variant, rs11688682. What was neat is that this variant colocalized with the eQTL for the gene INHBB not just in muscle, but also in subcutaneous fat tissue.
[00:11:32] Speaker A: Ah, so a shared effect in two key metabolic tissues.
[00:11:36] Speaker B: Exactly. Suggesting a conserved mechanism.
So they took that genomic region containing the variant into the lab using reporter assays, basically testing how well that piece of DNA can drive gene expression. They found the T2D risk version, the G allele, acted as a stronger enhancer.
[00:11:50] Speaker A: It ramped up activity.
[00:11:52] Speaker B: Significantly. They saw about a 2.5-fold increase in transcriptional activity in muscle precursor cells and a twofold increase in fat cells compared to the non-risk allele.
[00:12:04] Speaker A: Wow. So the chain is complete. The risk variant boosts enhancer function. That drives up INHBB expression in muscle and fat. And that higher expression contributes to T2D risk.
[00:12:16] Speaker B: That's the proposed mechanism, clearly laid out and functionally supported. Association, colocalization, validation. Beautiful science.
[00:12:23] Speaker A: It really is. Now, no study's perfect. Did the researchers mention any limitations to this work, despite how powerful it is?
[00:12:29] Speaker B: They did, quite transparently. Two main things. First, the eQTL data came from bulk muscle tissue.
[00:12:35] Speaker A: Meaning a mix of cell types.
[00:12:37] Speaker B: Right. You've got muscle fibers, but also fibroblasts, immune cells, fat cells within the muscle. This study averages the signal across all of them. It can't quite tell you if a regulatory effect is happening specifically in the muscle fibers versus, say, the connective tissue. Cell-type-specific mapping is the next frontier there.
[00:12:54] Speaker A: Okay, resolution could be even higher.
[00:12:56] Speaker B: And second, while 1,000 samples is great, they note that for really complex regulatory networks, especially teasing apart subtle tissue-specific effects versus shared effects, even larger eQTL studies will be needed in the future. More data always helps refine the picture. Makes sense.
[00:13:14] Speaker A: So wrapping this up, what's the big take-home message for someone listening, whether they're a researcher or just interested in genomics?
[00:13:21] Speaker B: I think the absolute key takeaway is: stop relying on the nearest-gene heuristic for complex traits, especially when dealing with non-coding GWAS signals. It's demonstrably wrong most of the time. Like 63% wrong in this context.
[00:13:35] Speaker A: Yeah, that number is just striking.
[00:13:37] Speaker B: The path forward has to be this more comprehensive approach: finding those multiple distinct eQTL signals using methods like conditional analysis and, crucially, integrating data across all the relevant tissues for the disease you're studying.
[00:13:49] Speaker A: Right. One tissue often isn't enough, and the closest gene is often a red herring. Which leads us perfectly into that final provocative thought. We now have solid proof that regulatory elements in muscle can control genes 50, maybe even hundreds of kilobases away. If we apply that realization, that distance-is-deceptive rule, to other complex diseases, think neurological conditions like schizophrenia or Alzheimer's, where most GWAS hits are also non-coding, are we potentially looking way too close to the GWAS signals? Could the truly causal genes for those devastating diseases be lurking much further down the chromosome, completely missed by our current assumptions? Are we maybe targeting the wrong proteins entirely?
[00:14:34] Speaker B: That's the multi-million-dollar question, isn't it? This work really forces us to reconsider.
[00:14:38] Speaker A: This episode was based on an open-access article under the CC BY 4.0 license. You can find a direct link to the paper and the license in our episode description. If you enjoyed this, follow or subscribe in your podcast app and leave a five-star rating. If you'd like to support our work, use the donation link in the description.
Thanks for listening and join us next time as we explore more science base by base.
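
For readers who want to see the "dimmer switch" idea from the conversation in concrete form, here is a toy sketch of forward stepwise conditional analysis. It is an illustration under stated assumptions, not the APEX software or the authors' pipeline: the variant names, the eight made-up individuals, the 0/1 genotype coding, and the `min_r2` stopping rule are all hypothetical (real analyses stop on a significance threshold and adjust for covariates). The simulated gene has two independent switches (`variant_A`, `variant_B`), one proxy correlated with the first switch (`variant_C_proxy`), and one unrelated variant.

```python
def center(values):
    """Subtract the mean so regressions need no intercept term."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def residualize(values, on):
    """Remove from `values` the part linearly explained by `on` (both centered)."""
    denom = dot(on, on)
    if denom < 1e-12:
        return list(values)
    slope = dot(values, on) / denom
    return [a - slope * b for a, b in zip(values, on)]


def r_squared(genotype, expression):
    """Squared correlation between two centered vectors (0.0 if either is flat)."""
    gg, yy = dot(genotype, genotype), dot(expression, expression)
    if gg < 1e-12 or yy < 1e-12:
        return 0.0
    return dot(genotype, expression) ** 2 / (gg * yy)


def conditional_signals(genotypes, expression, min_r2=0.5):
    """Repeatedly pick the strongest variant, then condition everything on it."""
    y = center(expression)
    remaining = {name: center(g) for name, g in genotypes.items()}
    selected = []
    while remaining:
        scores = {name: r_squared(g, y) for name, g in remaining.items()}
        lead = max(scores, key=scores.get)
        if scores[lead] < min_r2:   # hypothetical stopping rule for this toy
            break
        selected.append(lead)
        lead_g = remaining.pop(lead)
        # Conditioning step: strip the lead variant's effect from the
        # expression values and from every remaining genotype vector.
        y = residualize(y, lead_g)
        remaining = {n: residualize(g, lead_g) for n, g in remaining.items()}
    return selected


# Made-up data for eight individuals (illustrative only).
genotypes = {
    "variant_A": [0, 0, 0, 0, 1, 1, 1, 1],        # primary switch
    "variant_B": [0, 0, 1, 1, 0, 0, 1, 1],        # independent secondary switch
    "variant_C_proxy": [0, 0, 0, 1, 1, 1, 1, 1],  # correlated with variant_A
    "variant_D": [0, 1, 0, 1, 0, 1, 0, 1],        # no effect
}
noise = [0.1, -0.1, -0.1, 0.1, -0.1, 0.1, 0.1, -0.1]
expression = [
    2.0 * a + 1.0 * b + e
    for a, b, e in zip(genotypes["variant_A"], genotypes["variant_B"], noise)
]

signals = conditional_signals(genotypes, expression)
print(signals)  # ['variant_A', 'variant_B']
```

On its own, the proxy variant looks strongly associated with expression, but once the primary signal is conditioned on, its association mostly disappears while the secondary switch stands out. That is the sense in which conditional analysis separates distinct signals from variants that merely travel with the lead one.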
