Episode 213

November 29, 2025

00:20:29

213: BRAIN-MAGNET: a functional atlas for non-coding variants

Hosted by

Gustavo B Barra
Base by Base

Show Notes

Episode 213: BRAIN-MAGNET: a functional atlas for non-coding variants


In this episode of PaperCast Base by Base, we explore BRAIN-MAGNET, which couples a ChIP-STARR-seq atlas of 148,198 neural regulatory elements with a validated convolutional neural network to predict enhancer activity and prioritize disease-relevant non-coding variants.


Study Highlights:
The authors generated an activity-ranked functional genomics atlas of 148,198 non-coding regulatory elements in human neural stem cells. Comparative ChIP-STARR-seq revealed many elements are epigenetically primed in embryonic stem cells for later neural activity. BRAIN-MAGNET, a convolutional neural network trained on the atlas, predicts enhancer activity from DNA sequence and computes nucleotide-level contribution scores to identify functional motifs. The model outperformed other prioritization scores at tested loci and enabled prioritization and functional validation of both common GWAS SNPs and rare variants, including a putative RAB7A enhanceropathy.
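
To make the idea of nucleotide-level contribution scores concrete, here is a minimal Python sketch. It is not the BRAIN-MAGNET model or its attribution method: `toy_activity` is a hypothetical stand-in that scores a sequence by its best match to an arbitrary motif (`GCCAT`, chosen only for illustration), and the scores are computed by in silico mutagenesis, a simple substitute technique in which each base is swapped for the three alternatives and the mean drop in predicted activity is recorded.

```python
# Toy illustration only. `toy_activity` is a hypothetical stand-in for a
# trained sequence-to-activity model; it is NOT the BRAIN-MAGNET network.
BASES = "ACGT"
MOTIF = "GCCAT"  # arbitrary motif chosen for illustration (assumption)


def toy_activity(seq: str) -> float:
    """Return the best fraction of motif positions matched by any window."""
    k = len(MOTIF)
    best = 0
    for i in range(len(seq) - k + 1):
        matches = sum(1 for a, b in zip(seq[i:i + k], MOTIF) if a == b)
        best = max(best, matches)
    return best / k


def contribution_scores(seq: str, model=toy_activity) -> list:
    """In silico mutagenesis: mean drop in predicted activity per position.

    Each base is replaced by the three alternative bases in turn; a large
    positive score means the reference base matters for predicted activity.
    """
    reference = model(seq)
    scores = []
    for i, base in enumerate(seq):
        drops = []
        for alt in BASES:
            if alt == base:
                continue
            mutant = seq[:i] + alt + seq[i + 1:]
            drops.append(reference - model(mutant))
        scores.append(sum(drops) / len(drops))
    return scores


def variant_effect(seq: str, pos: int, alt: str, model=toy_activity) -> float:
    """Predicted activity change (mutant minus reference) for one substitution."""
    mutant = seq[:pos] + alt + seq[pos + 1:]
    return model(mutant) - model(seq)


if __name__ == "__main__":
    sequence = "AAAAGCCATAAAA"  # motif occupies positions 4 to 8 (0-based)
    for position, score in enumerate(contribution_scores(sequence)):
        print(position, sequence[position], round(score, 3))
    print("effect of C>T at position 6:", round(variant_effect(sequence, 6, "T"), 3))
```

In this toy setting only the five motif positions receive non-zero scores, which mirrors the intuition that a variant matters most when it lands on a high-scoring base. A real workflow would replace `toy_activity` with a trained network's prediction function.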


Conclusion:
The NCRE atlas and BRAIN-MAGNET provide a functionally validated resource to interpret non-coding genetic variation relevant to neurodevelopment and neurological disease.


Music:
Enjoy the music based on this article at the end of the episode.


Reference:
Deng R, Perenthaler E, Nikoncuk A, Yousefi S, Lanko K, Schot R, Maresca M, Medico-Salsench E, Sanderson LE, Parker MJ, van Ijcken WFJ, Park J, Sturm M, Haack TB, Roshchupkin GV, Mulugeta E, Barakat TS. BRAIN-MAGNET: A functional genomics atlas for interpretation of non-coding variants. Cell. 2026 Jan 22;189:1–20. https://doi.org/10.1016/j.cell.2025.10.029


License:
This episode is based on an open-access article published under the Creative Commons Attribution 4.0 International License (CC BY 4.0) – https://creativecommons.org/licenses/by/4.0/


Support:
Base by Base – Stripe donations: https://donate.stripe.com/7sY4gz71B2sN3RWac5gEg00

Official website https://basebybase.com

Castos player https://basebybase.castos.com

On PaperCast Base by Base you’ll discover the latest in genomics, functional genomics, structural genomics, and proteomics.

Chapters

  • (00:00:00) - Finding the missing DNA in the brain
  • (00:05:24) - Brain Magnet: The regulatory map of the brain
  • (00:07:40) - Brain Magnet: The functional atlas
  • (00:10:46) - Brain Magnet for SNPs and the neuropsychiatric disorders
  • (00:12:06) - Brain Magnet for rare neurogenetic disorders
  • (00:16:42) - Neuroscience: Hidden Codes in the Silent Frame

Episode Transcript

[00:00:00] Speaker A: Welcome to Base by Base, the PaperCast that brings genomics to you wherever you are. Thanks for listening and don't forget to follow and rate us in your podcast app. So let's start this deep dive with a number that's, well, it's pretty startling when you think about it. Did you know that 98% of your DNA doesn't actually code for proteins?
[00:00:32] Speaker B: It's incredible. For decades we sort of called it.
[00:00:34] Speaker A: Genetic dark matter or even junk DNA. But we know now that this massive region, this non-coding part of the genome, it harbors most of the variations that are linked to human disease.
[00:00:46] Speaker B: Right.
[00:00:47] Speaker A: And if you think about neurodevelopmental disorders, so conditions that affect how the brain forms and wires itself up.
[00:00:55] Speaker B: Yeah.
[00:00:56] Speaker A: When clinicians are looking for a cause, they focus on those traditional protein-coding genes, standard practice. And they only find an answer about 30 to 50% of the time, which.
[00:01:05] Speaker B: Is a huge diagnostic gap.
[00:01:07] Speaker A: It's huge. It means for more than half of the individuals, the answer is just hidden. It's likely buried in that so-called dark matter. So the real challenge here isn't just finding these genetic variants, it's about understanding the, I guess, the regulatory grammar of it all. How can we possibly shine a light on these sequences to find the missing answers, especially for something as complex as the brain? Well, the research we're diving into today offers a really stunning two-part answer. First, they built a massive atlas of experimentally measured function.
[00:01:40] Speaker B: And second, they use that atlas to train a cutting-edge AI to pinpoint the exact sequence change that could be responsible for a disease.
[00:01:49] Speaker A: It really feels like a transition point in genomics.
[00:01:51] Speaker B: It is.
We're moving from just mapping the genome to, you know, functionally decoding it letter by letter.
[00:01:56] Speaker A: Absolutely. And before we get into the nitty-gritty of how they did this, we really want to give a special recognition here.
[00:02:02] Speaker B: Yes.
[00:02:02] Speaker A: Today we celebrate the work of Ruizhi Deng, Elena Perenthaler, Tahsin Stefan Barakat and all of their colleagues from Erasmus MC and the University of Tübingen.
[00:02:12] Speaker B: Their work has just profoundly advanced our understanding of functional genomics in neural disease. It's a fantastic paper.
[00:02:19] Speaker A: It really is.
[00:02:20] Speaker B: So to really get why this work was so necessary, we need to detail the core problem, which is functional interpretation.
[00:02:26] Speaker A: Right.
[00:02:27] Speaker B: We have these studies, genome-wide association studies, or GWAS, and they're great. They're fantastic at finding thousands of statistical links between SNPs, single nucleotide polymorphisms, and.
[00:02:39] Speaker A: Different traits like height or diabetes risk or.
[00:02:43] Speaker B: Or neurological function. Exactly. But the problem is that the vast majority of those SNPs, they land in these non-coding regions.
[00:02:50] Speaker A: So you have the link, the association. But you don't have the why.
[00:02:54] Speaker B: You don't have the why. The functional consequence is totally unclear.
[00:02:57] Speaker A: It's like GWAS shows you a hundred different flags flying near a certain gene, but you have no idea which one is actually controlling that gene's activity.
[00:03:05] Speaker B: That's a perfect analogy. And we're specifically interested in what are called non-coding regulatory elements, or NCREs. Think of them as enhancers.
[00:03:13] Speaker A: Remote control switches for genes.
[00:03:15] Speaker B: Exactly. They can boost gene transcription from huge distances away.
And we know that when these switches break, it can cause diseases we call enhanceropathies.
[00:03:24] Speaker A: But we don't screen for them clinically.
[00:03:26] Speaker B: We don't. The data is just overwhelming and we haven't had the right tools to reliably interpret what a variant in one of these regions actually does.
[00:03:35] Speaker A: Now, a lot of our listeners will have heard of big projects like ENCODE or the Epigenome Roadmap. Why weren't those data sets enough to solve this?
[00:03:45] Speaker B: That's a great question. Those projects were absolutely foundational, critical. But they often identified putative NCREs.
[00:03:53] Speaker A: Putative meaning supposed.
[00:03:54] Speaker B: Right. They were based on biochemical markers. So they look for specific histone modifications that decorate the DNA.
[00:04:01] Speaker A: Like H3K27ac for an active enhancer. A switch that's on.
[00:04:05] Speaker B: Exactly. Or H3K4me1 for a switch that's poised or ready. But here's the key distinction.
[00:04:11] Speaker A: Okay.
[00:04:12] Speaker B: Finding that histone mark proves the potential for function. It tells you the state of the.
[00:04:16] Speaker A: Chromatin, but not the function itself.
[00:04:18] Speaker B: Precisely. It doesn't actually measure the ability of that sequence to boost transcription. And for clinical interpretation, and especially for training a good AI model, you need direct, measurable, quantitative proof of activity. That makes total sense.
[00:04:31] Speaker A: You can't train an AI on a maybe. You can't.
[00:04:34] Speaker B: You need a verified yes or no.
[00:04:36] Speaker A: Okay, so let's unpack the core methodology that built that verified atlas. The researchers used a technique with a, well, a ridiculously long name.
[00:04:46] Speaker B: Right. It is a mouthful.
[00:04:48] Speaker A: Chromatin immunoprecipitation coupled to self-transcribing active regulatory region sequencing, which we thankfully can.
[00:04:56] Speaker B:
Just call ChIP-STARR-seq.
[00:04:58] Speaker A: Yes, thank you. Let's break that down a bit. Sure.
[00:05:00] Speaker B: You can think of it as a massive functional screening system. It's a type of massively parallel reporter assay.
[00:05:07] Speaker A: Okay.
[00:05:07] Speaker B: The real innovation here is that they can take nearly 150,000 unique regulatory elements all at the same time, clone each one into a separate reporter plasmid, a little circle of DNA, and then put them all into cells.
[00:05:20] Speaker A: So each tiny circle of DNA is carrying one enhancer that they want to test.
[00:05:24] Speaker B: Exactly. And if that cloned enhancer is active, it boosts the transcription of a reporter gene, in this case something like GFP, which makes the cell light up. Ah.
[00:05:32] Speaker A: So you can literally see the activity.
[00:05:34] Speaker B: You can measure it by measuring the amount of light. They get a direct, quantifiable readout of just how powerful that enhancer is. They're measuring its native boost strength.
[00:05:44] Speaker A: And they did this huge screen in neural stem cells, or NSCs. Why was that cell type so important for this?
[00:05:51] Speaker B: Because NSCs are the source. They're the foundational cells that give rise to the entire central nervous system, the.
[00:05:57] Speaker A: Building blocks of the brain.
[00:05:58] Speaker B: Right. So they're the most relevant model for understanding how the brain develops and, by extension, what goes wrong in neurodevelopmental disorders, or NDDs. By focusing the screen here, they created an atlas specifically tuned to the developing brain's regulatory landscape.
[00:06:14] Speaker A: And the result was this incredible dataset, a functional atlas of nearly 150,000 NCREs, each one ranked by its measured activity in these NSCs. That's the foundation.
[00:06:25] Speaker B: And that functionally validated map is what made the next step possible.
[00:06:28] Speaker A: This is where it gets, I think, truly transformative. They use this atlas as training data to build an AI tool they named BRAIN-MAGNET.
[00:06:36] Speaker B: It's a great name.
[00:06:37] Speaker A: It is the brain-focused artificial intelligence method to analyze genomes for non-coding regulatory element mutation targets.
[00:06:46] Speaker B: The AI, which is a convolutional neural network, essentially learned the precise regulatory language of the NSC genome. It was trained to look at the raw DNA sequence, just the As, Ts, Cs and Gs, and predict how active that enhancer will be.
[00:07:02] Speaker A: But it's the output that really matters for diagnostics, isn't it? It's not just a general prediction of activity.
[00:07:08] Speaker B: No. And this is crucial. BRAIN-MAGNET doesn't just give you a single activity score for the whole element. It calculates what they call contribution scores, or CB scores, for every single nucleotide within that sequence.
[00:07:20] Speaker A: Wow.
[00:07:21] Speaker B: So think of it like a long legal contract. Most of the words are just context. The CB score is like an algorithm that points to the single most important word, the one functional hotspot that, if you change it, the whole agreement breaks.
[00:07:33] Speaker A: So it gives you that pinpoint accuracy to look at one specific genetic variant and predict its functional impact with very high confidence.
[00:07:40] Speaker B: Yes.
[00:07:40] Speaker A: So let's dive into some of the discoveries they made just by analyzing this functional atlas. What did they find out about the genes that are regulated by the most active NCREs?
[00:07:50] Speaker B: They confirmed a really fundamental principle of developmental biology.
[00:07:54] Speaker A: Okay.
[00:07:55] Speaker B: The genes that were regulated by these highly active NCREs, they showed significantly higher expression levels and, maybe more importantly, they were much more intolerant to loss-of-function mutations.
[00:08:07] Speaker A: Could you just quickly define that loss-of-function intolerance score for us? The pLI score.
[00:08:11] Speaker B: Of course. The pLI score is a statistical measure. It tells you how often a gene is seen to be broken or mutated in the general population compared to what you'd expect by chance.
[00:08:21] Speaker A: So a high score means it's a really important gene.
[00:08:23] Speaker B: A very important gene. A high pLI score means you rarely see that gene broken in healthy people. Which implies that breaking even one copy is probably lethal or leads to a very severe developmental problem.
[00:08:35] Speaker A: And that correlation confirmed that these highly active NCREs are controlling the most critical genes for neural development.
[00:08:41] Speaker B: It did. They also used this really clever comparative approach, looking at the same elements in both neural stem cells and embryonic stem cells.
[00:08:50] Speaker A: The comparative ChIP-STARR-seq.
[00:08:51] Speaker B: Right. And this led them to identify something called primed enhancers.
[00:08:55] Speaker A: Primed enhancers.
[00:08:56] Speaker B: When they ran the functional screen in both cell types, they found sequences that were very active in the embryonic stem cells. They lit up the reporter. Yet when they looked at the chromatin of those same ESCs, those sequences didn't have the typical active histone mark, the H3K27ac. Instead they had H3K4 methylation.
[00:09:18] Speaker A: So wait, the functional test showed the switch was on, but the epigenetic marks showed it was only ready?
[00:09:23] Speaker B: Exactly. That epigenetic signature is the very definition of a primed enhancer.
It's like the cell puts a bookmark there, epigenetically marked for later activation.
[00:09:31] Speaker A: So it's planning ahead.
[00:09:33] Speaker B: It's planning ahead. As those cells differentiated into neurons and astrocytes, those primed NCREs gained the H3K27ac mark and became fully active. It's beautiful evidence that the developmental timing for building a brain is written into the non-coding genome at a very, very early stage.
[00:09:52] Speaker A: That's a huge insight into how the genome pre-plans its own development. But what about the real test for the AI? Did they prove that BRAIN-MAGNET's specific prediction scores actually matter biologically?
[00:10:03] Speaker B: They did. With some really robust validation experiments. They took some of these NCREs and made targeted edits. They deleted a tiny 30 base pair segment right on the spot where BRAIN-MAGNET gave a high CB score. And what happened? In 16 out of 17 cases, that small, precise deletion just tanked the NCRE's activity.
[00:10:21] Speaker A: Wow. And what if they deleted a region the AI said was unimportant?
[00:10:25] Speaker B: No effect. Deleting regions with low CB scores had no functional effect at all.
[00:10:29] Speaker A: That is the compelling evidence.
[00:10:30] Speaker B: It is. It confirms BRAIN-MAGNET can successfully pinpoint the single functionally critical letters inside these long, complex regulatory sequences. We're moving from just statistical association to a causal mechanism.
[00:10:46] Speaker A: So let's talk about the clinical implications, starting with fine-mapping common diseases from GWAS data.
[00:10:52] Speaker B: Right, so with the GWAS result, you often get a whole cluster of SNPs that are inherited together. It's called linkage disequilibrium, and it makes it really hard to tease them apart. But applying BRAIN-MAGNET to large sets of known functional SNPs from neuropsychiatric disorders, it just showed its superiority.
[00:11:09] Speaker A: How so?
[00:11:10] Speaker B: The key here is comparison. BRAIN-MAGNET's CB scores were significantly higher for variants that were already known to affect NCRE activity compared to nearby variants that were non-functional.
[00:11:21] Speaker A: And other tools couldn't do that when.
[00:11:22] Speaker B: They benchmarked it against other common scoring methods, tools like CADD, LINSIGHT and Enformer. Those other tools often fail to distinguish between the functional and the non-functional SNPs at the same locus.
[00:11:33] Speaker A: So BRAIN-MAGNET cuts through the statistical noise by adding direct functional insight that those other methods lack. Do you have an example?
[00:11:41] Speaker B: I do. At a schizophrenia-associated region on chromosome 6, there were tons of associated SNPs. Standard methods flagged a bunch of them. But BRAIN-MAGNET gave the single highest predictive score to a SNP called rs2483. And this was the exact variant that other labs had already experimentally validated as being the functional cause.
[00:12:02] Speaker A: That's incredible. It gives researchers an immediate high-confidence target.
[00:12:06] Speaker B: It does.
[00:12:07] Speaker A: Which brings us to the second and I think most exciting application, solving rare neurogenetic disorders. Finding these enhanceropathies.
[00:12:15] Speaker B: Exactly. The team screened data from the Genomics England 100,000 Genomes Project. They were looking for rare variants that landed on these high CB score motifs within NCREs that were already linked to known disease genes.
[00:12:29] Speaker A: And they found something.
[00:12:29] Speaker B: They found a fascinating case. It involved a rare neurodevelopmental disorder and the gene RAB7A.
[00:12:35] Speaker A: I know RAB7A. It's linked to Charcot-Marie-Tooth disease, right?
[00:12:38] Speaker B: Yeah.
[00:12:38] Speaker A: A peripheral nerve disorder.
[00:12:39] Speaker B: That's the one.
But in this case, they found a heterozygous rare variant in an NCRE, but it was 45,000 bases upstream of the gene.
[00:12:49] Speaker A: And it was right in a functional hotspot that BRAIN-MAGNET identified.
[00:12:52] Speaker B: Right on the money. And their experiments confirmed it. The variant disrupted a key binding site for the transcription factor YY1 and significantly reduced the NCRE's activity in a dish.
[00:13:04] Speaker A: Okay, so the variant functionally broke the switch. Did they show this had a real consequence in a living organism?
[00:13:09] Speaker B: They did. They used zebrafish, which are a great model for vertebrate development. When they introduced the patient-specific NCRE mutant, it resulted in reduced expression of a reporter gene, specifically in the central nervous system.
[00:13:21] Speaker A: Wow.
[00:13:22] Speaker B: It provided this powerful mechanistic genetic diagnosis based entirely on a non-coding variant that had been completely invisible before. A clear-cut case of an enhanceropathy.
[00:13:32] Speaker A: This really does sound like a game changer for diagnostics. But no tool is perfect. Yeah. What were some of the limitations the researchers highlighted?
[00:13:39] Speaker B: There are a couple of main hurdles. The first one is technical. The NCRE activity is measured episomally, meaning.
[00:13:46] Speaker A: Outside its normal spot in the chromosome, in that little reporter plasmid.
[00:13:49] Speaker B: Exactly. And while they showed that a lot of their findings do hold up endogenously, that episomal context can sometimes miss the complexity of the native chromatin environment. You know, loops, boundaries, other things that can tweak activity in a real cell.
[00:14:05] Speaker A: So you're measuring the enhancer's power in a vacuum, which is useful, but the cell might have other volume knobs installed in its natural habitat.
[00:14:13] Speaker B: That's a great way to put it.
And the second major hurdle is the clinical interpretation gap. BRAIN-MAGNET is incredibly accurate at predicting the impact on NCRE function, but it doesn't directly predict the clinical phenotype, the actual symptoms a patient will have.
[00:14:29] Speaker A: And why is that distinction so hard to make?
[00:14:31] Speaker B: Well, think about it. Deleting a whole gene might cause one severe, well-known syndrome.
[00:14:36] Speaker A: Right.
[00:14:36] Speaker B: But just disrupting a single enhancer that controls that gene might only cause a partial effect. Or maybe an effect in only one tissue. It could even cause a completely new disorder.
[00:14:46] Speaker A: So the tool gives us the target, but the clinical follow-up, connecting that target to the symptoms, is still a manual process.
[00:14:54] Speaker B: It has to be charted one case at a time.
[00:14:56] Speaker A: For now, that's vital context.
[00:14:58] Speaker B: Yeah.
[00:14:58] Speaker A: So let's bring it all together. What does this mean for us and for the future of precision medicine?
[00:15:03] Speaker B: I mean, BRAIN-MAGNET is this powerful, functionally validated computational tool. It's built on a massive foundation of real experimental data from the right cell.
[00:15:13] Speaker A: Type, neural stem cells.
[00:15:14] Speaker B: It moves us past simple statistical correlation. It lets us prioritize disease-relevant non-coding variants with high confidence. It really is like a magnet for finding those critical regulatory needles in the haystack of our genome, needles that have.
[00:15:30] Speaker A: Been totally inaccessible until now. This whole deep dive really makes you rethink how we classify genetic disease. And it leaves me with a pretty big question.
If NCRE dysfunction can cause diseases that only partially overlap with a gene's known syndrome, how many currently distinct, separate syndromes diagnosed decades ago are actually just variations of the same underlying enhanceropathy, differentiated only by which enhancer is broken and when it breaks during development? We might be on the verge of radically reorganizing our medical textbooks.
[00:16:01] Speaker B: This episode was based on an Open Access article under the CC BY 4.0 license. You can find a direct link to the paper and the license in our episode description. If you enjoyed this, follow or subscribe in your podcast app and leave a five-star rating. If you'd like to support our work, use the donation link in the description. Now, stay with us for an original track created especially for this episode and inspired by the article you've just heard about. Thanks for listening and join us next time as we explore more science, base by base.
[00:16:42] Speaker C: Pages of code in the quiet spine 98% left between the lines Neural sparks in a silent sea Signals hiding in the scenery Glass plates under silver light Stem cells dreaming of a future mind Tiny scripts in a winding chain Pulling patterns from the static rain We turn noise into a map Tracing echoes through the gaps Every base that leaves a trace Lights a path inside the brain. Longing for the silence you know Pulling secrets out of stone Reading where the signals flow In the dark between the codes When the wires misalign we can find a broken stone side Magnets in the silent frame Calling every hidden name Chromatic doors that can open wide Primed enhancers waiting on the T Pulses rising as the marks appear Neural future's coming in too clear Common threads in a billion lives Rare mutations on a fragile line Some will whisper, some will sound Which one tips the circuit out We turn noise into a map Tracing echoes through the gaps Every base that leaves a trace Lights a path inside the brain.
For the silence Pulling secrets out of stone Reading where the signals flow In the dark between the codes When the wires misalign we can find the broken side Magnets in the silent frame Calling every hidden name If a single letter falls far away from any walls Till the network feels a tear in the axe and in the air We can watch the atlas glow As the hidden currents show Its small changes Change inside the thread Bends the future left or right instead. Magnet for the silent genome Pulling secrets out of stone. Magnets in the silent frame Calling every hidden name.
