Episode 19

May 14, 2025

00:18:18

19: Systematic Identification of Promoter and UTR Variants in Rare Disease Diagnostics

Hosted by

Gustavo B Barra
Base by Base

Show Notes

Episode 19: Systematic Identification of Promoter and UTR Variants in Rare Disease Diagnostics

In this episode of Base by Base, we delve into Martin-Geary et al.’s (2025) framework for uncovering disease-causing variants within promoters and untranslated regions (UTRs) in individuals with rare disorders. Leveraging genome sequencing data from 8,040 undiagnosed trios in the Genomics England 100,000 Genomes Project, the authors combine precise region definitions based on MANE transcripts and ENCODE candidate cis-regulatory elements with stringent filtering and annotation tools (including VEP with UTRannotator, SpliceAI, CADD, PhyloP, and FABIAN) to prioritize de novo non-coding variants with a high likelihood of pathogenicity.

Key Highlights:
The study defines over 20 million bases of proximal promoter and UTR sequence across 1,567 dominant disease genes and excludes coding regions to focus on regulatory elements; it then filters de novo variants by allele frequency and region overlap, yielding 1,311 candidates prior to annotation.

Utilizing annotation thresholds calibrated for non-coding contexts, the pipeline prioritizes eleven de novo variants, nine of which match patients’ phenotypes and include both previously confirmed diagnoses (e.g., PAX6, MEF2C) and novel findings (e.g., SLC2A1, NIPBL, ZBTB18, SETD5, GNAS).

Clinical review and functional follow-up, such as RNA-seq and DNA methylation episignatures, validate the diagnostic potential of these variants.

Applying the same filters to ClinVar confirms high specificity (prioritizing 53.7% of known pathogenic variants while excluding 99.3% of benign ones), though sensitivity for promoter variants remains an area for improvement.

Finally, a burden test across 7,862 probands matched to controls shows no significant enrichment of prioritized promoter or UTR variants, underscoring challenges in detecting aggregate non-coding variant effects at current cohort sizes.
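The first filtering step described above (keep de novo variants that are rare and that overlap a defined promoter or UTR region) can be illustrated with a minimal Python sketch. This is not the authors' code: the `Region` and `Variant` classes, the `filter_candidates` function, the coordinates, and the `MAX_AF` cutoff are all hypothetical and chosen only for illustration; the episode does not state the allele-frequency threshold actually used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # 0-based, exclusive
    kind: str   # e.g. "promoter", "5_prime_UTR", "3_prime_UTR"


@dataclass(frozen=True)
class Variant:
    chrom: str
    pos: int    # 0-based position of the first reference base
    ref: str
    alt: str
    allele_frequency: float
    is_de_novo: bool


# Hypothetical cutoff for illustration only; not taken from the study.
MAX_AF = 0.0001


def overlapping_kinds(variant, regions):
    """Return the kinds of all regions that the variant's reference span overlaps."""
    v_end = variant.pos + len(variant.ref)
    return {
        r.kind
        for r in regions
        if r.chrom == variant.chrom and variant.pos < r.end and v_end > r.start
    }


def filter_candidates(variants, regions, max_af=MAX_AF):
    """Keep rare de novo variants that overlap at least one region of interest."""
    kept = []
    for v in variants:
        if not v.is_de_novo or v.allele_frequency > max_af:
            continue
        kinds = overlapping_kinds(v, regions)
        if kinds:
            kept.append((v, sorted(kinds)))
    return kept


# Toy data: made-up coordinates, not real genomic positions.
regions = [
    Region("chr1", 100, 200, "promoter"),
    Region("chr1", 200, 260, "5_prime_UTR"),
]
variants = [
    Variant("chr1", 150, "A", "G", 0.0, True),     # rare, de novo, in promoter
    Variant("chr1", 198, "ACGT", "A", 0.0, True),  # deletion spanning both regions
    Variant("chr1", 150, "C", "T", 0.01, True),    # too common
    Variant("chr1", 500, "C", "T", 0.0, True),     # outside all regions
    Variant("chr1", 210, "C", "T", 0.0, False),    # not de novo
]

candidates = filter_candidates(variants, regions)
for variant, kinds in candidates:
    print(variant.chrom, variant.pos, variant.ref, variant.alt, kinds)
```

A linear scan over regions keeps the sketch short; at the scale described in the episode (over 20 million bases of sequence), an interval tree or a dedicated tool for genomic interval intersection would be a more realistic choice.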

Conclusion:
Martin-Geary et al.’s framework demonstrates that routine interrogation of promoters and UTRs can yield actionable genetic diagnoses—albeit at modest incremental yield—and provides a reproducible, highly specific pipeline that can be integrated into clinical diagnostic workflows. As our understanding of regulatory genomics deepens and annotation tools improve, such approaches will become increasingly powerful for uncovering hidden causes of rare disease.

Reference:
Martin-Geary, A. C., Blakes, A. J. M., Dawes, R., et al. (2025). Systematic identification of disease-causing promoter and untranslated region variants in 8040 undiagnosed individuals with rare disease. Genome Medicine, 17, 40. https://doi.org/10.1186/s13073-025-01464-2.

License:
This episode is based on an open access article published under the Creative Commons Attribution 4.0 International License (CC BY 4.0) – https://creativecommons.org/licenses/by/4.0/
