Explore how the RNA-seq for rare diseases approach works

We are driven by science, and our product is rooted in rigorous research and innovation. Since 2014, the founding team has been dedicated to realizing the goal of leveraging RNA sequencing to transform rare disease diagnostics. After over a decade of research, development, and testing, we are proud to offer a product for research that stands out for its reliability and accuracy.

Measure the effect!

RNA sequencing makes it possible to measure the effect of a variant directly at the transcript level. Identifying aberrant expression and splicing events can point to the disease-causing gene. This approach has boosted diagnostic yield by an average of 15%, as shown by diverse research groups worldwide across a wide range of rare disorders.

RNA-seq for rare disease diagnostics, (c) OmicsDiscoveries GmbH
Genetic variants whose pathogenicity can be confirmed by RNA-seq, (c) OmicsDiscoveries GmbH

RNA-informed variant interpretation

Aberrant RNA events provide valuable insights into a wide range of variant types. Notably, variants of uncertain significance (VUS) – including intronic, synonymous, UTR, or structural variants – can be transformed into actionable findings. RNA analysis helps confirm the predicted effect of variants and identifies missed or overlooked variants that may not have been prioritized through DNA analysis.

Integrative approach to prioritize findings

By integrating aberrant gene expression, splicing results, DNA variants, and clinical data, we help geneticists narrow down the most relevant findings. Our award-winning prioritization algorithm, evaluated by the NIH-funded CAGI challenge, combines insights across multiple omics layers to score each variant.

Joint DNA, RNA and phenotype prioritization accelerates interpretation, (c) OmicsDiscoveries GmbH

The key studies and software

The following key studies and software, developed by our founding team, form the basis of our services.

drop
Detection of aberrant gene expression events in RNA sequencing data.

V. Yépez, C. Mertes, M. Müller, D. Klaproth-Andrade, L. Wachutka, L. Frésard, M. Gusic, I. Scheller, P. Goldberg, H. Prokisch, J. Gagneur
2021, Nat Protoc, 10.1038/s41596-020-00462-5

RNA sequencing (RNA-seq) has emerged as a powerful approach to discover disease-causing gene regulatory defects in individuals affected by genetically undiagnosed rare disorders. Pioneering studies have shown that RNA-seq could increase the diagnosis rates over DNA sequencing alone by 8–36%, depending on the disease entity and tissue probed. To accelerate adoption of RNA-seq by human genetics centers, detailed analysis protocols are now needed. We present a step-by-step protocol that details how to robustly detect aberrant expression levels, aberrant splicing and mono-allelic expression in RNA-seq data using dedicated statistical methods. We describe how to generate and assess quality control plots and interpret the analysis results. The protocol is based on the detection of RNA outliers pipeline (DROP), a modular computational workflow that integrates all the analysis steps, can leverage parallel computing infrastructures and generates browsable web page reports.

outrider
OUTRIDER: A statistical method for detecting aberrantly expressed genes in RNA sequencing data.

F. Brechtmann*, C. Mertes*, A. Matuseviciute*, V. Yépez, Z. Avsec, M. Herzog, D. Bader, H. Prokisch, J. Gagneur 2018, AJHG, 10.1016/j.ajhg.2018.10.025

RNA sequencing (RNA-seq) is gaining popularity as a complementary assay to genome sequencing for precisely identifying the molecular causes of rare disorders. A powerful approach is to identify aberrant gene expression levels as potential pathogenic events. However, existing methods for detecting aberrant read counts in RNA-seq data either lack assessments of statistical significance, so that establishing cutoffs is arbitrary, or rely on subjective manual corrections for confounders. Here, we describe OUTRIDER (Outlier in RNA-Seq Finder), an algorithm developed to address these issues. The algorithm uses an autoencoder to model read-count expectations according to the gene covariation resulting from technical, environmental, or common genetic variations. Given these expectations, the RNA-seq read counts are assumed to follow a negative binomial distribution with a gene-specific dispersion. Outliers are then identified as read counts that significantly deviate from this distribution. The model is automatically fitted to achieve the best recall of artificially corrupted data. Precision-recall analyses using simulated outlier read counts demonstrated the importance of controlling for covariation and significance-based thresholds. OUTRIDER is open source and includes functions for filtering out genes not expressed in a dataset, for identifying outlier samples with too many aberrantly expressed genes, and for detecting aberrant gene expression on the basis of false-discovery-rate-adjusted p values. Overall, OUTRIDER provides an end-to-end solution for identifying aberrantly expressed genes and is suitable for use by rare-disease diagnostic platforms.
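The statistical test described in the abstract above can be sketched in a few lines of Python. This is a minimal illustration, not the OUTRIDER implementation: the expected counts are passed in as plain numbers (in OUTRIDER they come from the autoencoder), the dispersion value is made up, the function names are hypothetical, and Benjamini-Hochberg is swapped in as a common FDR procedure because the abstract does not name the one used.

```python
import math

def nb_logpmf(k, mu, theta):
    """Log-probability of count k under a negative binomial with mean mu
    and dispersion (size) theta, so that variance = mu + mu**2 / theta."""
    return (math.lgamma(k + theta) - math.lgamma(theta) - math.lgamma(k + 1)
            + theta * math.log(theta / (theta + mu))
            + k * math.log(mu / (theta + mu)))

def nb_cdf(k, mu, theta):
    """P(X <= k), summed term by term (fine for small illustrative counts)."""
    if k < 0:
        return 0.0
    return min(1.0, sum(math.exp(nb_logpmf(i, mu, theta)) for i in range(k + 1)))

def two_sided_pvalue(k, mu, theta):
    """Two-sided p-value: twice the smaller tail, capped at 1."""
    lower = nb_cdf(k, mu, theta)            # P(X <= k)
    upper = 1.0 - nb_cdf(k - 1, mu, theta)  # P(X >= k)
    return min(1.0, 2.0 * min(lower, upper))

def benjamini_hochberg(pvalues):
    """FDR-adjusted p-values (assumption: BH chosen for illustration)."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted

def call_outliers(observed, expected, theta, alpha=0.05):
    """Flag counts whose FDR-adjusted p-value falls below alpha."""
    pvals = [two_sided_pvalue(k, mu, theta) for k, mu in zip(observed, expected)]
    return [p_adj < alpha for p_adj in benjamini_hochberg(pvals)]

if __name__ == "__main__":
    # Hypothetical read counts for one gene across five samples.
    observed = [95, 110, 102, 5, 98]
    expected = [100.0] * 5   # stand-in for the model's expectations
    print(call_outliers(observed, expected, theta=20.0))
```

In this toy input only the fourth sample, with 5 reads against an expectation of 100, is flagged; the remaining counts sit well within the spread the dispersion allows.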

fraser
Detection of aberrant splicing events in RNA-seq data using FRASER.

C. Mertes*, I. Scheller*, V. Yépez, M. Çelik, Y. Liang, L. Kremer, M. Gusic, H. Prokisch, J. Gagneur 2021, Nat Commun, 10.1038/s41467-020-20573-7

Aberrant splicing is a major cause of rare diseases. However, its prediction from genome sequence alone remains in most cases inconclusive. Recently, RNA sequencing has proven to be an effective complementary avenue to detect aberrant splicing. Here, we develop FRASER, an algorithm to detect aberrant splicing from RNA sequencing data. Unlike existing methods, FRASER captures not only alternative splicing but also intron retention events. This typically doubles the number of detected aberrant events and identified a pathogenic intron retention in MCOLN1 causing mucolipidosis. FRASER automatically controls for latent confounders, which are widespread and affect sensitivity substantially. Moreover, FRASER is based on a count distribution and multiple testing correction, thus reducing the number of calls by two orders of magnitude over commonly applied z score cutoffs, with a minor loss of sensitivity. Applying FRASER to rare disease diagnostics is demonstrated by reprioritizing a pathogenic aberrant exon truncation in TAZ from a published dataset. FRASER is easy to use and freely available.
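To make the distinction between alternative splicing and intron retention concrete, the sketch below computes two count ratios from split and non-split reads. The abstract above does not define its metrics, so the formulas are an assumption made for illustration: a percent-spliced-in value per junction (share of a donor site's split reads that use a given acceptor) and a splicing efficiency per donor site (share of reads at the site that are spliced at all). Function names and the input layout are hypothetical.

```python
from collections import defaultdict

def psi5(split_counts):
    """Share of split reads from each donor that use each acceptor.

    split_counts maps (donor, acceptor) -> number of split reads.
    A shift in this ratio points to alternative acceptor usage.
    """
    donor_totals = defaultdict(int)
    for (donor, _acceptor), n in split_counts.items():
        donor_totals[donor] += n
    return {
        junction: n / donor_totals[junction[0]]
        for junction, n in split_counts.items()
        if donor_totals[junction[0]] > 0
    }

def splicing_efficiency(split_counts, nonsplit_counts):
    """Share of reads at each donor site that are spliced.

    nonsplit_counts maps donor -> reads running across the exon-intron
    boundary without being spliced. A low value suggests intron retention.
    """
    split_totals = defaultdict(int)
    for (donor, _acceptor), n in split_counts.items():
        split_totals[donor] += n
    result = {}
    for donor, n_split in split_totals.items():
        total = n_split + nonsplit_counts.get(donor, 0)
        if total > 0:
            result[donor] = n_split / total
    return result

if __name__ == "__main__":
    # Hypothetical counts: one donor site with two acceptors.
    split = {("d1", "a1"): 90, ("d1", "a2"): 10}
    nonsplit = {"d1": 25}
    print(psi5(split))
    print(splicing_efficiency(split, nonsplit))
```

Ratios like these are only the input to outlier detection; the significance testing and confounder correction described in the abstract are not reproduced here.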

Clinical implementation of RNA sequencing for Mendelian disease diagnostics.

V. A. Yépez*, M. Gusic*, R. Kopajtich, C. Mertes, N. Smith, ..., K. Murayama, T. Meitinger, J. Gagneur, H. Prokisch 2022, Genome Medicine, 10.1186/s13073-022-01019-9

Background
Lack of functional evidence hampers variant interpretation, leaving a large proportion of individuals with a suspected Mendelian disorder without genetic diagnosis after whole genome or whole exome sequencing (WES). Research studies advocate to further sequence transcriptomes to directly and systematically probe gene expression defects. However, collection of additional biopsies and establishment of lab workflows, analytical pipelines, and defined concepts in clinical interpretation of aberrant gene expression are still needed for adopting RNA sequencing (RNA-seq) in routine diagnostics.

Methods
We implemented an automated RNA-seq protocol and a computational workflow with which we analyzed skin fibroblasts of 303 individuals with a suspected mitochondrial disease that previously underwent WES. We also assessed through simulations how aberrant expression and mono-allelic expression tests depend on RNA-seq coverage.

Results
We detected on average 12,500 genes per sample including around 60% of all disease genes—a coverage substantially higher than with whole blood, supporting the use of skin biopsies. We prioritized genes demonstrating aberrant expression, aberrant splicing, or mono-allelic expression. The pipeline required less than 1 week from sample preparation to result reporting and provided a median of eight disease-associated genes per patient for inspection. A genetic diagnosis was established for 16% of the 205 WES-inconclusive cases. Detection of aberrant expression was a major contributor to diagnosis including instances of 50% reduction, which, together with mono-allelic expression, allowed for the diagnosis of dominant disorders caused by haploinsufficiency. Moreover, calling aberrant splicing and variants from RNA-seq data enabled detecting and validating splice-disrupting variants, of which the majority fell outside WES-covered regions.

Conclusion
Together, these results show that streamlined experimental and computational processes can accelerate the implementation of RNA-seq in routine diagnostics.

CAGI 6 challenge winner

Predicting molecular events underlying rare diseases using variant annotation, aberrant gene expression events, and human phenotype ontology.

V. A. Yépez, N. H. Smith, I. Scheller, J. Gagneur, C. Mertes, 2023 Research Square, 10.21203/rs.3.rs-3405211/v1

Rare genetic diseases often pose significant challenges for diagnosis. Over the past years, RNA sequencing and other omics modalities have emerged as complementary strategies to DNA sequencing to enhance diagnostic success. In the 6th round of the Critical Assessment of Genome Interpretation (CAGI), the SickKids clinical genomes and transcriptomes challenge aimed to evaluate the diagnostic potential of multi-omics approaches in identifying and resolving undiagnosed genetic disorders. Here, we present our participation in that challenge, where we leveraged genomic, transcriptomic, and clinical data from 79 children with diverse suspected Mendelian disorders to develop a model predicting the causal gene. We employed a machine learning model trained on a cohort of 93 solved mitochondrial disease samples to prioritize candidate genes. In our analysis of the SickKids cohort, we successfully prioritized the causal genes in 2 out of the 3 diagnosed individuals exhibiting abnormalities at the RNA-seq level and 6 cases out of the 12 where no effect on RNA was seen, making our solution one of the winning ones. The challenge and our approach highlight the invaluable contributions of an integrative analysis of genetic, transcriptomic, and clinical data to pinpoint the disease-causing gene. The challenge was evaluated using three previously diagnosed individuals in which RNA-seq data proved helpful for diagnostics together with twelve individuals diagnosed solely through DNA analysis. Some of those cases were reported after the challenge by Deshwar et al. Our model was able to prioritize 2 out of the 3 RNA-seq supported cases on the top 3 ranks, while reaching a recall of over 50% under the top 100 genes across all 15 cases.

Other studies showcasing the use of RNA-seq in rare disease diagnostics:

Genetic diagnosis of Mendelian disorders via RNA sequencing.

L. S. Kremer, D. M. Bader, C. Mertes, R. Kopajtich, ..., T. Meitinger, J. Gagneur@ and H. Prokisch@, 2017, Nat Commun, 10.1038/ncomms15824

Across a variety of Mendelian disorders, ~50–75% of patients do not receive a genetic diagnosis by exome sequencing indicating disease-causing variants in non-coding regions. Although genome sequencing in principle reveals all genetic variants, their sizeable number and poorer annotation make prioritization challenging. Here, we demonstrate the power of transcriptome sequencing to molecularly diagnose 10% (5 of 48) of mitochondriopathy patients and identify candidate genes for the remainder. We find a median of one aberrantly expressed gene, five aberrant splicing events and six mono-allelically expressed rare variants in patient-derived fibroblasts and establish disease-causing roles for each kind. Private exons often arise from cryptic splice sites providing an important clue for variant prioritization. One such event is found in the complex I assembly factor TIMMDC1 establishing a novel disease-associated gene. In conclusion, our study expands the diagnostic tools for detecting non-exonic variants and provides examples of intronic loss-of-function variants with pathological relevance.

Transcriptome-directed analysis for Mendelian disease diagnosis overcomes limitations of conventional genomic testing.

D. Murdock, H. Dai, L. Burrage, J. Rosenfeld, S. Ketkar, M. Müller, V. Yépez, J. Gagneur, ..., Undiagnosed Diseases Network, B. Lee, 2021, J Clin Invest, 10.1172/JCI141500

Background
Transcriptome sequencing (RNA-seq) improves diagnostic rates in individuals with suspected Mendelian conditions to varying degrees, primarily by directing the prioritization of candidate DNA variants identified on exome or genome sequencing (ES/GS). Here we implemented an RNA-seq–guided method to diagnose individuals across a wide range of ages and clinical phenotypes.

Methods
One hundred fifteen undiagnosed adult and pediatric patients with diverse phenotypes and 67 family members (182 total individuals) underwent RNA-seq from whole blood and skin fibroblasts at the Baylor College of Medicine (BCM) Undiagnosed Diseases Network clinical site from 2014 to 2020. We implemented a workflow to detect outliers in gene expression and splicing for cases that remained undiagnosed despite standard genomic and transcriptomic analysis.

Results
The transcriptome-directed approach resulted in a diagnostic rate of 12% across the entire cohort, or 17% after excluding cases solved on ES/GS alone. Newly diagnosed conditions included Koolen–de Vries syndrome (KANSL1), Renpenning syndrome (PQBP1), TBCK-associated encephalopathy, NSD2- and CLTC-related intellectual disability, and others, all with negative conventional genomic testing, including ES and chromosomal microarray (CMA). Skin fibroblasts exhibited higher and more consistent expression of clinically relevant genes than whole blood. In solved cases with RNA-seq from both tissues, the causative defect was missed in blood in half the cases but none from fibroblasts.

Conclusions
For our cohort of undiagnosed individuals with suspected Mendelian conditions, transcriptome-directed genomic analysis facilitated diagnoses, primarily through the identification of variants missed on ES and CMA.

Phenotype-driven genomics enhance diagnosis in children with unresolved neuromuscular diseases.

B. Estévez-Arias, L. Matalonga, ..., R. Luknárová, A. Esteve-Codina, M. Gut, S. Laurie, G. Demidov, V. A. Yépez, S. Beltran, J. Gagneur, ..., F. Palau@ and D. Natera-de Benito@, 2024, Eur J Hum Genet, 10.1038/s41431-024-01699-4

Establishing a molecular diagnosis remains challenging in half of individuals with childhood-onset neuromuscular diseases (NMDs) despite exome sequencing. This study evaluates the diagnostic utility of combining genomic approaches in undiagnosed NMD patients. We performed deep phenotyping of 58 individuals with unsolved childhood-onset NMDs that have previously undergone inconclusive exome studies. Genomic approaches included trio genome sequencing and RNASeq. Genetic diagnoses were reached in 23 out of 58 individuals (40%). Twenty-one individuals carried causal single nucleotide variants (SNVs) or small insertions and deletions, while 2 carried pathogenic structural variants (SVs). Genomic sequencing identified pathogenic variants in coding regions or at the splice site in 17 out of 21 resolved cases, while RNA sequencing was additionally required for the diagnosis of 4 cases. Reasons for previous diagnostic failures included low coverage in exonic regions harboring the second pathogenic variant and involvement of genes that were not yet linked to human diseases at the time of the first NGS analysis. In summary, our systematic genetic analysis, integrating deep phenotyping, trio genome sequencing and RNASeq, proved effective in diagnosing unsolved childhood-onset NMDs. This approach holds promise for similar cohorts, offering potential improvements in diagnostic rates and clinical management of individuals with NMDs.

Want to know more?

Reach out to us today!

We're here to answer all your questions and help you make the best decision.