Human Mutation, Volume 43, Issue 8, p. 1056-1070
SPECIAL ARTICLE, Open Access

Guidelines for clinical interpretation of variant pathogenicity using RNA phenotypes

Dmitrii Smirnov (orcid.org/0000-0002-5802-844X), Lea D. Schlieben, Fatemeh Peymani, Riccardo Berutti, Holger Prokisch (corresponding author, orcid.org/0000-0003-2379-6286)

Affiliations (all authors): School of Medicine, Institute of Human Genetics, Technical University of Munich, Munich, Germany; Institute of Neurogenomics, Computational Health Center, Helmholtz Zentrum München, Neuherberg, Germany

Correspondence: Holger Prokisch, School of Medicine, Institute of Human Genetics, Technical University of Munich, Munich, Germany.
Email: [email protected]
First published: 29 May 2022
https://doi.org/10.1002/humu.24416

Abstract

Over the last 5 years, RNA sequencing (RNA-seq) has been established and is increasingly applied as an effective approach complementary to DNA sequencing in molecular diagnostics. Currently, three RNA phenotypes, aberrant expression, aberrant splicing, and allelic imbalance, are considered to provide information about pathogenic variants. By providing a high-throughput, transcriptome-wide functional readout on variants causing aberrant RNA phenotypes, RNA-seq has increased diagnostic rates by about 15% over whole-exome sequencing. This breakthrough encouraged the development of computational tools and pipelines aiming to streamline RNA-seq analysis for implementation in clinical diagnostics. Although a number of studies showed the added value of RNA-seq for the molecular diagnosis of individuals with Mendelian disorders, there is no formal consensus on assessing variant pathogenicity strength based on RNA phenotypes. Taking RNA-seq as a functional assay for genetic variants, we evaluated the value of statistical significance and effect size of RNA phenotypes as evidence for the strength of variant pathogenicity. This was determined by the analysis of 394 pathogenic variants, of which 198 were associated with aberrant RNA phenotypes, and 723 benign variants.
Overall, this study seeks to establish recommendations for integrating functional RNA-seq data into the American College of Medical Genetics and Genomics and the Association for Molecular Pathology guidelines classification system.

1 INTRODUCTION

ACMG guidelines to standardize clinical variant interpretation

Routine clinical implementation of whole-exome (WES), whole-genome, and panel sequencing has led to the detection of thousands of rare variants per patient, shifting the major challenge of genetic testing from variant detection toward variant interpretation. To standardize the diagnostic process, the American College of Medical Genetics and Genomics and the Association for Molecular Pathology (ACMG/AMP) established guidelines for the interpretation of genetic variants identified by DNA sequencing (DNA-seq) in 2015 (Richards et al., 2015). The ACMG/AMP guidelines comprise 28 criteria stratified by the type and level of strength of evidence of variant pathogenicity. When combined, these criteria contribute to the classification of variants into a five-tiered system: pathogenic (P), likely pathogenic (LP), variant of uncertain significance (VUS), likely benign (LB), or benign (B) (Figure 1a).

Figure 1. Distribution of clinically relevant variants reported in the ClinVar database. (a) The proportion of variants reported in ClinVar was stratified by their clinical significance. (b) The proportion of Pathogenic/Likely Pathogenic variants stratified by the variant types. (c) The proportion of VUS stratified by their variant types. UTR, synonymous, intronic, PTV and duplication variants have the potential to affect RNA phenotypes and are indicated by the gray line. The indicated percentages are according to the data extracted from Simple ClinVar on May 31, 2021.
The dashed gray line indicates the expected proportion (25%) of missense variants having the potential to alter RNA phenotypes (Cartegni et al., 2002; Dionnet et al., 2020; Savisaar & Hurst, 2017). PTV variants include nonsense, frameshift, splice donor, splice acceptor, and deletion variants. VUS, variant of uncertain significance; PTV, protein-truncating variants; UTR, untranslated region.

Variant types and their pathogenicity

While less than 20% of the variants submitted to ClinVar (Landrum et al., 2014, 2016), a public server of genetic variants and their clinical significance, are classified as likely pathogenic/pathogenic and about 30% are likely benign/benign, more than 50% fall into the category of VUS (Figure 1a) (Pérez-Palma et al., 2019). Protein truncating variants (PTVs; nonsense, frameshift, canonical splice sites [±1 or ±2 intronic positions], initiation codon, and deletion) represent the most frequent type of variants in the pathogenic and likely pathogenic categories. Pathogenic PTVs result in the absence of a functionally important part of the expressed protein or trigger nonsense-mediated RNA decay (NMD), leading to no or minimal amounts of the expressed truncated protein (Brandt et al., 2020) (Figure 1b). Therefore, PTVs are the only variant type that can be assigned the very strong level of pathogenicity (PVS1) purely based on computational predictions. In combination with at least one moderate criterion, like matching a patient's phenotype, such variants are classified as likely pathogenic (Richards et al., 2015).

Variants of uncertain significance

Variants with less clearly predicted molecular consequences and insufficient or conflicting evidence are classified as VUS (Figure 1c). The largest fraction of VUS consists of missense and inframe indel (insertion/deletion) variants. For those variants, the prediction of the functional consequences and clinical relevance has low accuracy.
Moreover, VUS in the noncoding regions (intronic, intergenic, untranslated region [UTR], etc.) are rarely prioritized by diagnostic pipelines but have the potential to affect gene expression or splicing and cause aberrant RNA phenotypes resulting in clinically relevant reduced protein function. Through the widespread usage of high-throughput DNA-seq techniques, variant detection is outpacing the ability of variant interpretation, consequently leading to a constantly increasing number of VUS (Starita et al., 2017). According to ACMG/AMP guidelines, VUS cannot be the basis for clinical decision making; additional evidence is required to clarify the functional consequences of these variants.

Functional assays for reclassifying VUS and limitations

Functional data has been shown to be one of the best types of evidence for the reclassification of VUS. Hence the ACMG/AMP framework determines well-established in vivo or in vitro functional studies as strong evidence (PS3/BS3) for variant interpretation (Brnich et al., 2018; Richards et al., 2015). However, as functional assays are typically gene-specific and require special knowledge and equipment, they are only rarely established in routine clinical diagnostics (Gelman et al., 2019). In addition, variants are often private to each patient and have not been tested beforehand. High-throughput functional assays are needed to test the full spectrum of genetic variants in each gene. Such assays have been developed for some genes focussing on coding variants (Findlay et al., 2018; Matreyek et al., 2018) but are much more difficult for noncoding variants. Hence, novel strategies helping variant interpretation are required.

RNA sequencing (RNA-seq) as transcriptome-wide functional read-out

RNA-seq, a genome-wide tool for functional characterization and quantification of transcript levels and isoforms, can aid variant interpretation when applied to a patient sample.
It quantifies gene expression and splicing and allows for the detection of relative changes in RNA phenotypes within patient cohorts. RNA-seq analysis facilitates validation of regulatory effects of VUS located in coding and noncoding regions on RNA phenotypes for thousands of genes in a single standardized assay. Depending on the tissue, this may cover up to 90% of known disease genes (Gonorazky et al., 2019; Yépez et al., 2022). Moreover, the comprehensive transcriptome-wide analysis may discover disease-relevant RNA phenotypes not expected based on the interpretation of genome sequences. The universal functional readout helps to streamline the functional interpretation of variants and at the same time provides information on the normal physiological range of RNA phenotypes for all expressed genes not affected by the disease. Statistical analysis of RNA-seq data thereby enables the systematic identification of aberrant RNA phenotypes, defined as (1) genes expressed at aberrant levels, (2) monoallelic expressed variants, and (3) aberrantly spliced genes (Figure 2) (Cummings et al., 2017; Frésard et al., 2019; Gonorazky et al., 2019; Kremer et al., 2017). The ability to detect these outlier events makes RNA-seq an invaluable tool for the reclassification of VUS.

Figure 2. RNA phenotypes caused by genetic defects. RNA-seq enables the detection of aberrant RNA phenotypes via (1) aberrant expression, (2) monoallelic expression, and (3) aberrant splicing. Aberrant RNA phenotypes can be caused by a broad spectrum of distinct variants in exonic, intronic, and regulatory regions. Different outlier detection methods have been adapted or developed for the analysis of RNA-seq data. Aberrant RNA phenotypes are labeled in purple.
ANEVA-DOT, ANalysis of Expression VAriation-Dosage Outlier Test; FRASER, Find RAre Splicing Events in RNA-seq; OUTRIDER, Outlier in RNA-Seq Finder; PTV, protein-truncating variant; RNA-seq, RNA sequencing; SPOT, SPlicing Outlier deTection; UTR, untranslated region.

2 ABERRANT RNA PHENOTYPES

Aberrant expression

Aberrant expression, identified as gene expression outliers outside the physiological range, often presents with low levels of gene expression (Kremer et al., 2017). Depending upon whether one or both alleles are affected, a moderate or severe reduction in gene expression and consequently protein function is observed. Transcripts with nonsense variants are frequently degraded via nonsense-mediated decay, which can be detected as aberrant underexpression of genes. Besides nonsense and frameshift variants, splice variants also often result in the creation of premature termination codons. Additionally, noncoding variants in regulatory regions such as promoters, enhancers, or suppressors, variants in the untranslated or intronic region, or large deletions have the potential to cause aberrant underexpression of disease genes (Ferraro et al., 2020). Gene expression levels are quantified by the number of read counts mapping to transcript isoforms of genes. These read counts thereby allow measuring the impact of variants on the steady-state RNA expression level. Within the first study applying RNA-seq in rare disease diagnostics, outliers were originally called by DESeq2, a method developed for differential gene expression analysis (Kremer et al., 2017; Love et al., 2014). Other studies did not apply a formal statistical test, but computed z-scores on the log-transformed gene-length-normalized read counts and used manually defined thresholds to define aberrant expression (Cummings et al., 2017; Gonorazky et al., 2019).
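The z-score approach described for these earlier studies can be sketched as follows. This is a minimal illustration and not the code used in any of the cited studies: the function names, the pseudocount of 1, and the cutoff of 3 are assumptions chosen for the example, and a real analysis would additionally correct for sequencing depth and other confounders.

```python
import math
from statistics import mean, stdev


def expression_z_scores(counts_by_sample, gene_length_kb, pseudocount=1.0):
    """Z-scores of log2 gene-length-normalized counts for ONE gene across samples.

    counts_by_sample: dict mapping sample ID -> raw read count for the gene.
    gene_length_kb and pseudocount are illustrative parameters (assumptions).
    """
    # Length-normalize, then log-transform; the pseudocount avoids log2(0).
    log_expr = {
        sample: math.log2(count / gene_length_kb + pseudocount)
        for sample, count in counts_by_sample.items()
    }
    mu = mean(log_expr.values())
    sigma = stdev(log_expr.values())  # sample standard deviation
    if sigma == 0:
        raise ValueError("no variation across samples; z-scores are undefined")
    return {sample: (value - mu) / sigma for sample, value in log_expr.items()}


def call_outliers(z_scores, cutoff=3.0):
    """Return sample IDs whose |z| reaches a manually defined cutoff (assumed: 3)."""
    return sorted(s for s, z in z_scores.items() if abs(z) >= cutoff)


# Hypothetical cohort: 19 samples with similar counts and one underexpressing sample.
cohort = {f"S{i}": 950 + 10 * i for i in range(19)}
cohort["P1"] = 100
z = expression_z_scores(cohort, gene_length_kb=2.0)
outliers = call_outliers(z)
```

Because the mean and standard deviation are estimated from the same cohort that contains the outlier, the largest attainable |z| is bounded by the cohort size, which is one reason a fixed manual cutoff behaves differently across cohorts.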
Later, specific methods such as OUTRIDER (OUTlier in RNA-seq fInDER; Brechtmann et al., 2018) have been developed for the systematic detection of expression outliers in RNA-seq data.

Monoallelic expression (MAE)

Apart from aberrant expression, RNA-seq provides information about allele-specific expression, whereby primarily one of the two alleles is expressed (at least 80% of reads as defined by Yepez, Mertes, et al., 2021) and can be detected as MAE. MAE is a specific form of aberrant expression and an extreme form of allelic imbalance. It often escapes detection by aberrant expression since expression of mainly one allele does not always result in expression levels outside the physiological range (Yépez et al., 2022). Nevertheless, MAE can indicate the presence of a clinically relevant situation. Under the assumption of a recessive inheritance model, rare monoallelic DNA variants are not prioritized. Detection of MAE of a rare variant therefore indicates a previously unidentified defect of the second allele, such as a promoter variant resulting in loss of expression of the second allele. Hence, MAE can reprioritise rare heterozygous variants detected by DNA-seq. The reasons for reduced expression of an allele in MAE can be diverse, genetic as well as epigenetic, such as inactivation of the X chromosome and imprinting of autosomal genes (Bartolomei, 2009; Ferraro et al., 2020; J. T. Lee & Bartolomei, 2013; Lyon, 1961). Using RNA-seq, monoallelic events are detected by counting the reads aligned to each expressed allele at genomic positions of heterozygous single-nucleotide variants. Different methods have been developed for MAE detection, including the negative binomial test (Kremer et al., 2017) and ANEVA-DOT (ANalysis of Expression Variation-Dosage Outlier Test) (Mohammadi et al., 2019).
While the negative binomial test uses a fixed dispersion for all genes, ANEVA-DOT takes into account gene-specific variance, which promises better performance. However, as ANEVA-DOT is not applicable to all genes so far, the negative binomial test has been mostly applied for MAE detection.

Aberrant splicing

Finally, aberrant splicing of a gene is a long-known cause of genetic diseases, which can be detected by RNA-seq (Scotti & Swanson, 2016; Singh & Cooper, 2012; Tazi et al., 2009). The majority of human genes are spliced, usually resulting in multiple transcript isoforms. Because splicing is a tightly regulated process, various variant types can disrupt it. The most canonical example, splice site variants located at the exon−intron boundary, frequently, but not always, lead to clear splice defects. In addition, intronic and coding variation can lead to splicing disruption. Quantitative predictions of aberrant splicing, based on genetic variants outside the splice regions, are usually inaccurate and rarely provide sufficient evidence for assessing the variants' pathogenicity (Ferraro et al., 2020). RNA-seq allows quantification of splicing events by detection of split reads, whose ends align to distinct sequence elements. For accurate detection of aberrant splicing for diagnostic purposes, different methods including FRASER (Find Rare Splicing Events in RNA-seq) (Mertes et al., 2021), SPOT (SPlicing Outlier deTection) (Ferraro et al., 2020), and LeafCutter/LeafCutterMD (LeafCutter for Mendelian disease) (Jenkinson et al., 2020; Y. I. Li et al., 2018) have been established.

Introduction of RNA-seq data into the ACMG/AMP variant interpretation framework using evidence strength

Across RNA-seq studies, different statistical methods, metrics and thresholds were used to identify outliers and subsequently provide pathogenicity evidence for the underlying variants.
In addition, various technical and biological factors can have an impact on the RNA-seq readout, bringing uncertainty in evidence strength. Although the diagnostic benefit in aiding variant interpretation in rare diseases has been shown within these studies, no detailed thresholds and recommendations exist. Aiming to standardize diagnostic procedures and integrate RNA-seq analysis in the ACMG/AMP framework, we evaluated quantitative metrics of RNA phenotypes and provide recommendations on RNA-seq application in clinical practice. Our recommendations on quantitative RNA-seq data interpretation follow the evidence strength evaluation proposed by Brnich et al. (2019) and rest on the performance of RNA phenotypes in classifying variants as pathogenic or benign.

3 MATERIALS AND METHODS

Public data acquisition and analysis cohort

For the analysis of the diagnostic power of clinical RNA-seq, we collected data from eight studies systematically detecting RNA phenotypes with a minimum of 25 cases (Cummings et al., 2017; Frésard et al., 2019; Gonorazky et al., 2019; Kopajtich et al., 2021; Kremer et al., 2017; H. Lee et al., 2020; Murdock et al., 2021; Yépez et al., 2022; Supporting Information: Table S1). Causal gene and variant information, as well as available data on RNA phenotypes, from 178 genetically diagnosed cases were extracted from the text and the Supporting Information Material of the corresponding studies (Supporting Information: Table S2). This data set includes 119 cases from the Yépez et al. (2022) study, for which WES and RNA-seq data were available in-house. All individuals included in the study or their legal guardians provided written informed consent before evaluation, in agreement with the Declaration of Helsinki and approved by the ethical committees of the centers participating in this study, where biological samples were obtained.

Whole exome sequencing data and analysis

Variant annotation of WES data was performed as described in Yépez et al. (2022). In brief, reads were aligned to the human reference genome (UCSC build hg19) using the Burrows−Wheeler Aligner (BWA) v0.7.5a (H. Li & Durbin, 2009). Variants were called with Genome Analysis ToolKit (GATK) v3.8 (Van der Auwera et al., 2013) and annotated with Variant Effect Predictor (VEP) v1.32.0 (McLaren et al., 2016). In addition, automatic interpretation of rare variants (minor allele frequency [MAF] < 0.01) with ACMG guidelines was performed with InterVar software using default parameters (Li & Wang, 2017).

RNA-seq data analysis

For quantification and analysis of RNA phenotype metrics, the compendium of RNA-seq data described in Yépez et al. (2022) was used. The compendium includes 70 individuals from Kremer et al. (2017), 152 individuals from Kopajtich et al. (2021), and 81 additional individuals recruited by Yépez et al. (2022). The data set consists of 303 fibroblast cell lines derived from patients with suspected Mendelian disorders. Gene expression and splicing counts are available via Zenodo: strand-specific (Yepez, 2021) and nonstrand specific (Yepez, et al., 2021). Aberrant RNA phenotypes were detected as described in the Yépez et al. (2022) study using the DROP pipeline. In brief, aberrant expression was detected using the OUTRIDER package (Brechtmann et al., 2018), and four metrics were obtained: fold-change, z-score, p value and p adjusted. For this study, OUTRIDER was selected for aberrant expression detection as it has been shown to outperform other methods based on the z-score transformation of RNA-seq data in three different benchmarks (Brechtmann et al., 2018).
Aberrant splicing was called with the FRASER package (Mertes et al., 2021), resulting in the following metrics: delta PSI (delta percent spliced in, Δψ) and delta Theta (delta of splicing efficiency, Δθ) calculated for both 5′ and 3′ splice sites, as well as p value and p adjusted. The algorithm uses RNA-seq split reads, non-contiguous reads whose ends align to two separated genomic locations of the same chromosome strand and are, therefore, evidence of splicing events. The percent-spliced-in (ψ) is calculated as the ratio between split-reads spanning the given intron and all split-reads sharing the same donor (5′) or acceptor site (3′), respectively. The splicing efficiency (θ) is calculated as the ratio of all split-reads to the full read coverage at a given splice site. Although other methods exist for calling aberrant splicing events, such as SPOT and LeafCutterMD, FRASER was the method of choice for this study. Within a benchmarking study of three different aberrant splicing detection methods, FRASER obtained the highest enrichment of rare splice variants (Mertes et al., 2021). MAE was detected using the negative binomial test (Kremer et al., 2017) computing, for each heterozygous variant, an alternative allele ratio, p value and p adjusted. The allelic ratio is defined for each heterozygous variant as the number of reads mapped to the alternative allele divided by the total number of reads mapped at this position. No formal benchmarking has been done to evaluate the performance of methods detecting MAE. However, since ANEVA-DOT (v.0.1.1) is currently limited to 6365 genes expressed in fibroblasts, the negative binomial test was chosen for the detection of monoallelic events.

Variant classification based on predicted functional consequence

A series of variant categorizations was performed based on the predicted functional consequence.
First, for the analysis of variants reported in the ClinVar database, nonsense, frameshift, canonical splice sites (±1 or ±2 intronic positions), initiation codon, and single or multiexon deletions were categorized as "PTV." Next, for the variants reported as pathogenic in the eight RNA-seq studies, we grouped promoter, 5′ untranslated region (5′ UTR), 3′ UTR, in-frame indel, and start-loss variants into the category "Other" due to the small number of individuals carrying them. For all subsequent analyses, variants were divided into four types based on their location and predicted functional consequence. "PTV" included nonsense, frameshift, deletion, and start-loss variants. "Splice" combined canonical splice sites and variants in the splice region, the latter referring to variants in the first/last nucleotide of an exon, the +3 to +6 intron positions (splice donor site), and variants generating a new AG-dinucleotide directly upstream of a splice acceptor site (AG). The "Non-coding" type comprised intronic, promoter, 5′ UTR, 3′ UTR, copy number variation and intergenic variants. Finally, the "Coding" category included missense, synonymous, stop-loss and inframe insertion and deletion variants.

Calculation of OddsPath

The magnitude of evidence strength provided by RNA phenotypes was estimated based on a framework proposed by Brnich et al. (2019) and calculation of the odds of pathogenicity (OddsPath; Tavtigian et al., 2018). OddsPath was computed as

OddsPath = [P2 × (1 − P1)]/[(1 − P2) × P1],

where P1 is the prior probability, calculated as the proportion of pathogenic variants in the overall data, and P2 is the posterior probability, defined as the proportion of pathogenic variants among variants with functionally abnormal (aberrant) RNA phenotypes. A set of known benign and pathogenic variants is required for the OddsPath calculation.
A total of 394 pathogenic variants were selected for the OddsPath calculations based on two inclusion criteria: (1) pathogenic variants located in genes expressed in fibroblasts and reported as disease-causing for the 119 genetically diagnosed individuals described by Yépez et al. (2022); and (2) ClinVar pathogenic or likely pathogenic variants located in genes expressed in fibroblasts and detected across the full cohort of 303 individuals (Yépez et al., 2022) (Supporting Information: Table S3). A total of 723 benign variants were selected based on the following two criteria: (1) rare variants with a MAF < 0.01 reported as benign or likely benign in the ClinVar database (Landrum et al., 2014, 2016) and classified as benign or likely benign according to ACMG/AMP criteria as implemented in the InterVar software (Li & Wang, 2017); and (2) as the first procedure resulted in a low number of PTV variants, nonsense and frameshift variants detected in causal genes with a MAF > 0.05 were additionally included, as suggested by Brnich et al. (2019) (Supporting Information: Table S3). OddsPath analysis was performed separately for monoallelic and biallelic genetic defects. Homozygous and compound heterozygous variants were considered biallelic, heterozygous variants monoallelic. An exception was made for nonmissense variants compound heterozygous with missense alleles, which were considered monoallelic because missense variants typically do not result in aberrant RNA phenotypes. For each RNA phenotype, the OddsPath was calculated at different thresholds and was interpreted based on the evidence strength equivalents provided by Brnich et al. (2019). An OddsPath > 2.1 was considered as PS3 supporting, OddsPath > 4.3 as PS3 moderate, OddsPath > 18.7 as PS3 (strong), and OddsPath > 350 as PS3 very strong.
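The quantities defined in this section (ψ, θ, the alternative allele ratio, and OddsPath with its PS3 strength tiers) can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the FRASER, DROP, or study code: all function names are hypothetical, and the aberrant-variant counts in the example (40 pathogenic, 2 benign) are invented for illustration; only the totals of 394 pathogenic and 723 benign variants and the four OddsPath cutoffs come from the text.

```python
def percent_spliced_in(intron_split_reads, site_split_reads):
    """psi: split reads spanning the intron / all split reads sharing its donor or acceptor."""
    return intron_split_reads / site_split_reads


def splicing_efficiency(site_split_reads, site_total_coverage):
    """theta: all split reads at a splice site / full read coverage at that site."""
    return site_split_reads / site_total_coverage


def alt_allele_ratio(alt_reads, ref_reads):
    """Reads on the alternative allele / all reads at the heterozygous position."""
    return alt_reads / (alt_reads + ref_reads)


def odds_path(p1, p2):
    """OddsPath = [P2 * (1 - P1)] / [(1 - P2) * P1]."""
    if not (0 < p1 < 1) or not (0 <= p2 < 1):
        # Degenerate inputs would divide by zero; rejecting them is a choice of this sketch.
        raise ValueError("need 0 < P1 < 1 and 0 <= P2 < 1")
    return (p2 * (1 - p1)) / ((1 - p2) * p1)


def odds_path_from_counts(path_total, benign_total, path_aberrant, benign_aberrant):
    """P1 from all classified variants, P2 from those with an aberrant RNA phenotype."""
    p1 = path_total / (path_total + benign_total)
    p2 = path_aberrant / (path_aberrant + benign_aberrant)
    return odds_path(p1, p2)


# Cutoffs as listed in the text (strict '>' comparisons), strongest tier first.
PS3_TIERS = [
    (350.0, "PS3 very strong"),
    (18.7, "PS3 strong"),
    (4.3, "PS3 moderate"),
    (2.1, "PS3 supporting"),
]


def ps3_strength(odds):
    """Map an OddsPath value to the strongest PS3 tier whose cutoff it exceeds."""
    for cutoff, label in PS3_TIERS:
        if odds > cutoff:
            return label
    return "no PS3 evidence"


# Hypothetical threshold at which 40 pathogenic and 2 benign variants look aberrant.
example_odds = odds_path_from_counts(394, 723, path_aberrant=40, benign_aberrant=2)
example_tier = ps3_strength(example_odds)
```

Repeating `odds_path_from_counts` over a grid of thresholds for each RNA phenotype metric gives the kind of threshold-by-threshold evaluation described above; benign-side (BS3) tiers are not listed in this section and are therefore omitted from the sketch.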

4 RESULTS

Overview of studies implementing clinical RNA-seq

To date, eight studies have applied RNA-seq at large scale, with at least 70 individuals in the cohort and a minimum of 25 affected individuals, aiming to reclassify VUS or to identify disease-causing genes and variants (Cummings et al., 2017; Frésard et al., 2019; Gonorazky et al., 2019; Kopajtich et al., 2021; Kremer et al., 2017; H. Lee et al., 2020; Murdock et al., 2021; Yépez et al., 2022; Supporting Information: Table S1). The median reported RNA-seq diagnostic rate is 15% (Figure 3a). For 74% (132/178) of cases, pathogenic variants were identified in genes associated with diseases with an autosomal recessive mode of inheritance. We extracted variant and RNA phenotype information from 178 genetically diagnosed cases from the corresponding literature (Supporting Information: Table S2). In 120 out of the 178 cases at least one RNA phenotype was detected. Aberrant expression and aberrant splicing were the most common RNA phenotypes, contributing to diagnosis in 64% and 62% of cases, respectively (Figure 3b). In addition, as aberrant splicing often created premature stop codons causing NMD, in almost half of these cases it also led to aberrant expression. Detection of MAE contributed to diagnosis in 27% of cases.

Figure 3. Power of RNA-seq in diagnostics of rare Mendelian disorders. (a) Scatterplot showing number of cases diagnosed by RNA-seq and initial undiagnosed cohort size across eight studies. Numbers underlying this figure, as well as diagnostic rates, can be found in Supporting Information: Table S1. (b) Frequency of detection of aberrant expression, aberrant splicing and monoallelic expression of causal genes across 120 genetically diagnosed individuals with at least one aberrant RNA phenotype detected. (c) Proportions of pathogenic alleles causing aberrant RNA phenotypes.
Proportions were calculated separately for each RNA phenotype and for cases with detected and not detected aberrant RNA phenotypes. For monoallelic expression, only alleles causing this phenotype were considered. Missense variants are indicated in gray as they typically do not cause aberrant RNA phenotypes. Data underlying panels (b) and (c) can be found in Supporting Information: Table S2. RNA-seq, RNA sequencing.

Variants underlying RNA phenotypes

Across all studies, pathogenic variants were discovered in genes with known loss-of-function mechanisms for recessi