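Splicing
RNA splicing
Benchmark (surveying)
Computer science
Exon skipping
Machine learning
Artificial intelligence
Computational biology
Exon
Identification (biology)
Alternative splicing
Data mining
Gene
Genetics
Biology
Ribonucleic acid (RNA)
Plant
Geography
Geodesy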
Authors
Hao Liu,Jiaqi Dai,Ke Li,Yang Sun,Haoran Wei,Hong Wang,Chunxia Zhao,Dao Wen Wang
Abstract
A critical challenge in genetic diagnostics is the assessment of disease-associated genetic variants, specifically variants that fall outside canonical splice sites yet alter alternative splicing. Several computational methods have been developed to prioritize variants by their effect on splicing; however, performance evaluation of these methods is hampered by the lack of large-scale benchmark datasets. In this study, we employed a splicing-region-specific strategy to evaluate the performance of prediction methods on eight independent datasets. Under most conditions, we found that dbscSNV-ADA performed better in the exonic region, S-CAP performed better in the core donor and acceptor regions, S-CAP and SpliceAI performed better in the extended acceptor region, and MMSplice performed better in identifying variants that caused exon skipping. However, the performance of the prediction methods varied widely across datasets and splicing regions, and no single method performed best overall on all datasets. To address this, we developed a new method, machine learning-based classification of splice sites variants (MLCsplice), to predict the effect of variants on splicing based on individual methods. We demonstrated that MLCsplice achieved stable and superior prediction performance compared with any individual method. To facilitate the identification of the splicing effect of variants, we provide precomputed MLCsplice scores for all possible splice-site variants across human protein-coding genes (http://39.105.51.3:8090/MLCsplice/). We believe that the performance of the individual methods on the eight benchmark datasets will provide tentative guidance for selecting an appropriate method to prioritize candidate splice-disrupting variants, thereby increasing the genetic diagnostic yield.
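The splicing-region-specific evaluation described in the abstract can be illustrated with a minimal sketch. It groups labeled variants by splicing region and scores each prediction method separately within each region. Several things here are assumptions, not details taken from the study: the use of AUC as the metric, the record layout, the placeholder method names `method_a` and `method_b`, and the toy scores, which are invented purely to make the example runnable.

```python
from collections import defaultdict


def auc(labels, scores):
    """Rank-based AUC: probability that a splice-disrupting variant (label 1)
    scores higher than a neutral one (label 0); ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return None  # AUC is undefined without both classes
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def auc_by_region(records, methods):
    """Return {region: {method: AUC}} so methods are compared per region."""
    grouped = defaultdict(list)
    for rec in records:
        grouped[rec["region"]].append(rec)
    table = {}
    for region, recs in grouped.items():
        labels = [r["label"] for r in recs]
        table[region] = {m: auc(labels, [r[m] for r in recs]) for m in methods}
    return table


# Toy records (hypothetical values, not data from the study).
records = [
    {"region": "exonic", "label": 1, "method_a": 0.9, "method_b": 0.4},
    {"region": "exonic", "label": 0, "method_a": 0.2, "method_b": 0.6},
    {"region": "exonic", "label": 1, "method_a": 0.7, "method_b": 0.7},
    {"region": "core_donor", "label": 1, "method_a": 0.5, "method_b": 0.8},
    {"region": "core_donor", "label": 0, "method_a": 0.5, "method_b": 0.1},
]

table = auc_by_region(records, ["method_a", "method_b"])
for region, per_method in table.items():
    print(region, per_method)
```

Reporting one score per region and method, rather than a single pooled score, is what lets a comparison of this kind show that different methods may lead in different regions.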