Performance evaluation of computational methods for splice-disrupting variants and improving the performance using the machine learning-based framework

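Splicing, RNA splicing, Benchmark (surveying), Computer science, Exon skipping, Machine learning, Artificial intelligence, Computational biology, Exon, Identification (biology), Alternative splicing, Data mining, Gene, Genetics, Biology, RNA, Botany, Geography, Geodesy
Authors
Hao Liu,Jing Dai,Ké Li,Yang Sun,Haoran Wei,Hong Wang,Chunxia Zhao,Dao Wen Wang
Source
Journal: Briefings in Bioinformatics [Oxford University Press]
Volume (Issue): 23 (5) Citations: 4
Identifier
DOI:10.1093/bib/bbac334
Abstract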

A critical challenge in genetic diagnostics is the assessment of genetic variants associated with diseases, specifically variants that fall outside canonical splice sites and act by altering alternative splicing. Several computational methods have been developed to prioritize variants by their effect on splicing; however, performance evaluation of these methods is hampered by the lack of large-scale benchmark datasets. In this study, we employed a splicing-region-specific strategy to evaluate the performance of prediction methods on eight independent datasets. Under most conditions, we found that dbscSNV-ADA performed better in the exonic region, S-CAP performed better in the core donor and acceptor regions, S-CAP and SpliceAI performed better in the extended acceptor region, and MMSplice performed better in identifying variants that caused exon skipping. However, the performance of the prediction methods varied widely across datasets and splicing regions, and no single method showed the best overall performance on all datasets. To address this, we developed a new method, machine learning-based classification of splice sites variants (MLCsplice), to predict variants' effects on splicing based on individual methods. We demonstrated that MLCsplice achieved stable and superior prediction performance compared with any individual method. To facilitate the identification of the splicing effect of variants, we provide precomputed MLCsplice scores for all possible splice site variants across human protein-coding genes (http://39.105.51.3:8090/MLCsplice/). We believe that the performance of the individual methods on the eight benchmark datasets will provide tentative guidance for selecting an appropriate method to prioritize candidate splice-disrupting variants, thereby increasing the genetic diagnostic yield.
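
The splicing-region-specific evaluation described in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the abstract does not state which metric or software was used, so the AUROC metric, the record layout, the tool names `tool_a` and `tool_b`, and all scores below are assumptions made up for illustration only.

```python
from collections import defaultdict


def auroc(labels, scores):
    """Area under the ROC curve by pairwise comparison (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return None  # undefined unless both classes are present
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def auroc_by_region(records, tools):
    """Group variants by splicing region, then score each tool per region."""
    grouped = defaultdict(list)
    for rec in records:
        grouped[rec["region"]].append(rec)
    result = {}
    for region, recs in grouped.items():
        labels = [r["label"] for r in recs]
        result[region] = {t: auroc(labels, [r[t] for r in recs]) for t in tools}
    return result


# Hypothetical toy records: label 1 = splice-disrupting, 0 = neutral.
# Tool names and scores are invented and are not results from the paper.
records = [
    {"region": "exonic", "label": 1, "tool_a": 0.9, "tool_b": 0.4},
    {"region": "exonic", "label": 0, "tool_a": 0.2, "tool_b": 0.6},
    {"region": "exonic", "label": 1, "tool_a": 0.7, "tool_b": 0.5},
    {"region": "core_donor", "label": 1, "tool_a": 0.5, "tool_b": 0.8},
    {"region": "core_donor", "label": 0, "tool_a": 0.6, "tool_b": 0.1},
]

table = auroc_by_region(records, ["tool_a", "tool_b"])
for region, per_tool in table.items():
    print(region, per_tool)
```

In this toy data the better tool differs between regions, which mirrors the abstract's observation that no single method performs best everywhere; pooling all variants before scoring would hide such region-level differences.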