Prediction of Super-enhancers Based on Mean-shift Undersampling

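Enhancer; Computer science; Classifier (UML); Data mining; Computational biology; Pattern recognition (psychology); Artificial intelligence; Machine learning; Transcription factor; Biology; Gene; Genetics
Authors
Cheng Han,Shumei Ding,Cangzhi Jia
Source
Journal: Current Bioinformatics [Bentham Science]
Volume (Issue): 19 (7): 651-662 Citations: 1
Identifier
DOI:10.2174/0115748936268302231110111456
Abstract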

Background: Super-enhancers are clusters of enhancers defined by the binding occupancy of master transcription factors, chromatin regulators, or chromatin marks. Super-enhancers have been reported to be more transcriptionally active and more cell-type-specific than regular enhancers, so it is necessary to distinguish super-enhancers from regular enhancers. A variety of computational methods have been proposed as auxiliary tools for identifying super-enhancers. However, most of these methods rely on ChIP-seq data; when such data are unavailable, the predictor either cannot run or fails to achieve satisfactory performance.

Objective: The aim of this study is to propose a stacking computational model, based on the fusion of multiple features, that identifies super-enhancers in both human and mouse.

Methods: This work used mean-shift to cluster the majority-class samples and selected four balanced datasets for mouse and three balanced datasets for human to train the stacking model. Five types of sequence information are used as input to the XGBoost classifier, and the average of the probability outputs from each classifier is taken as the final classification result.

Results: The results of 10-fold cross-validation and cross-cell-line validation show that our method outperforms other existing methods. The source code and datasets are available at https://github.com/Cheng-Han-max/SE_voting.

Conclusion: The analysis of feature importance indicates that Mismatch accounts for the highest proportion among the top 20 important features.
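The two ideas in the Methods summary, undersampling the majority class after mean-shift clustering and averaging the probability outputs of several classifiers, can be illustrated with a minimal standard-library sketch. Everything in it is an assumption made for illustration and is not taken from the paper or its repository: the function names (`mean_shift`, `balanced_subset`, `average_probabilities`), the flat kernel, the `bandwidth` value, the proportional per-cluster quota, and the toy 2-D points. Hard-coded probability lists stand in for the outputs of the XGBoost classifiers.

```python
import math
import random
from collections import defaultdict


def mean_shift(points, bandwidth, n_iter=50, tol=1e-6):
    """Flat-kernel mean-shift: return one cluster label per input point."""
    pts = [tuple(p) for p in points]
    modes = list(pts)
    for _ in range(n_iter):
        moved, new_modes = 0.0, []
        for m in modes:
            # Neighbours of the current mode among the original points.
            neigh = [p for p in pts if math.dist(m, p) <= bandwidth] or [m]
            centre = tuple(sum(c) / len(neigh) for c in zip(*neigh))
            moved = max(moved, math.dist(m, centre))
            new_modes.append(centre)
        modes = new_modes
        if moved < tol:
            break
    # Modes that ended up close together are treated as the same cluster.
    centres, labels = [], []
    for m in modes:
        for k, c in enumerate(centres):
            if math.dist(m, c) < bandwidth / 2:
                labels.append(k)
                break
        else:
            centres.append(m)
            labels.append(len(centres) - 1)
    return labels


def balanced_subset(majority, labels, n_minority, rng):
    """Pick about n_minority majority indices, spread across clusters.

    Each cluster contributes in proportion to its size (assumed scheme);
    rounding may shift the total by a few samples.
    """
    clusters = defaultdict(list)
    for idx, lab in enumerate(labels):
        clusters[lab].append(idx)
    chosen = []
    for members in clusters.values():
        quota = max(1, round(n_minority * len(members) / len(majority)))
        chosen.extend(rng.sample(members, min(quota, len(members))))
    return sorted(chosen)


def average_probabilities(prob_lists):
    """Average per-sample positive-class probabilities over classifiers."""
    return [sum(ps) / len(ps) for ps in zip(*prob_lists)]


if __name__ == "__main__":
    # Toy majority class: two well-separated groups of 2-D points.
    majority = [(0, 0), (0.1, 0), (0, 0.1), (0.1, 0.1), (0.05, 0.05), (0.2, 0.1),
                (5, 5), (5.1, 5), (5, 5.1), (5.1, 5.1)]
    labels = mean_shift(majority, bandwidth=1.0)
    print(balanced_subset(majority, labels, n_minority=5, rng=random.Random(0)))
    # Placeholder probabilities from two hypothetical feature-specific models.
    print(average_probabilities([[0.9, 0.2], [0.7, 0.4]]))
```

Calling `balanced_subset` several times with different random seeds would yield several balanced training sets, which is one plausible reading of the "four sets" and "three sets" in the abstract; the authors' actual selection procedure may differ.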