Classifying promoters by interpreting the hidden information of DNA sequences for disease prediction in clinical laboratories using Gaussian decision boundary estimation

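Promoter; Base pair; DNA; Genetics; Gene; Biology; DNA binding site; Transcription (linguistics); DNA sequencing; Computational biology; Gene expression; Linguistics; Philosophy
Authors
S. Pradeepa, Niveda Gaspar, S. Vimal, P. Subbulakshmi, Ahmed Alkhayyat, M. Kaliappan
Source
Journal: Intelligent Decision Technologies [IOS Press]
Volume (issue): 18 (1): 613-631
Identifier
DOI:10.3233/idt-230283
Abstract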

A promoter is a short stretch of DNA (100–1,000 bp) where RNA polymerase begins transcribing a gene. A DNA (deoxyribonucleic acid) base pair is a fundamental unit of DNA structure and represents the pairing of two complementary nucleotide bases within the DNA double helix. The four DNA nucleotide bases are adenine (A), thymine (T), cytosine (C), and guanine (G). DNA base pairs are the building blocks of the DNA molecule, and their complementary pairing is central to the storage and transmission of genetic information in all living organisms. A promoter is normally found at the 5′ end of the transcription initiation site or immediately upstream of it. Numerous human disorders, particularly diabetes, cancer, and Huntington’s disease, have been shown to have DNA promoters as their root cause. The scientific community has long been interested in learning crucial information about protein-coding genes, and finding the promoters is the first step in finding genes in DNA sequences. Consequently, identifying promoters has emerged as an intriguing challenge that has caught the interest of numerous researchers in the field of bioinformatics. We propose Gaussian Decision Boundary Estimation in machine learning models to detect transcription start sites (promoters) in the DNA sequences of a common bacterium, Escherichia coli. The best features are identified through a score-based function that selects the nucleotides directly responsible for promoter recognition, in order to maximise the models’ performance. The Gaussian Decision Boundary Estimation based support-vector-machine model is trained with these features and finds the best hyperplane that separates the data into different classes. In this study, promoter regions were identified with 99.9% accuracy, which is higher than that of other state-of-the-art algorithms.
Another major emphasis of this paper is a comparison of machine learning classification models, in order to identify the model that most accurately predicts promoters in DNA sequences. The work provides analysis to support further biological research as well as precision medicine.
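
The abstract names two steps, a score-based selection of informative nucleotide positions and a Gaussian-kernel support-vector machine, without giving their exact form. The sketch below is only an illustration of the general idea under stated assumptions: the chi-squared-style score, the Gaussian (RBF) kernel with `gamma=0.5`, the function names, and the four-base toy sequences are all hypothetical and are not taken from the paper. It does not train an SVM; it only shows how positions might be ranked and how a Gaussian kernel compares two encoded sequences.

```python
import math
from collections import Counter

BASES = "ACGT"

def one_hot(seq):
    """Encode a DNA string as a flat 0/1 vector, four slots per position."""
    return [1.0 if base == b else 0.0 for base in seq.upper() for b in BASES]

def position_scores(seqs, labels):
    """Chi-squared-style score per position (an assumed scoring function).

    A position scores high when its bases are distributed very differently
    between the classes, and 0 when the distribution is identical.
    """
    n = len(seqs)
    class_counts = Counter(labels)
    scores = []
    for pos in range(len(seqs[0])):
        observed = Counter((s[pos], y) for s, y in zip(seqs, labels))
        base_counts = Counter(s[pos] for s in seqs)
        score = 0.0
        for base, n_base in base_counts.items():
            for y, n_y in class_counts.items():
                expected = n_base * n_y / n
                score += (observed[(base, y)] - expected) ** 2 / expected
        scores.append(score)
    return scores

def top_positions(scores, k):
    """Return the indices of the k highest-scoring positions, in sequence order."""
    ranked = sorted(range(len(scores)), key=lambda i: -scores[i])
    return sorted(ranked[:k])

def rbf_kernel(x, z, gamma=0.5):
    """Gaussian (RBF) kernel: exp(-gamma * squared Euclidean distance)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)

# Hypothetical toy data: 1 = promoter, 0 = non-promoter.
# Only position 0 differs between the two classes.
seqs = ["TAAC", "TAGC", "TACC", "GAAC", "GAGC", "GACC"]
labels = [1, 1, 1, 0, 0, 0]

scores = position_scores(seqs, labels)
selected = top_positions(scores, k=1)

# Keep only the selected positions, then compare sequences with the kernel.
reduced = [one_hot("".join(s[i] for i in selected)) for s in seqs]
same_class = rbf_kernel(reduced[0], reduced[1])
cross_class = rbf_kernel(reduced[0], reduced[3])

print("scores:", scores)
print("selected positions:", selected)
print("kernel (same class):", same_class)
print("kernel (different class):", cross_class)
```

In this toy set the selection keeps only position 0, so the kernel value between two sequences of the same class is larger than between sequences of different classes. In practice the kernel matrix would be passed to an SVM solver, which is outside the scope of this sketch.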