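Artificial intelligence
Machine learning
Support vector machine
Naive Bayes classifier
Computer science
Random forest
Classifier (UML)
Pattern recognition (psychology)
Data mining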
Authors
Shakil Sarkar, Krishna Mridha, Ankush Ghosh, Rabindra Nath Shaw
Source
Journal: Lecture Notes in Electrical Engineering
Date: 2022-01-01
Pages: 335-355
Citations: 9
Identifier
DOI: 10.1007/978-981-19-2980-9_27
Abstract
The extraction of useful information from deoxyribonucleic acid (DNA) is a major component of bioinformatics research, and DNA sequence classification has a variety of applications, including genomic and biomedical data processing. It is a critical problem in a general computational framework for biomedical data processing, and numerous machine learning techniques have been applied to it in recent years. Machine learning is a data processing technique that uses training data to make judgments, predictions, classifications, and recognitions. To learn the functions of a new protein, genomic researchers classify DNA sequences into known categories, so it is critical to discover and characterize the corresponding genes. We employ machine learning classification methods to distinguish between infected and normal genes. In this study, we used the multinomial Naive Bayes classifier, SVM, KNN, and other models to classify DNA sequences using label and k-mer encoding. Several classification metrics are used to evaluate the models. With k-mer encoding, the multinomial Naive Bayes classifier, SVM, KNN, decision tree, random forest, and logistic regression all achieve good accuracy on the test data, with reported accuracies of 93.16% and 93.13%.
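The two encodings named in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the integer mapping in `BASE_TO_INT`, the function names, and the default `k=6` are assumptions chosen for illustration, and the abstract does not state which values the study used.

```python
from collections import Counter

# Hypothetical integer codes for label encoding; the study's actual mapping is not given.
BASE_TO_INT = {"A": 0, "C": 1, "G": 2, "T": 3}


def label_encode(seq):
    """Map each base to an integer code; any other symbol (e.g. N) maps to -1."""
    return [BASE_TO_INT.get(base, -1) for base in seq.upper()]


def kmers(seq, k=6):
    """Return all overlapping substrings of length k, in sequence order."""
    seq = seq.upper()
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def kmer_counts(seq, k=6):
    """Count how often each k-mer occurs (a bag-of-words view of the sequence)."""
    return Counter(kmers(seq, k))


example = "ATGCATGC"
print(label_encode(example))
print(kmers(example, k=3))
print(kmer_counts(example, k=3))
```

Count vectors of this kind are a natural input for a count-based model such as multinomial Naive Bayes, which is one plausible reason the k-mer encoding pairs well with that classifier.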