
A Review on the Recent Developments of Sequence-based Protein Feature Extraction Methods

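Keywords: autocorrelation; computer science; identification (biology); artificial intelligence; field (mathematics); pattern recognition (psychology); protein sequencing; process (computing); computational biology; data mining; sequence (biology); feature extraction; feature (linguistics); biology; mathematics; peptide sequence; genetics; gene; statistics; operating system; biochemistry; philosophy; linguistics; pure mathematics; botany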
Authors
Jun Zhang, Bin Liu
Source
Journal: Current Bioinformatics [Bentham Science]
Volume (issue): 14 (3): 190-199 Citations: 129
Identifier
DOI:10.2174/1574893614666181212102749
Abstract

Background: Proteins play a crucial role in life activities, such as catalyzing metabolic reactions, replicating DNA, and responding to stimuli. Identifying protein structures and functions is critical for both basic research and applications. Because traditional experiments for studying protein structures and functions are expensive and time-consuming, computational approaches are highly desired. The key to computational methods is how to efficiently extract features from protein sequences. During the last decade, many powerful feature extraction algorithms have been proposed, significantly advancing the study of protein structures and functions.
Objective: To help researchers catch up with recent developments in this important field, this study gives an updated review focusing on sequence-based feature extraction from protein sequences.
Method: Sequence-based protein features were grouped into three categories: composition-based features, autocorrelation-based features, and profile-based features. The features in each group were introduced in detail, and their advantages and disadvantages were discussed. Some useful tools for generating these features were also introduced.
Results: Generally, autocorrelation-based features outperform composition-based features, and profile-based features outperform autocorrelation-based features. The reason is that profile-based features take evolutionary information into account, which is useful for identifying protein structures and functions. However, profile-based features are more time-consuming to compute, because a multiple sequence alignment is required.
Conclusion: This study introduced and discussed some recently proposed sequence-based features, such as basic k-mers, PseAAC, auto-cross covariance, and top-n-gram. These features have contributed greatly to the development of protein sequence analysis. Future studies can focus on exploring combinations of these features. In addition, techniques from other fields, such as signal processing, natural language processing (NLP), and image processing, could also contribute to this field, because natural languages (such as English) and protein sequences share some similarities. Proteins can therefore be treated as documents, and features such as k-mers, top-n-grams, and motifs can be treated as the words of a language. Techniques from these fields will offer new ideas and strategies for extracting features from proteins.
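
To make the first two feature categories concrete, here is a minimal Python sketch of a composition-based feature (normalized k-mer frequencies) and an autocorrelation-based feature (auto covariance of one physicochemical property along the sequence). It is an illustration under stated assumptions, not the implementation reviewed in the paper: the function names `kmer_composition` and `auto_covariance` are hypothetical, the Kyte-Doolittle hydropathy scale is just one possible property, and the auto covariance uses one common formulation (mean-centered products averaged over the L - lag available pairs). Profile-based features are not shown because they require a multiple sequence alignment step.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, fixed order

# Kyte-Doolittle hydropathy values, used here only as an example property.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def kmer_composition(seq, k=2):
    """Composition-based feature: frequency of each of the 20**k k-mers.

    K-mers containing non-standard residues still count toward the
    total number of windows but have no slot in the output vector.
    """
    windows = max(len(seq) - k + 1, 0)
    counts = Counter(seq[i:i + k] for i in range(windows))
    total = max(windows, 1)  # avoid division by zero for short sequences
    return [counts["".join(p)] / total for p in product(AMINO_ACIDS, repeat=k)]


def auto_covariance(seq, scale, max_lag=3):
    """Autocorrelation-based feature: one value per lag 1..max_lag.

    For each lag, average (P_i - mean) * (P_{i+lag} - mean) over all
    residue pairs separated by that lag, where P is the property value.
    """
    values = [scale[residue] for residue in seq]
    mean = sum(values) / len(values)
    centered = [v - mean for v in values]
    features = []
    for lag in range(1, max_lag + 1):
        pairs = len(centered) - lag
        if pairs <= 0:
            features.append(0.0)  # sequence too short for this lag
            continue
        total = sum(centered[i] * centered[i + lag] for i in range(pairs))
        features.append(total / pairs)
    return features


if __name__ == "__main__":
    sequence = "MKTAYIAKQR"  # made-up toy sequence, not real data
    print(len(kmer_composition(sequence, k=2)))  # 400-dimensional vector
    print(auto_covariance(sequence, HYDROPATHY, max_lag=3))
```

The k-mer vector ignores residue order beyond the window length k, whereas the auto covariance values capture how a property co-varies between residues a fixed distance apart, which matches the abstract's distinction between the two categories.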