ProPythia: A Python package for protein classification based on machine and deep learning

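Computer science; Python (programming language); Artificial intelligence; Machine learning; Modular design; Deep learning; Dimensionality reduction; Cluster analysis; Feature selection; Data mining; Programming language
Authors
Ana Marta Sequeira, Diana Lousa, Miguel Rocha
Source
Journal: Neurocomputing [Elsevier BV]
Volume and pages: 484: 172-182 Cited by: 21
Identifier
DOI: 10.1016/j.neucom.2021.07.102
Abstract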

The field of protein data mining has been growing rapidly in recent years. Characterizing proteins and determining their function from their amino acid sequences are challenging and long-standing problems, in which Bioinformatics and Machine Learning play an emerging role. A myriad of machine and deep learning algorithms have been applied to these tasks with exciting results. However, tools and platforms that calculate protein features and run both Machine Learning (ML) and Deep Learning (DL) pipelines, taking protein sequences as inputs, are still lacking, and existing ones have limitations in terms of performance, user-friendliness and restricted domains of application. Here, to address these limitations, we propose ProPythia, a generic and modular Python package that allows users to easily deploy ML and DL approaches for a plethora of problems in protein sequence analysis and classification. It facilitates the implementation, comparison and validation of the major tasks in ML or DL pipelines, including modules to read and alter sequences, calculate protein features, preprocess datasets, execute feature selection and dimensionality reduction, perform clustering and manifold analysis, as well as to train and optimize ML/DL models and use them to make predictions. ProPythia has an adaptable modular architecture, making it a versatile and easy-to-use tool for transforming protein data into valuable knowledge, even for people not familiar with ML code. The platform was tested in several applications and compared with results from the literature. Here, we illustrate its applicability in two case studies: the prediction of antimicrobial peptides and the prediction of enzyme Enzyme Commission (EC) numbers. Furthermore, we assess the performance of the different descriptors on four different protein classification challenges. Its source code and documentation, including a user guide and case studies, are freely available at https://github.com/BioSystemsUM/propythia.
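
To make the "calculate protein features" step of the abstract concrete, the following is a minimal sketch of one common sequence descriptor, amino acid composition, written in plain Python. It does not use ProPythia and is not ProPythia's API: the function names `amino_acid_composition` and `to_feature_vector`, and the example peptide string, are hypothetical and chosen only for illustration.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed order so that feature vectors
# from different sequences are comparable column by column.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return the fraction of each standard residue in `sequence`.

    Hypothetical helper for illustration; non-standard characters are ignored.
    """
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


def to_feature_vector(sequence):
    """Flatten the composition into a 20-element list usable as ML input."""
    composition = amino_acid_composition(sequence)
    return [composition[aa] for aa in AMINO_ACIDS]


# Arbitrary example peptide (an assumption, not taken from the paper).
vector = to_feature_vector("AAKK")
print(dict(zip(AMINO_ACIDS, vector)))
```

A vector like this, computed for each sequence in a dataset, could then be passed to the later pipeline stages the abstract lists (preprocessing, feature selection, model training) using any ML library.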