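Encoding
Computer science
Computational biology
Transcription factor
Similarity (geometry)
Transcription (linguistics)
Sequence (biology)
Artificial intelligence
Gene
Data mining
Pattern recognition (psychology)
Biology
Genetics
Image (mathematics)
Linguistics
Philosophy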
Authors
Zhihua Du, Tianqiang Huang, Vladimir N. Uversky, Jianqiang Li
Identifier
DOI:10.1109/tcbb.2022.3199758
Abstract
Transcription factors (TFs) are DNA-binding proteins involved in the regulation of gene expression. They exist in all organisms and activate or repress transcription by binding to specific DNA sequences. Traditionally, TFs have been identified by experimental methods that are time-consuming and costly. In recent years, various computational methods have been developed to identify TFs and overcome these limitations. However, there is room for further improvement in the predictive accuracy of these tools. We report here a novel computational tool, TFnet, that provides accurate and comprehensive TF predictions from protein sequences. The accuracy of these predictions is substantially better than that of existing TF predictors and methods. In particular, TFnet significantly outperforms comparable methods when sequence similarity to other known sequences in the database drops below 40%. Ablation tests reveal that the high predictive performance stems from the innovative ways in which TFnet derives the sequence Position-Specific Scoring Matrix (PSSM) and encodes its inputs.