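Computer science
Artificial intelligence
Machine learning
Transformer
Source code
Feature (linguistics)
Feature extraction
Engineering
Linguistics
Philosophy
Voltage
Electrical engineering
Operating system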
Authors
Yihan Dong,Xiaowen Hu,Zhijian Huang,Lei Deng
Identifier
DOI:10.1109/bibm58861.2023.10385506
Abstract
The escalating severity of antibiotic resistance poses substantial challenges across diverse sectors, including everyday life, agriculture, and clinical medicine. Conventional methods for investigating antibiotic resistance genes (ARGs), such as culture-based techniques and whole-genome sequencing, are often time-consuming, labor-intensive, and limited in accuracy. Moreover, the fragmented nature of existing datasets hampers comprehensive analysis of antibiotic resistance gene sequences. In this study, we introduce a computational framework, TGC-ARG, designed to predict potential ARGs. TGC-ARG takes protein sequences as input, retrieves protein structures through SCRATCH-1D, and employs a feature extraction module to derive feature representations for both protein sequences and structures. We then integrate a siamese network to establish a contrastive learning paradigm, strengthening the model’s representational capabilities. The resulting sequence embeddings and structure embeddings are merged and fed into a Multilayer Perceptron (MLP) that predicts whether the input is an ARG. To assess performance, we curate a publicly available dataset named ARSS (Antibiotic Resistance Sequence Statistics). Extensive comparative experiments show that our approach outperforms the current state-of-the-art (SOTA) method. Furthermore, through comprehensive case analyses, we demonstrate the efficacy of our approach in predicting potential ARGs. The dataset and source code are accessible at https://github.com/angel1gel/TGC-ARG.
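The fusion step described in the abstract (sequence and structure embeddings merged, then passed to an MLP) and the contrastive objective can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state the loss function, embedding sizes, or layer widths, so the pairwise margin-based contrastive loss, the concatenation-based merge, the dimensions (8 and 4), the random weights, and every function name below are assumptions chosen only for illustration.

```python
import math
import random


def contrastive_loss(a, b, same_class, margin=1.0):
    """Pairwise margin-based contrastive loss (assumed form, not from the paper).

    Pulls embeddings of the same class together and pushes embeddings of
    different classes at least `margin` apart.
    """
    dist = math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    if same_class:
        return dist ** 2
    return max(0.0, margin - dist) ** 2


def init_layer(n_in, n_out, rng):
    """Random weights and zero biases for one fully connected layer."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    bias = [0.0] * n_out
    return weights, bias


def linear(x, weights, bias):
    """Fully connected layer: one dot product per output unit, plus bias."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


def predict_arg(seq_emb, struct_emb, hidden, out):
    """Merge the two embeddings and run a one-hidden-layer MLP.

    Concatenation is an assumed merge strategy. Returns a probability-like
    score in (0, 1) for "this protein is an ARG".
    """
    fused = seq_emb + struct_emb                       # list concatenation
    h = [max(0.0, v) for v in linear(fused, *hidden)]  # ReLU activation
    logit = linear(h, *out)[0]
    return 1.0 / (1.0 + math.exp(-logit))              # sigmoid


# Hypothetical embeddings standing in for the feature extraction module's output.
rng = random.Random(0)
seq_emb = [rng.gauss(0, 1) for _ in range(8)]
struct_emb = [rng.gauss(0, 1) for _ in range(4)]

hidden = init_layer(12, 6, rng)  # 12 = 8 sequence dims + 4 structure dims
out = init_layer(6, 1, rng)
prob = predict_arg(seq_emb, struct_emb, hidden, out)
```

With untrained random weights the score carries no meaning; the sketch only shows how the two embedding streams could be combined and how a contrastive term could score a pair of embeddings during training.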