Tan Xian, Ying Sun, Saijun Fan, Yanhe Wang, Jingbo Zhang, Peng Sun, Zhiqiang Ma
Identifier
DOI:10.1109/bibm55620.2022.9995413
Abstract
Human papillomavirus (HPV) is a significant hazard to human health and the primary cause of many cancers. The HPV proteins E6 and E7 are known to damage oncogenes, but many of the underlying mechanisms remain unknown. Research shows that HPV can integrate its genome into host genes, and that the integration mechanism depends strongly on the local genomic environment. Studying the integration mechanism can deepen the understanding of HPV and support vaccine development, and thereby the treatment of cancers and other related diseases. However, in silico research on HPV integration sites is in its infancy, and improving the performance of HPV integration site predictors is challenging. In this work, we propose DSHP, a novel deep learning model for HPV integration site prediction. DSHP takes a variety of DNA sequence features as input. In 5-fold cross-validation, DSHP achieves an accuracy (ACC) of 0.914 and an area under the ROC curve (AUC) of 0.934; in 10-fold cross-validation, it achieves an ACC of 0.933 and an AUC of 0.941. These results demonstrate the effectiveness of DSHP. Moreover, our ablation experiments clarify the importance of individual features in the prediction process and provide a reference for future prediction research. The data and code are available at https://github.com/xtnenu/DSHP.
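The abstract reports ACC and AUC under 5-fold and 10-fold cross-validation. The sketch below illustrates how such figures are commonly computed. It is not taken from the DSHP repository: the helper names (`k_fold_indices`, `accuracy`, `auc`, `cross_validate`), the toy one-feature classifier, and the choice to pool out-of-fold predictions before scoring are all assumptions made for illustration.

```python
import math
import random

def k_fold_indices(n_samples, k, seed=0):
    """Shuffle sample indices and deal them into k nearly equal folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def accuracy(y_true, y_score, threshold=0.5):
    """Fraction of samples whose thresholded score matches the label."""
    preds = [1 if s >= threshold else 0 for s in y_score]
    return sum(p == t for p, t in zip(preds, y_true)) / len(y_true)

def auc(y_true, y_score):
    """Probability that a random positive outscores a random negative
    (ties count as 0.5). Requires at least one sample of each class."""
    pos = [s for s, t in zip(y_score, y_true) if t == 1]
    neg = [s for s, t in zip(y_score, y_true) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def cross_validate(X, y, k, fit, predict_score, seed=0):
    """Train on k-1 folds, score the held-out fold, and compute ACC and AUC
    on the pooled out-of-fold scores (an assumed aggregation scheme)."""
    folds = k_fold_indices(len(y), k, seed)
    scores = [0.0] * len(y)
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        model = fit([X[j] for j in train_idx], [y[j] for j in train_idx])
        for j in test_idx:
            scores[j] = predict_score(model, X[j])
    return accuracy(y, scores), auc(y, scores)

# Hypothetical stand-in for a real model: a midpoint threshold on one feature.
def fit_midpoint(X_train, y_train):
    pos = [x for x, t in zip(X_train, y_train) if t == 1]
    neg = [x for x, t in zip(X_train, y_train) if t == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2

def score_midpoint(midpoint, x):
    return 1.0 / (1.0 + math.exp(-(x - midpoint)))

# Toy, perfectly separable data (not HPV data).
X = [i * 0.1 for i in range(10)] + [2.0 + i * 0.1 for i in range(10)]
y = [0] * 10 + [1] * 10
acc5, auc5 = cross_validate(X, y, 5, fit_midpoint, score_midpoint)
acc10, auc10 = cross_validate(X, y, 10, fit_midpoint, score_midpoint)
```

In practice, `fit` and `predict_score` would wrap the actual predictor, and a stratified split is often preferred so that every fold contains both classes.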