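Encoder
Artificial intelligence
Computer science
Feature (linguistics)
Subsequence
Pattern recognition (psychology)
Image (mathematics)
Mathematics
Mathematical analysis
Philosophy
Linguistics
Bounded function
Operating system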
Authors
Dazhou Li, Xin Xu, Jia-heng Pan, Wei Gao, Shi-rui Zhang
Identifier
DOI:10.1021/acs.jcim.3c02082
Abstract
Accurate identification and analysis of chemical structures in molecular images are prerequisites for applying artificial intelligence to drug discovery, so molecular images must be converted into machine-readable representations efficiently and automatically. In this paper, we propose Image2InChI, an automated deep-learning model for molecular optical image recognition. Image2InChI introduces a novel feature fusion network with attention that integrates image patches with InChI prediction. An improved SwinTransformer encoder and a Transformer decoder, with patch embedding, are used to predict the InChI from the image features. In experiments, Image2InChI achieved an InChI accuracy (InChI acc) of 99.8%, a Morgan fingerprint (Morgan FP) score of 94.1%, a maximum common structure accuracy (MCS acc) of 94.8%, and a longest common subsequence accuracy (LCS acc) of 96.2%. These results indicate that Image2InChI improves the accuracy and efficiency of molecular image recognition and provides a valuable reference for optical chemical structure recognition targeting InChI.
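The abstract reports a longest common subsequence score (LCS acc) without defining it. As a rough illustration only, the sketch below computes an LCS-based similarity between a predicted and a reference InChI string. The function names `lcs_length` and `lcs_similarity`, and the choice to normalize by the length of the longer string, are assumptions made for this example and may differ from the definition used by the authors.

```python
def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b (row-wise DP)."""
    prev = [0] * (len(b) + 1)  # LCS lengths for the previous prefix of a
    for ch_a in a:
        curr = [0]  # LCS with an empty prefix of b is always 0
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                # Matching characters extend the LCS of both shorter prefixes.
                curr.append(prev[j - 1] + 1)
            else:
                # Otherwise keep the better result of dropping one character.
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lcs_similarity(predicted: str, reference: str) -> float:
    """LCS length divided by the longer string's length (assumed normalization)."""
    longest = max(len(predicted), len(reference))
    if longest == 0:
        return 1.0  # two empty strings are treated as identical
    return lcs_length(predicted, reference) / longest


if __name__ == "__main__":
    # Illustrative InChI strings (methane vs. ethane), not data from the paper.
    predicted = "InChI=1S/CH4/h1H4"
    reference = "InChI=1S/C2H6/c1-2/h1-2H3"
    print(f"LCS similarity: {lcs_similarity(predicted, reference):.3f}")
```

Under this normalization, identical strings score 1.0 and strings with no characters in common score 0.0. Averaging the score over a test set would give one plausible dataset-level figure, though the paper may aggregate differently.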