A ViT-AMC Network With Adaptive Model Fusion and Multiobjective Optimization for Interpretable Laryngeal Tumor Grading From Histopathological Images

计算机科学人工智能分级（工程）自适应光学模式识别（心理学）图像融合融合图像（数学）工程类天文语言学物理哲学土木工程

作者

Pan Huang,Peng He,Sukun Tian,Mingrui Ma,Peng Feng,Hualiang Xiao,Francesco Mercaldo,Antonella Santone,Jing Qin

出处

期刊：IEEE Transactions on Medical Imaging [Institute of Electrical and Electronics Engineers]
日期：2022-08-29 卷期号：42 (1): 15-28 被引量：53

链接

nih.govdoi.org

标识

DOI：10.1109/tmi.2022.3202248

摘要

The tumor grading of laryngeal cancer pathological images needs to be accurate and interpretable. The deep learning model based on the attention mechanism-integrated convolution (AMC) block has good inductive bias capability but poor interpretability, whereas the deep learning model based on the vision transformer (ViT) block has good interpretability but weak inductive bias ability. Therefore, we propose an end-to-end ViT-AMC network (ViT-AMCNet) with adaptive model fusion and multiobjective optimization that integrates and fuses the ViT and AMC blocks. However, existing model fusion methods often have negative fusion: 1). There is no guarantee that the ViT and AMC blocks will simultaneously have good feature representation capability. 2). The difference in feature representations learning between the ViT and AMC blocks is not obvious, so there is much redundant information in the two feature representations. Accordingly, we first prove the feasibility of fusing the ViT and AMC blocks based on Hoeffding's inequality. Then, we propose a multiobjective optimization method to solve the problem that ViT and AMC blocks cannot simultaneously have good feature representation. Finally, an adaptive model fusion method integrating the metrics block and the fusion block is proposed to increase the differences between feature representations and improve the deredundancy capability. Our methods improve the fusion ability of ViT-AMCNet, and experimental results demonstrate that ViT-AMCNet significantly outperforms state-of-the-art methods. Importantly, the visualized interpretive maps are closer to the region of interest of concern by pathologists, and the generalization ability is also excellent. Our code is publicly available at https://github.com/Baron-Huang/ViT-AMCNet .

求助该文献

最长约 10秒，即可获得该文献文件

A ViT-AMC Network With Adaptive Model Fusion and Multiobjective Optimization for Interpretable Laryngeal Tumor Grading From Histopathological Images

今日热心研友