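Computer science
Artificial intelligence
Sentence
Feature (linguistics)
Generator (circuit theory)
Hierarchical database model
Matching (statistics)
Representation (politics)
Pattern recognition (psychology)
Natural language processing
Medicine
Data mining
Pathology
Politics
Physics
Philosophy
Power (physics)
Law
Quantum mechanics
Linguistics
Political science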
Authors
Xiaodan Zhang,Sisi Yang,Yanzhao Shi,Junzhong Ji,Ying Liu,Zheng Wang,Huimin Xu
Identifier
DOI:10.1016/j.compbiomed.2023.107650
Abstract
Brain Computed Tomography (CT) report generation aims to help radiologists diagnose cerebrovascular diseases efficiently. The task is challenging because each case requires feature representations for dozens of images and for a language description of several sentences. Existing report generation methods have made significant progress using the encoder-decoder framework and attention mechanisms. However, current methods are limited in handling the many-to-many alignment between the multiple images of a Brain CT scan and the multiple sentences of a Brain CT report, and they fail to attend to critical images and lesion areas, which results in inaccurate descriptions. In this paper, we propose a novel Weakly Guided Attention Model with Hierarchical Interaction, named WGAM-HI, to improve Brain CT report generation. Specifically, WGAM-HI performs many-to-many matching between multiple visual images and semantic sentences via a hierarchical interaction framework that combines a two-layer attention model with a two-layer report generator. In addition, two weakly guided mechanisms help the attention model focus on important images and on lesion areas, guided by pathological events and by Gradient-weighted Class Activation Mapping (Grad-CAM), respectively. The pathological event acts as a bridge between the essential serial images and the corresponding sentence, while Grad-CAM bridges the lesion areas and the pathology words. As a result, under hierarchical interaction with the weakly guided attention model, the report generator produces more accurate words and sentences. Experiments on the Brain CT dataset demonstrate that WGAM-HI progressively attends to important images and lesion areas and generates more accurate reports.
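To make the idea of a two-layer attention model concrete, the following is a minimal sketch and not the authors' implementation. It assumes plain dot-product scoring and a single query vector standing in for the generator state; the function name hierarchical_attention, the helper names, and the nested-list data layout are all hypothetical. The abstract does not specify how WGAM-HI scores images or regions, and the weak guidance from pathological events and Grad-CAM is not modeled here.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def weighted_sum(weights, vectors):
    """Combine vectors using the given attention weights."""
    dim = len(vectors[0])
    return [sum(w * v[d] for w, v in zip(weights, vectors)) for d in range(dim)]


def hierarchical_attention(query, scan):
    """Two-level attention (assumed dot-product scoring, hypothetical layout).

    query: feature vector standing in for the generator state.
    scan:  list of images; each image is a list of region feature vectors.
    Returns (context, image_weights, region_weights).
    """
    image_contexts = []
    region_weights = []
    # Lower level: attend over the regions inside each image.
    for regions in scan:
        w = softmax([dot(query, r) for r in regions])
        region_weights.append(w)
        image_contexts.append(weighted_sum(w, regions))
    # Upper level: attend over the per-image summaries.
    image_weights = softmax([dot(query, c) for c in image_contexts])
    context = weighted_sum(image_weights, image_contexts)
    return context, image_weights, region_weights


# Toy example: two images, two regions each, 2-dimensional features.
query = [1.0, 0.0]
scan = [
    [[0.0, 1.0], [0.1, 0.9]],
    [[2.0, 0.0], [0.0, 1.0]],
]
context, image_weights, region_weights = hierarchical_attention(query, scan)
```

In this toy input the second image contains a region that aligns strongly with the query, so it receives the larger image-level weight. A weakly guided variant would add a training signal that nudges these weights toward externally supplied targets; how WGAM-HI does this is not described in enough detail here to sketch.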