Multi-Grained Radiology Report Generation With Sentence-Level Image-Language Contrastive Learning

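Feature (linguistics); Computer science; Artificial intelligence; Task (project management); Sentence; Natural language processing; Deep learning; Image (mathematics); Exploit; Linguistics; Philosophy; Computer security; Management; Economics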

Authors

Aohan Liu,Yuchen Guo,Jun‐Hai Yong,Feng Xu

Source

Journal: IEEE Transactions on Medical Imaging [Institute of Electrical and Electronics Engineers]
Date: 2024-03-05; Volume (issue): 43 (7): 2657-2669; Citations: 11

Links

nih.gov, doi.org

Identifier

DOI: 10.1109/tmi.2024.3372638

Abstract

The automatic generation of accurate radiology reports is of great clinical importance and has drawn growing research interest. However, it is still a challenging task due to the imbalance between normal and abnormal descriptions and the multi-sentence and multi-topic nature of radiology reports. These features result in significant challenges to generating accurate descriptions for medical images, especially the important abnormal findings. Previous methods to tackle these problems rely heavily on extra manual annotations, which are expensive to acquire. We propose a multi-grained report generation framework incorporating sentence-level image-sentence contrastive learning, which does not require any extra labeling but effectively learns knowledge from the image-report pairs. We first introduce contrastive learning as an auxiliary task for image feature learning. Different from previous contrastive methods, we exploit the multi-topic nature of imaging reports and perform fine-grained contrastive learning by extracting sentence topics and contents and contrasting between sentence contents and refined image contents guided by sentence topics. This forces the model to learn distinct abnormal image features for each specific topic. During generation, we use two decoders to first generate coarse sentence topics and then the fine-grained text of each sentence. We directly supervise the intermediate topics using sentence topics learned by our contrastive objective. This strengthens the generation constraint and enables independent fine-tuning of the decoders using reinforcement learning, which further boosts model performance. Experiments on two large-scale datasets MIMIC-CXR and IU-Xray demonstrate that our approach outperforms existing state-of-the-art methods, evaluated by both language generation metrics and clinical accuracy.
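The abstract does not give the exact form of the contrastive objective, so the following is only an illustrative sketch of the general idea: each sentence embedding is pulled toward the image feature refined for that sentence's topic and pushed away from the others. It assumes a symmetric InfoNCE-style loss, which is a common choice for image-text contrastive learning but is not confirmed by the source. The function names, the `temperature` value, and the toy vectors are all hypothetical.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def log_softmax(row):
    """Numerically stable log-softmax over a list of scores."""
    top = max(row)
    log_sum = top + math.log(sum(math.exp(x - top) for x in row))
    return [x - log_sum for x in row]


def sentence_image_contrastive_loss(sentence_embs, image_embs, temperature=0.1):
    """Symmetric InfoNCE-style loss (an assumed form, not the paper's).

    sentence_embs[i] is the content embedding of sentence i;
    image_embs[i] is the image feature refined for the topic of sentence i.
    Matching pairs share an index; every other pairing is a negative.
    """
    s = [l2_normalize(v) for v in sentence_embs]
    g = [l2_normalize(v) for v in image_embs]
    n = len(s)
    # Cosine similarity between every sentence and every image feature.
    logits = [
        [sum(a * b for a, b in zip(s[i], g[j])) / temperature for j in range(n)]
        for i in range(n)
    ]
    # Sentence-to-image direction: rows of the similarity matrix.
    s2i = -sum(log_softmax(logits[i])[i] for i in range(n)) / n
    # Image-to-sentence direction: columns of the similarity matrix.
    columns = [[logits[i][j] for i in range(n)] for j in range(n)]
    i2s = -sum(log_softmax(columns[j])[j] for j in range(n)) / n
    return 0.5 * (s2i + i2s)
```

With toy two-dimensional embeddings, correctly aligned sentence/image pairs yield a loss near zero, while swapping the image features so that each sentence is paired with the wrong topic yields a much larger loss.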
