Computer science
Modal verb
Artificial intelligence
Natural language processing
Materials science
Polymer chemistry
Authors
Hongchen Xue, Qingzhi Ma, Guanfeng Liu, Jianfeng Qu, Yuanjun Liu, An Liu
Identifier
DOI:10.1145/3627673.3679668
Abstract
The automatic generation of radiological imaging reports aims to produce accurate and coherent clinical descriptions based on X-ray images. This facilitates clinicians in completing the arduous task of report writing and advances clinical automation. The primary challenge in radiological imaging report generation lies in accurately capturing and describing abnormal regions in the images under data bias conditions, resulting in the generation of lengthy texts containing image details. Existing methods mostly rely on prior knowledge such as medical knowledge graphs, corpora, and image databases to assist models in generating more precise textual descriptions. However, these methods still struggle to identify rare anomalies in the images. To address this issue, we propose a two-stage training model, named CLR2G, based on cross-modal contrastive learning. This model delegates the task of capturing anomalies, particularly those challenging for the generative model trained with cross-entropy loss under data bias conditions, to a specialized abnormality capture component. Specifically, we employ a semantic matching loss function to train additional abnormal image and text encoders through cross-modal contrastive learning, facilitating the capture of 13 common anomalies. We utilize the anomalous image features, text features and their confidence probabilities as a posteriori knowledge to help the model generate accurate image reports. Experimental results demonstrate the state-of-the-art performance of our method on two widely used public datasets, IU-Xray and MIMIC-CXR.
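The abstract names cross-modal contrastive learning as the way the abnormality image and text encoders are trained, but it does not spell out the semantic matching loss. As a rough illustration of the general idea only, the sketch below implements a generic symmetric InfoNCE-style contrastive loss over paired image and text feature vectors. This is an assumption standing in for the paper's loss, not the CLR2G formulation; the function names, the `temperature` value of 0.07, and the toy two-dimensional features are all hypothetical.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def diagonal_cross_entropy(logits):
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)  # subtract the max for numerical stability
        log_sum_exp = m + math.log(sum(math.exp(v - m) for v in row))
        total += log_sum_exp - row[i]
    return total / len(logits)


def symmetric_contrastive_loss(image_feats, text_feats, temperature=0.07):
    """Generic image-text contrastive loss (hypothetical stand-in, not the paper's).

    image_feats[i] and text_feats[i] are assumed to describe the same sample;
    every other pairing in the batch is treated as a negative.
    """
    imgs = [l2_normalize(v) for v in image_feats]
    txts = [l2_normalize(v) for v in text_feats]
    # Similarity matrix: rows are images, columns are texts.
    logits = [[dot(i, t) / temperature for t in txts] for i in imgs]
    logits_t = [list(col) for col in zip(*logits)]  # texts as rows
    # Average the image-to-text and text-to-image directions.
    return 0.5 * (diagonal_cross_entropy(logits) + diagonal_cross_entropy(logits_t))


# Toy features: matched pairs should score a much lower loss than mismatched ones.
aligned = symmetric_contrastive_loss([[1.0, 0.0], [0.0, 1.0]],
                                     [[1.0, 0.0], [0.0, 1.0]])
shuffled = symmetric_contrastive_loss([[1.0, 0.0], [0.0, 1.0]],
                                      [[0.0, 1.0], [1.0, 0.0]])
print(aligned < shuffled)
```

In this toy setup the loss is near zero when each image feature lines up with its own text feature and grows large when the pairs are swapped, which is the property a contrastive objective relies on to pull matching image and text representations together.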