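Computer science
Generator (circuit theory)
Artificial intelligence
Computed tomography
Task (project management)
Pattern recognition (psychology)
Computer vision
Radiology
Medicine
Quantum mechanics
Physics
Economics
Power (physics)
Management
Authors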
Yuhao Tang,Hsiu‐Chiung Yang,Liyan Zhang,Ye Yuan
Identifier
DOI:10.1016/j.eswa.2023.121442
Abstract
Computed Tomography Report Generation (CTRG) aims to generate a medical report for a series of radiological images. It extends conventional X-ray report generation, which produces one medical description from a single X-ray snapshot. Beyond the difficulties of the traditional task, CTRG requires the model to identify the lesion regions in sequential scans and to produce a fine-grained report that conforms to medical logic and common sense. Because available datasets are limited, few methods have tried to tackle this task. In addition, although densely aggregating sequential features may be beneficial, it introduces extra noise. Moreover, radiology reports are long narratives composed of abnormal descriptions and template sentences, but most studies ignore this hierarchical nature and generate the entire report uniformly. This paper aims to bridge the gap from three distinct perspectives. First, we develop two large-scale clinical datasets, CTRG-Brain-263K and CTRG-Chest-548K, which contain 263,670 brain CT scans and 548,696 chest CT scans, respectively, with authoritative diagnosis reports. Second, we design a self-attention-based Scan Localizer (SL) that captures the representation most reflective of the lesion area, and we introduce a reconstruction loss to minimize the distance between the focused and original scans. Finally, we propose a Dynamic Generator (DG) that decouples the decoder into abnormal and template branches, whose proposals are dynamically aggregated for the final generation. Experimental results confirm that the proposed SL-DG outperforms existing methods, by about +5.2% and +0.4% CIDEr points on CTRG-Brain-263K and CTRG-Chest-548K, respectively.
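The abstract gives no implementation details, so the following is only a rough sketch of the two ideas it names, not the authors' SL-DG model. It assumes each scan is already encoded as a feature vector, replaces full self-attention with simple attention pooling against a hypothetical query vector, takes the reconstruction loss to be a mean squared distance between the focused vector and the mean of the original scan features, and models dynamic aggregation as a convex mix of two word distributions controlled by a hypothetical scalar gate. All function and parameter names are illustrative.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def localize(scan_feats, query):
    """Attention-pool per-scan feature vectors into one focused vector.

    scan_feats: list of T feature vectors (one per scan), each of length d.
    query: hypothetical query vector of length d (an assumption, not from the paper).
    Returns (focused_vector, attention_weights).
    """
    d = len(query)
    scores = [sum(q * x for q, x in zip(query, f)) / math.sqrt(d) for f in scan_feats]
    weights = softmax(scores)
    focused = [sum(w * f[j] for w, f in zip(weights, scan_feats)) for j in range(d)]
    return focused, weights


def reconstruction_loss(focused, scan_feats):
    """Assumed form: mean squared distance between the focused vector
    and the mean of the original scan features."""
    n = len(scan_feats)
    d = len(focused)
    mean = [sum(f[j] for f in scan_feats) / n for j in range(d)]
    return sum((a - b) ** 2 for a, b in zip(focused, mean)) / d


def aggregate(p_abnormal, p_template, gate):
    """Assumed form: convex mix of the two branch word distributions,
    with gate in [0, 1] weighting the abnormal branch."""
    return [gate * a + (1.0 - gate) * t for a, t in zip(p_abnormal, p_template)]


if __name__ == "__main__":
    feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
    focused, weights = localize(feats, query=[2.0, 0.0])
    print("weights:", weights)
    print("loss:", reconstruction_loss(focused, feats))
    print("mixed:", aggregate([0.7, 0.2, 0.1], [0.1, 0.1, 0.8], gate=0.25))
```

In this toy setup the attention weights concentrate on the scans whose features align with the query, while the reconstruction term penalizes a focused vector that drifts far from the overall scan content; a real model would learn the query, the gate, and the encoders jointly.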