Knowledge-Enhanced Medical Visual Question Answering: A Survey (Invited Talk Summary)
Computer science
Information retrieval
Question answering
World Wide Web
Data science
Authors
Haofen Wang, Huifang Du
Source
Journal: Communications in Computer and Information Science; Date: 2023-01-01; Volume/Issue: : 3-9
Identifier
DOI:10.1007/978-981-99-1354-1_1
Abstract
Medical Visual Question Answering (Med-VQA) is a task in the field of Artificial Intelligence in which a medical image is given together with a related question, and the goal is to provide an accurate answer to that question. It involves the integration of computer vision, natural language processing, and medical domain knowledge. Furthermore, incorporating medical knowledge in Med-VQA can improve the reasoning ability and accuracy of the answers. While knowledge-enhanced Visual Question Answering (VQA) in the general domain has been widely researched, medical VQA requires further examination due to its unique features. In this paper, we survey and analyze the current publicly accessible Med-VQA datasets with external knowledge. We also critically review the key technologies that combine knowledge with Med-VQA tasks, in terms of their advancements and limitations. Finally, we discuss the existing challenges and future directions for Med-VQA.