M2FNet: Multi-granularity Feature Fusion Network for Medical Visual Question Answering

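Computer science, Question answering, Artificial intelligence, Relation (database), Feature (linguistics), Granularity, Feature extraction, Machine learning, Semantics (computer science), Task (project management), Convolutional neural network, Information retrieval, Deep learning, Natural language processing, Data mining, Philosophy, Operating system, Economics, Management, Programming language, Linguistics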
Authors
He Wang,Haiwei Pan,Kejia Zhang,Shuning He,Chunling Chen
Source
Journal: Lecture Notes in Computer Science, Pages: 141-154, Cited by: 4
Identifier
DOI:10.1007/978-3-031-20865-2_11
Abstract

Medical visual question answering (VQA) combines medical artificial intelligence with visual question answering and is a complex multimodal task. Its purpose is to produce accurate answers from images and questions, helping patients understand their own conditions and giving doctors decision-making options. Although advances in computer vision (CV) and natural language processing (NLP) have driven great progress in medical VQA, challenges remain because of the characteristics of the medical domain. First, using a meta-learning model for image feature extraction can accelerate the convergence of medical VQA models, but the extracted features contain varying degrees of noise, which degrades feature fusion and therefore model accuracy. Second, existing medical VQA methods mine the relation between medical images and questions at only a single granularity, or focus on relations within the question, and so cannot comprehensively capture the relation between medical images and questions. We therefore propose a novel multi-granularity medical VQA model. On the one hand, we apply multiple meta-learning models and a convolutional denoising autoencoder for image feature extraction, and then optimize the extracted image features using an attention mechanism. On the other hand, we represent the question features at three granularities (words, phrases, and sentences), propose a keyword filtering module that obtains keywords at the word granularity, and use stacked attention modules at the different granularities to fuse the question features with the image features, mining the relation from multiple granularities. Experimental results on the VQA-RAD dataset show that the proposed method outperforms existing meta-learning medical VQA methods, improving overall accuracy by 1.8% over MMQ, and that it is particularly advantageous for long questions.
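
The fusion step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes plain dot-product attention, a residual query update between hops, and simple concatenation of the three granularity outputs, none of which the abstract specifies. All function names and the toy vectors are hypothetical, and the keyword filtering and denoising autoencoder stages are omitted.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, regions):
    """One attention hop: weight image regions by dot-product similarity to the query."""
    scores = [sum(q * r for q, r in zip(query, region)) for region in regions]
    weights = softmax(scores)
    attended = [
        sum(w * region[i] for w, region in zip(weights, regions))
        for i in range(len(query))
    ]
    return attended, weights


def stacked_attention(query, regions, hops=2):
    """Refine the query over several hops by adding the attended image vector (assumed update rule)."""
    for _ in range(hops):
        attended, _ = attend(query, regions)
        query = [q + a for q, a in zip(query, attended)]
    return query


def fuse_multi_granularity(question_feats, regions, hops=2):
    """Run stacked attention once per granularity and concatenate the results (assumed fusion)."""
    fused = []
    for name in ("word", "phrase", "sentence"):
        fused.extend(stacked_attention(question_feats[name], regions, hops))
    return fused


# Toy data: three image regions and one question vector per granularity, all 2-dimensional.
regions = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
question_feats = {
    "word": [2.0, 0.0],
    "phrase": [0.0, 2.0],
    "sentence": [1.0, 1.0],
}
fused = fuse_multi_granularity(question_feats, regions)
print(len(fused))
```

In this toy setup each granularity attends to the same image regions with its own query, so a word-level query and a sentence-level query can emphasize different regions before their outputs are combined.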