Convolution-Enhanced Bi-Branch Adaptive Transformer With Cross-Task Interaction for Food Category and Ingredient Recognition

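Computer science, Transformer, Artificial intelligence, Ingredient, Machine learning, Convolutional neural network, Pattern recognition (psychology), Data mining, Chemistry, Food science, Voltage, Physics, Quantum mechanics
Authors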
Yuxin Liu,Weiqing Min,Shuqiang Jiang,Yong Rui
Source
Journal: IEEE Transactions on Image Processing [Institute of Electrical and Electronics Engineers]
Volume and pages: 33: 2572-2586 Cited by: 4
Identifier
DOI:10.1109/tip.2024.3374211
Abstract

Recently, visual food analysis has received increasing attention in the computer vision community due to its wide range of applications, e.g., dietary nutrition management, smart restaurants, and personalized diet recommendation. Because food images are unstructured, with complex and unfixed visual patterns, mining food-related semantic-aware regions is crucial. Furthermore, the ingredients in a food image are semantically related to each other through cooking habits and have significant semantic relationships with food categories under the hierarchical food classification ontology. Therefore, modeling the long-range semantic relationships between ingredients and the category-ingredient semantic interactions benefits ingredient recognition and food analysis. Taking these factors into consideration, we propose a multi-task learning framework for food category and ingredient recognition. This framework mainly consists of a food-oriented Transformer named Convolution-Enhanced Bi-Branch Adaptive Transformer (CBiAFormer) and a multi-task category-ingredient recognition network called Structural Learning and Cross-Task Interaction (SLCI). To capture the complex and unfixed fine-grained patterns of food images, we propose a query-aware, data-adaptive attention mechanism called Bi-Branch Adaptive Attention (BiA-Attention) in CBiAFormer, which consists of a local fine-grained branch and a global coarse-grained branch that mine local and global semantic-aware regions for different input images by adaptively assigning candidate key/value sets to each query. Additionally, a convolutional patch embedding module is proposed to extract the fine-grained features that Transformers neglect.
To fully utilize the ingredient information, we propose SLCI, which consists of cross-layer attention to model the semantic relationships between ingredients and two cross-task interaction modules to mine the semantic interactions between categories and ingredients. Extensive experiments show that our method achieves competitive performance on three mainstream food datasets (ETH Food-101, Vireo Food-172, and ISIA Food-200). Visualization analyses of CBiAFormer and SLCI on the two tasks demonstrate the effectiveness of our method. Code will be released upon publication.
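The abstract does not specify how BiA-Attention assigns candidate key/value sets, so the following is only an illustrative sketch of the general idea of giving each query its own candidate set. It uses generic top-k sparse attention, which is a swapped-in technique and not the paper's mechanism. The function name `topk_attention`, the parameter `k`, and the dot-product selection rule are all assumptions made for illustration.

```python
import math


def topk_attention(queries, keys, values, k):
    """Hypothetical sketch: each query attends only to its k best-scoring keys.

    This is plain top-k sparse attention, NOT the paper's BiA-Attention;
    the selection rule (scaled dot product) is an assumption.
    """
    scale = 1.0 / math.sqrt(len(keys[0]))
    dim = len(values[0])
    outputs = []
    for q in queries:
        # Scaled dot-product score of this query against every key.
        scores = [scale * sum(qi * ki for qi, ki in zip(q, key)) for key in keys]
        # Per-query candidate set: indices of the k largest scores.
        chosen = sorted(range(len(keys)), key=lambda i: scores[i], reverse=True)[:k]
        # Softmax restricted to the candidate set (max-subtracted for stability).
        top = max(scores[i] for i in chosen)
        weights = [math.exp(scores[i] - top) for i in chosen]
        total = sum(weights)
        # Weighted sum of the selected values only.
        out = [
            sum(w / total * values[i][d] for w, i in zip(weights, chosen))
            for d in range(dim)
        ]
        outputs.append(out)
    return outputs


# Toy usage: one query, three key/value pairs, keep the two best keys.
result = topk_attention(
    queries=[[1.0, 0.0]],
    keys=[[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]],
    values=[[1.0, 0.0], [0.0, 1.0], [5.0, 5.0]],
    k=2,
)
```

In this toy call the third key points away from the query, so it falls outside the candidate set and its value does not contribute to the output. Setting `k` to the number of keys recovers ordinary dense softmax attention.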