Computer science
Relation (database)
Embedding
Artificial intelligence
Clothing
Feature (linguistics)
Perspective (graphics)
Spatial relation
Pattern recognition (psychology)
Task (project management)
Image (mathematics)
Machine learning
Data mining
Linguistics
Philosophy
Management
Archaeology
Economics
History
Authors
Shumin Zhu,Xingxing Zou,Jianjun Qian,Wai Keung Wong
Identifier
DOI:10.1109/tmm.2023.3284593
Abstract
Fashion attribute recognition is a long-standing topic and a core task in understanding fashion from the perspective of computer vision. This paper proposes a structured relation-aware network (sRA-Net), which exploits multiple hidden relations in fashion images to learn richer and more accurate attribute representations and thereby boost the performance of fashion attribute recognition. Specifically, it deconstructs the features of a clothing fashion item into three levels: low-level attribute-related image region information, mid-level attribute dependency information, and high-level clothing look information. To learn these multi-relational embeddings, we present three relation-aware attention mechanisms. The attribute attention mechanism describes the relationships among different attribute vectors through self-attention and uses the attention map to update the attribute embeddings. The spatial attention mechanism then associates each attribute with the image features and enhances the attribute embedding by leveraging the attribute-related image regions. Finally, the channel attention mechanism selects attribute-related image feature channels to obtain a more fine-grained attribute embedding. Furthermore, we introduce a structure-aware embedding that constrains attribute recognition from a global perspective by identifying the inner structure of the clothing. Without bells and whistles, sRA-Net outperforms all state-of-the-art attribute recognition methods on two mainstream fashion attribute datasets, the DeepFashion-C dataset and the iFashion-Attribute dataset, with improvements of 1%-3%.
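The attribute attention mechanism described in the abstract can be illustrated with a minimal NumPy sketch of scaled dot-product self-attention over a set of attribute embeddings. All names and shapes here are illustrative assumptions, not the paper's actual implementation: each attribute vector attends to all others, and the resulting attention map is used to produce a residual update of the embeddings.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attribute_self_attention(A, Wq, Wk, Wv):
    """Update attribute embeddings A of shape (n_attr, d) via self-attention.

    A stand-in for the attribute attention mechanism: the (n_attr, n_attr)
    attention map captures pairwise attribute relationships and is used to
    refine each attribute embedding with a residual update.
    """
    Q, K, V = A @ Wq, A @ Wk, A @ Wv
    attn = softmax(Q @ K.T / np.sqrt(K.shape[-1]))  # rows sum to 1
    return A + attn @ V

rng = np.random.default_rng(0)
d = 8
A = rng.normal(size=(5, d))  # 5 hypothetical attribute embeddings
Wq, Wk, Wv = (0.1 * rng.normal(size=(d, d)) for _ in range(3))
A_new = attribute_self_attention(A, Wq, Wk, Wv)
print(A_new.shape)  # (5, 8)
```

The spatial and channel attention mechanisms would differ mainly in what the keys and values range over (image regions and feature channels, respectively, rather than other attributes).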