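Computer science
Representation (politics)
Convolutional neural network
Artificial intelligence
Pattern recognition (psychology)
Focus (optics)
Feature (linguistics)
Sentence
Matching (statistics)
Similarity (geometry)
Class (philosophy)
Mathematics
Image (mathematics)
Linguistics
Philosophy
Physics
Statistics
Optics
Politics
Political science
Law
Authors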
Xuyang Wang,Yajun Du,Danroujing Chen,Xianyong Li,Xiaoliang Chen,Yan-Li Lee,Jia Liu
Identifier
DOI:10.1016/j.eswa.2023.120124
Abstract
Prototypical networks are a key approach to few-shot problems. Previous prototypical-network-based methods average the sentence embeddings of the same class to obtain the corresponding class representation. However, this simple averaging fails to model the importance of word-level information to the class representation effectively, thus limiting the quality of the prototype. In this work, we propose a 3D CNN based 3D Convolution Prototypical Network (3DCPN), which is mainly composed of two parts. First, to focus more effectively on the importance of word-level information from the prototype perspective, we use a 3D CNN to process the word embeddings of the same class. 3D CNNs are well suited to capturing semantic correlations across multiple objects, so we use them in place of averaging to generate better class representations. Second, we construct a 2D semantic mining layer as the second part of 3DCPN to extract deep features from query embeddings. A symmetric model structure is designed to ensure feature matching between the class representation and the query representation. We then obtain the similarity between the prototype representation and the query representation with a metric function. Based on the resulting similarity matrix, we introduce a temperature-coefficient-based cross-entropy as the objective function to optimize our model. Extensive experiments are conducted on four benchmarks. The results show that our model outperforms LaSAML by 1.88% and 2.28% on Banking77 under the 10-way 5-shot and 15-way 5-shot settings, respectively. For the other baselines, 3DCPN achieves average improvements of 4.90%, 4.53% and 8.81% on Clinc150, Hwu64 and Liu57, respectively.
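The averaging baseline and the temperature-scaled objective mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' 3DCPN: it shows only the plain prototype-averaging step that 3DCPN replaces, followed by a similarity matrix and a temperature-scaled cross-entropy. The use of NumPy, the negative squared Euclidean distance as the metric, the temperature value of 0.5, the function names, and the synthetic embeddings are all assumptions made for illustration.

```python
import numpy as np

def class_prototypes(support, labels, n_classes):
    """Baseline prototype: mean of the support embeddings of each class."""
    return np.stack([support[labels == c].mean(axis=0) for c in range(n_classes)])

def similarity_matrix(queries, prototypes):
    """Negative squared Euclidean distance (assumed metric), shape (n_query, n_classes)."""
    diff = queries[:, None, :] - prototypes[None, :, :]
    return -(diff ** 2).sum(axis=-1)

def temperature_cross_entropy(sim, targets, temperature=0.5):
    """Cross-entropy over similarities divided by a temperature (assumed value)."""
    logits = sim / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(targets)), targets].mean()

# Synthetic 3-way 5-shot episode: embeddings are random vectors, not real sentences.
rng = np.random.default_rng(0)
n_way, k_shot, dim = 3, 5, 8
centers = rng.normal(size=(n_way, dim)) * 5.0
support_labels = np.repeat(np.arange(n_way), k_shot)
support = centers[support_labels] + rng.normal(size=(n_way * k_shot, dim)) * 0.1
query_labels = np.arange(n_way)
queries = centers + rng.normal(size=(n_way, dim)) * 0.1

prototypes = class_prototypes(support, support_labels, n_way)
sim = similarity_matrix(queries, prototypes)
loss = temperature_cross_entropy(sim, query_labels)
predictions = sim.argmax(axis=1)
```

In this sketch a smaller temperature sharpens the softmax over the similarity matrix, so well-separated classes are penalized less and confusable ones more. How 3DCPN chooses its metric function and temperature is not stated in the abstract.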