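Computer science
Dialog box
Generative grammar
Component (thermodynamics)
Dual (grammatical number)
Context (archaeology)
Artificial intelligence
Task (project management)
Natural language processing
Selection (genetic algorithm)
Generative model
Dialog system
Human-computer interaction
Linguistics
Engineering
World Wide Web
Paleontology
Biology
Philosophy
Systems engineering
Physics
Thermodynamics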
Authors
Xiaolin Chen,Xuemeng Song,Liqiang Jing,Shuo Li,Linmei Hu,Liqiang Nie
Source
Journal: ACM Transactions on Information Systems
Date: 2023-10-06
Volume (Issue): 42 (2): 1-25
Citations: 8
Abstract
Text response generation for multimodal task-oriented dialog systems, which aims to generate the proper text response given the multimodal context, is an essential yet challenging task. Although existing efforts have achieved compelling success, they still suffer from two pivotal limitations: (1) they overlook the benefit of generative pretraining and (2) they ignore the textual context-related knowledge. To address these limitations, we propose a novel dual knowledge-enhanced generative pretrained language model for multimodal task-oriented dialog systems (DKMD), consisting of three key components: dual knowledge selection, dual knowledge-enhanced context learning, and knowledge-enhanced response generation. To be specific, the dual knowledge selection component aims to select the related knowledge according to both textual and visual modalities of the given context. Thereafter, the dual knowledge-enhanced context learning component targets seamlessly integrating the selected knowledge into the multimodal context learning from both global and local perspectives, where the cross-modal semantic relation is also explored. Moreover, the knowledge-enhanced response generation component comprises a revised BART decoder, where an additional dot-product knowledge-decoder attention sub-layer is introduced for explicitly utilizing the knowledge to advance the text response generation. Extensive experiments on a public dataset verify the superiority of the proposed DKMD over state-of-the-art competitors.
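The dot-product knowledge-decoder attention mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the sub-layer's exact form, so everything below is an assumption for illustration only: the function names (`knowledge_attention`, `add_residual`) are hypothetical, the knowledge vectors serve as both keys and values, there is a single head with no learned projections, and the standard scaled dot-product formulation with a residual connection is used. It is not the authors' implementation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def knowledge_attention(decoder_states, knowledge_vectors):
    """Scaled dot-product attention from decoder states to knowledge entries.

    decoder_states: list of d-dimensional query vectors, one per target token.
    knowledge_vectors: list of d-dimensional vectors, one per knowledge entry,
        used as both keys and values (a simplifying assumption).
    Returns (attended, weights): one context vector and one weight list per query.
    """
    dim = len(knowledge_vectors[0])
    scale = math.sqrt(dim)
    attended, all_weights = [], []
    for query in decoder_states:
        # Similarity between this decoder state and every knowledge entry.
        scores = [sum(q * k for q, k in zip(query, key)) / scale
                  for key in knowledge_vectors]
        weights = softmax(scores)
        # Weighted sum of the knowledge vectors.
        context = [sum(w * value[i] for w, value in zip(weights, knowledge_vectors))
                   for i in range(dim)]
        attended.append(context)
        all_weights.append(weights)
    return attended, all_weights


def add_residual(decoder_states, attended):
    """Residual connection, as is conventional around Transformer sub-layers."""
    return [[h + a for h, a in zip(state, ctx)]
            for state, ctx in zip(decoder_states, attended)]


if __name__ == "__main__":
    # Toy inputs: two decoder positions, three knowledge entries, dimension 2.
    states = [[1.0, 0.0], [0.0, 1.0]]
    knowledge = [[2.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
    attended, weights = knowledge_attention(states, knowledge)
    print(add_residual(states, attended))
```

In this toy setup each decoder position places the most weight on the knowledge entry it is most aligned with, which is the sense in which such a sub-layer lets the decoder draw on selected knowledge explicitly during generation.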