Authors
Fan Yang, Bo Ning, Huaiqing Li
Source
Journal: Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering
Date: 2022-01-01
Pages: 259-268
Identifier
DOI: 10.1007/978-3-031-23902-1_20
Abstract
With the rapid development of modern science and technology, information sources have become more widely available and more diverse in form, generating widespread interest in multimodal learning. Because humans draw on many types of information to understand the world and perceive objects, a single modality cannot capture everything about a given object or phenomenon. Multimodal fusion learning opens new avenues for deep learning tasks, enabling approaches to many real-world problems that are closer to how humans perceive them. A key challenge in multimodal learning today is how to fuse multimodal features efficiently while preserving the integrity of each modality's information and minimizing information loss. This paper summarizes the definition and development of multimodality, briefly analyzes the main approaches to multimodal fusion, common models, and current applications, and finally discusses future trends and research directions in light of existing technologies.
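To make the feature-fusion challenge the abstract describes concrete, the sketch below illustrates two common feature-level fusion strategies — concatenation (early fusion) and element-wise averaging. This is a minimal illustration under assumed toy feature vectors; it is not the paper's method, and all names here are hypothetical.

```python
# Two simple feature-level fusion strategies (illustrative sketch,
# not the method proposed in the paper).

def concat_fusion(feat_a, feat_b):
    """Early fusion: concatenate per-modality feature vectors.

    Preserves all information from both modalities, but doubles
    the dimensionality of the fused representation.
    """
    return feat_a + feat_b  # list concatenation

def average_fusion(feat_a, feat_b):
    """Additive fusion: element-wise mean of equal-length vectors.

    Keeps dimensionality fixed, but mixing features can lose
    modality-specific information -- the trade-off the abstract notes.
    """
    assert len(feat_a) == len(feat_b), "modalities must share dimension"
    return [(a + b) / 2 for a, b in zip(feat_a, feat_b)]

# Hypothetical embeddings for a text modality and an image modality.
text_feat = [0.2, 0.5, 0.1]
image_feat = [0.4, 0.1, 0.3]

fused_concat = concat_fusion(text_feat, image_feat)   # length 6
fused_avg = average_fusion(text_feat, image_feat)     # length 3
```

In practice such fusion is applied to learned embeddings (e.g. outputs of per-modality encoders), and the choice between concatenation and mixing reflects exactly the integrity-versus-efficiency tension the abstract highlights.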