Computer science
Pattern
Artificial intelligence
Machine learning
Blueprint
Classification
Deep learning
Graph
Theoretical computer science
Data science
Mechanical engineering
Social science
Sociology
Engineering
Authors
Yasha Ektefaie, George Dasoulas, Ayush Noori, Maha Farhat, Marinka Žitnik
Identifier
DOI:10.1038/s42256-023-00624-6
Abstract
Artificial intelligence for graphs has achieved remarkable success in modelling complex systems, ranging from dynamic networks in biology to interacting particle systems in physics. However, increasingly heterogeneous graph datasets call for multimodal methods that can combine different inductive biases, the assumptions that algorithms use to make predictions for inputs they have not encountered during training. Learning on multimodal datasets is challenging because the inductive biases can vary by data modality and graphs might not be explicitly given in the input. To address these challenges, graph artificial intelligence methods combine different modalities while leveraging cross-modal dependencies through geometric relationships. Diverse datasets are combined using graphs and fed into sophisticated multimodal architectures, specified as image-intensive, knowledge-grounded and language-intensive models. Using this categorization, we introduce a blueprint for multimodal graph learning, use it to study existing methods and provide guidelines to design new models.

One of the main advances in deep learning in the past five years has been graph representation learning, which has enabled applications to problems with underlying geometric relationships. Increasingly, such problems involve multiple data modalities; examining over 160 studies in this area, Ektefaie et al. propose a general framework for multimodal graph learning for image-intensive, knowledge-grounded and language-intensive problems.
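The abstract describes fusing modality-specific representations on a shared graph so that cross-modal information propagates over edges. The snippet below is a minimal NumPy sketch of that general idea, not the authors' blueprint: the toy graph, the feature dimensions, and the linear projections standing in for modality encoders are all hypothetical assumptions made for illustration.

```python
# Minimal sketch of multimodal graph learning: project two modalities into a
# shared space, then run one round of neighbourhood aggregation.
# All names, shapes and the toy graph are hypothetical, not from the paper.
import numpy as np

rng = np.random.default_rng(0)

# Toy graph: 4 nodes, adjacency matrix with self-loops.
A = np.array([[1, 1, 0, 0],
              [1, 1, 1, 0],
              [0, 1, 1, 1],
              [0, 0, 1, 1]], dtype=float)
A = A / A.sum(axis=1, keepdims=True)  # row-normalize for neighbourhood averaging

# Each node carries two modalities of different dimensionality,
# e.g. an image embedding and a text embedding.
x_image = rng.normal(size=(4, 8))    # hypothetical image features
x_text = rng.normal(size=(4, 16))    # hypothetical text features

# Modality-specific encoders would carry each modality's inductive bias;
# random linear maps into a shared 4-dimensional space stand in for them here.
W_image = rng.normal(size=(8, 4))
W_text = rng.normal(size=(16, 4))
h = np.tanh(x_image @ W_image) + np.tanh(x_text @ W_text)  # fused node states

# One message-passing step: each node mixes its fused representation with its
# neighbours', so cross-modal information flows along the graph's edges.
h_next = np.maximum(A @ h, 0.0)  # ReLU after aggregation
print(h_next.shape)  # (4, 4): one fused embedding per node
```

In a real model the linear maps would be learned encoders (for example, a vision model for images and a language model for text), and the aggregation step would be a trainable graph neural network layer stacked several times.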