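Matching (statistics)
Computer science
Information retrieval
Document clustering
Cluster analysis
Field (mathematics)
Natural language processing
Artificial intelligence
Mathematics
Statistics
Pure mathematics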
Authors
Yaokai Cheng,Ruoyu Chen,Xiaoguang Yuan,Yuting Yang,Shan Jiang,Bo Yang
Source
Journal: Journal of Physics: Conference Series
[IOP Publishing]
Date: 2022-01-01
Volume (issue): 2171 (1): 012059
Identifier
DOI:10.1088/1742-6596/2171/1/012059
Abstract
Long-form document matching is an important research direction in natural language processing, with applications in tasks such as news recommendation and text clustering. However, it suffers from the noise and sparsity of semantic information in long text, and applying short-form document matching methods to long-form problems gives unsatisfactory results. The problem has attracted the attention of researchers, who have proposed many effective methods. These methods can be divided into three categories: traditional bag-of-words-based models, traditional deep learning-based models, and pre-training-based models. This study reviews typical methods of long-form document matching, analyzes their advantages and disadvantages, and discusses possible future developments.