Natural Language Processing
Computer Science
Linguistics
Artificial Intelligence
Psychology
Philosophy
Authors
Weize Chen, Han Xu, Yankai Lin, Kan He, Rong Xie, Jie Zhou, Zhiyuan Liu, Maosong Sun
Source
Journal: IEEE/ACM Transactions on Audio, Speech, and Language Processing
[Institute of Electrical and Electronics Engineers]
Date: 2024-01-01
Pages: 1-13
Identifiers
DOI:10.1109/taslp.2024.3407575
Abstract
In recent years, we have witnessed significant improvements in pre-trained language models (PLMs) brought about by the scaling of parameter sizes and data amounts. However, this also brings high computational and storage costs. In this paper, we present a new direction to improve PLMs without scaling parameters and data: adopting a geometric feature space that is more suitable for encoding the intrinsic structured features of text. Although text is generally considered unstructured data, it possesses rich intrinsic structured features that signify syntactic and semantic relationships. Leveraging these structured features is vital for text understanding. Given that structured features are better encoded in hyperbolic spaces than in the Euclidean spaces used by conventional PLMs, we propose that PLMs should operate entirely within hyperbolic spaces. Our experiments demonstrate the superiority of hyperbolic PLMs over Euclidean PLMs across a wide variety of tasks, using the same parameter and data settings. This suggests that altering the geometry of model representation is a promising direction for model enhancement. The code is released at https://github.com/thunlp/hyperbolic_llm
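To illustrate the abstract's central claim that hyperbolic geometry suits hierarchical, tree-like structure, the sketch below (not taken from the paper's released code; a standalone illustration) computes distances in the Poincaré ball, a common model of hyperbolic space. Distances blow up near the boundary of the unit ball, giving the exponentially growing "room" that makes embedding trees with low distortion possible.

```python
import math

def poincare_distance(u, v):
    """Geodesic distance in the Poincare ball model of hyperbolic space.

    d(u, v) = arcosh(1 + 2*||u - v||^2 / ((1 - ||u||^2) * (1 - ||v||^2)))
    Both points must lie strictly inside the unit ball (||x|| < 1).
    """
    sq = lambda x: sum(xi * xi for xi in x)          # squared Euclidean norm
    diff = [ui - vi for ui, vi in zip(u, v)]
    num = 2.0 * sq(diff)
    den = (1.0 - sq(u)) * (1.0 - sq(v))
    return math.acosh(1.0 + num / den)

# Moving a point toward the boundary increases its distance from the
# origin far faster than its Euclidean norm grows.
origin = [0.0, 0.0]
near = [0.5, 0.0]
far = [0.95, 0.0]
print(poincare_distance(origin, near))  # ~1.0986 (= ln 3)
print(poincare_distance(origin, far))   # ~3.6636 (= 2 * artanh 0.95)
```

A tree's root can sit near the origin while its exponentially many leaves spread toward the boundary, with all parent-child distances roughly equal; in Euclidean space, no comparable low-dimensional arrangement exists. This is the geometric intuition behind replacing the Euclidean feature space of a conventional PLM.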