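Poetry
Computer science
Classical Chinese
Chinese poetry
Construct (Python library)
Classical Chinese poetry
Theme (computing)
Artificial intelligence
Literature
Natural language processing
Linguistics
Philosophy
Art
World Wide Web
Programming language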
Authors
Jiaqi Zhao, Ting Bai, Yuting Wei, Bin Wu
Source
Journal: Communications in Computer and Information Science
Date: 2022-01-01
Pages: 369-384
Citations: 5
Identifier
DOI: 10.1007/978-981-19-8991-9_26
Abstract
Classical Chinese poetry has a history of thousands of years and is a precious cultural heritage of humankind. Compared with the modern Chinese corpus, it is irrecoverable and specially organized, which makes it difficult for existing pre-trained language models to learn. Moreover, over thousands of years of development, many words in classical Chinese poetry have changed their meanings or fallen out of use, which further limits the ability of existing pre-trained models to learn the semantics of classical Chinese poetry. To address these challenges, we construct a large-scale sememe knowledge graph of classical Chinese poetry (SKG-Poetry), which connects the vocabulary of classical Chinese poetry with that of modern Chinese. By extracting sememe knowledge from classical Chinese poetry, our model PoetryBERT not only enlarges the irrecoverable pre-training corpus but also enriches the semantics of the vocabulary in classical Chinese poetry, which enables PoetryBERT to be used successfully in downstream tasks. Specifically, we evaluate our model on two tasks in the field of classical Chinese poetry: poetry theme classification and translation of poetry into modern Chinese. Extensive experiments on the two tasks show the effectiveness of the sememe-knowledge-based pre-trained model.