Computer Science
Natural Language Processing
Machine Translation
Artificial Intelligence
Authors
Longyue Wang,Chenyang Lyu,Tianbo Ji,Zhirui Zhang,Dian Yu,Shuming Shi,Zhaopeng Tu
Source
Journal: Cornell University - arXiv
Date: 2023-01-01
Citations: 7
Identifier
DOI: 10.48550/arxiv.2304.02210
Abstract
Large language models (LLMs) such as ChatGPT can produce coherent, cohesive, relevant, and fluent answers for various natural language processing (NLP) tasks. Taking document-level machine translation (MT) as a testbed, this paper provides an in-depth evaluation of LLMs' ability on discourse modeling. The study focuses on three aspects: 1) Effects of Context-Aware Prompts, where we investigate the impact of different prompts on document-level translation quality and discourse phenomena; 2) Comparison of Translation Models, where we compare the translation performance of ChatGPT with commercial MT systems and advanced document-level MT methods; 3) Analysis of Discourse Modeling Abilities, where we further probe discourse knowledge encoded in LLMs and shed light on the impact of training techniques on discourse modeling. By evaluating on a number of benchmarks, we surprisingly find that LLMs have demonstrated superior performance and show potential to become a new paradigm for document-level translation: 1) leveraging their powerful long-text modeling capabilities, GPT-3.5 and GPT-4 outperform commercial MT systems in terms of human evaluation; 2) GPT-4 demonstrates a stronger ability for probing linguistic knowledge than GPT-3.5. This work highlights the challenges and opportunities of LLMs for MT, which we hope can inspire the future design and evaluation of LLMs. We release our data and annotations at https://github.com/longyuewangdcu/Document-MT-LLM.
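The first study aspect above, context-aware prompting, amounts to presenting the model with the whole document (or a window of preceding sentences) in a single prompt rather than translating sentences in isolation. A minimal sketch of such a prompt builder follows; the function name, prompt wording, and sentence-numbering scheme are illustrative assumptions, not the paper's exact prompts (those are in the linked repository).

```python
def build_context_aware_prompt(sentences, src_lang="Chinese", tgt_lang="English"):
    """Assemble one document-level prompt so the model sees cross-sentence
    context, which is what enables discourse phenomena such as coreference,
    ellipsis, and lexical cohesion to be handled consistently.

    NOTE: the instruction wording here is a hypothetical example, not the
    exact prompt used in the paper.
    """
    # Number the sentences so the model can return aligned translations.
    doc = "\n".join(f"[{i + 1}] {s}" for i, s in enumerate(sentences))
    return (
        f"Translate the following {src_lang} document into {tgt_lang}, "
        f"keeping pronouns, tense, and terminology consistent across "
        f"sentences. Preserve the [n] sentence markers in the output.\n\n"
        f"{doc}"
    )


# Example: two sentences whose second pronoun depends on the first.
prompt = build_context_aware_prompt(["他打开了门。", "然后他走了进去。"])
print(prompt)
```

The contrast with a sentence-level baseline is that each sentence would be sent in its own prompt, so the model could not resolve "他" consistently across the two lines; the document-level prompt makes that context available in one shot.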