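Computer science
Machine translation
Word (group theory)
Encoder
Transformer
Natural language processing
Artificial intelligence
Translation (biology)
Context (archaeology)
Speech recognition
Linguistics
Messenger RNA
Operating system
Physics
Philosophy
Gene
Paleontology
Biochemistry
Voltage
Chemistry
Biology
Quantum mechanics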
Authors
Dong Seog Han, Junhui Li, Yachao Li, Min Zhang, Guodong Zhou
Source
Journal: ACM Transactions on Asian and Low-Resource Language Information Processing
Date: 2019-07-23
Volume (Issue): 19 (1): 1-17
Citations: 8
Abstract
In this article, we show that word translations can be explicitly incorporated into NMT effectively to avoid wrong translations. Specifically, we propose three cross-lingual encoders to explicitly incorporate word translations into NMT: (1) Factored encoder, which encodes a word and its translation in a vertical way; (2) Gated encoder, which uses a gated mechanism to selectively control the amount of word translations moving forward; and (3) Mixed encoder, which stitchingly learns a word and its translation annotations over sequences where words and their translations are alternately mixed. Besides, we first use a simple word dictionary approach and then a word sense disambiguation (WSD) approach to effectively model the word context for better word translation. Experimentation on Chinese-to-English translation demonstrates that all proposed encoders are able to improve the translation accuracy for both traditional RNN-based NMT and recent self-attention-based NMT (hereafter referred to as Transformer). Specifically, Mixed encoder yields the most significant improvement of 2.0 in BLEU on the RNN-based NMT, while Gated encoder improves 1.2 in BLEU on Transformer. This indicates the usefulness of a WSD approach in modeling word context for better word translation. This also indicates the effectiveness of our proposed cross-lingual encoders in explicitly modeling word translations to avoid wrong translations in NMT. Finally, we discuss in depth how word translations benefit different NMT frameworks from several perspectives.
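
To make the Mixed and Gated encoder ideas concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the abstract gives no equations, so the scalar sigmoid gate, the additive combination, the `<unk>` placeholder, and the function names `mix_with_translations` and `gated_combine` are all hypothetical.

```python
import math


def mix_with_translations(source_words, dictionary, unk="<unk>"):
    """Build a Mixed-encoder-style input: each source word is followed
    by its dictionary translation (or `unk` when none is found)."""
    mixed = []
    for word in source_words:
        mixed.append(word)
        mixed.append(dictionary.get(word, unk))
    return mixed


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gated_combine(word_vec, trans_vec, gate_weights, gate_bias=0.0):
    """Gated-encoder-style combination (assumed form): a scalar gate
    computed from both embeddings scales how much of the translation
    embedding is added to the word embedding."""
    features = word_vec + trans_vec  # list concatenation: [word; translation]
    gate = sigmoid(sum(w * f for w, f in zip(gate_weights, features)) + gate_bias)
    return [w + gate * t for w, t in zip(word_vec, trans_vec)]


if __name__ == "__main__":
    toy_dictionary = {"我": "I", "猫": "cat"}  # toy data, not from the paper
    print(mix_with_translations(["我", "爱", "猫"], toy_dictionary))
    print(gated_combine([1.0, 2.0], [3.0, 4.0], [0.0, 0.0, 0.0, 0.0]))
```

With all gate weights set to zero the gate is exactly 0.5, so half of the translation embedding passes through; in a trained model the weights would be learned so that unreliable translations can be suppressed.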