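Computer science
Artificial intelligence
Natural language processing
Complement (music)
Machine translation
Linguistics
Sentence
Lexical diversity
Vocabulary
Biochemistry
Chemistry
Philosophy
Complementarity
Gene
Phenotype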
Authors
Orphée De Clercq,Rudy Loock,Gert De Sutter,Bert Cappelle,Koen Plevoets
Source
Journal: Le Centre pour la Communication Scientifique Directe - HAL - Diderot
Date: 2020-01-31
Abstract
The aim of this presentation is to discuss the linguistic features of machine-translated texts in comparison with original texts, in order to uncover what has been called “machine translationese” (e.g. Daems et al. 2017). Using a corpus-based statistical approach, namely Principal Component Analysis, four MT systems were investigated for English-to-French translations of press texts: one statistical MT (SMT) system and three neural MT (NMT) systems. The tools involved are DeepL, Google Translate, and the European Commission’s eTranslation MT tool, the last in both its SMT and NMT versions. In particular, to complement a previous study on language-specific features (e.g. derived adverbs, existential constructions, coordinator et, preposition avec; see Loock 2018), a series of language-independent linguistic features was extracted for each text, ranging from superficial text characteristics, such as average word and sentence length, to frequencies of closed-class lexical categories and measures of lexical diversity. The final aim is to uncover linguistic features in MT texts that clearly deviate from the expected norms in original French.
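To make the kind of language-independent features mentioned in the abstract concrete, here is a minimal Python sketch that computes a few of them for a single text. It is not the authors' pipeline: the function name `text_features`, the tiny `FRENCH_FUNCTION_WORDS` list, the naive tokenisation, and the use of a plain type-token ratio as the lexical diversity measure are all assumptions made for illustration.

```python
import re

# Hypothetical, deliberately tiny closed-class word list (illustration only,
# not the feature set used in the study).
FRENCH_FUNCTION_WORDS = {"le", "la", "les", "de", "des", "et", "un", "une", "en", "avec"}


def text_features(text):
    """Return a dict of simple language-independent features for one text."""
    # Naive sentence split: ., ! or ? followed by whitespace or end of text.
    sentences = [s for s in re.split(r"[.!?]+(?:\s+|$)", text) if s.strip()]
    # Naive tokenisation: runs of letters (Unicode-aware), lowercased.
    tokens = [t.lower() for t in re.findall(r"[^\W\d_]+", text)]
    if not tokens or not sentences:
        raise ValueError("text must contain at least one word")
    n = len(tokens)
    return {
        # Superficial text characteristics.
        "avg_word_length": sum(len(t) for t in tokens) / n,
        "avg_sentence_length": n / len(sentences),
        # Type-token ratio as one simple measure of lexical diversity.
        "type_token_ratio": len(set(tokens)) / n,
        # Relative frequency of (assumed) closed-class words.
        "function_word_rate": sum(t in FRENCH_FUNCTION_WORDS for t in tokens) / n,
    }


if __name__ == "__main__":
    print(text_features("Le chat dort. Le chien et le chat jouent."))
```

Computing one such feature vector per text, for originals and for the output of each MT system, would give the kind of matrix that a Principal Component Analysis could then be run on. The PCA step itself is not shown here.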