Computer science
Machine translation
NIST
Natural language processing
Artificial intelligence
Operationalization
Popularity
Machine translation evaluation
Language acquisition
Grading (engineering)
Metric (unit)
Competence (human resources)
Mathematics education
Psychology
Machine translation software usability
Epistemology
Engineering
Social psychology
Philosophy
Civil engineering
Economics
Example-based machine translation
Operations management
Identifier
DOI:10.1080/09588221.2021.1968915
Abstract
The use of translation and interpreting (T&I) in the language learning classroom is commonplace, serving various pedagogical and assessment purposes. Previous utilization of T&I exercises has been driven largely by their potential to enhance language learning, whereas the latest trend has begun to underscore T&I as a crucial skill to be acquired as part of transcultural competence for language learners and future language users. Despite their growing popularity and utility in the language learning classroom, assessing T&I is time-consuming, labor-intensive and cognitively taxing for human raters (e.g., language teachers), primarily because T&I assessment entails meticulous evaluation of informational equivalence between the source-language message and target-language renditions. One possible solution is to rely on automated quality metrics that were originally developed to evaluate machine translation (MT). In the current study, we investigated the viability of using four automated MT evaluation metrics, BLEU, NIST, METEOR and TER, to assess human interpretation. Essentially, we correlated the automated metric scores with the human-assigned scores (i.e., the criterion measure) from multiple assessment scenarios to examine the degree of machine-human parity. Overall, we observed fairly strong metric-human correlations for BLEU (Pearson's r = 0.670), NIST (r = 0.673) and METEOR (r = 0.882), especially when the metric computation was conducted on the sentence level rather than the text level. We discussed these emerging findings and others in relation to the feasibility of operationalizing MT metrics to evaluate students' interpretation in the language learning classroom. Supplemental data for this article is available online at https://doi.org/10.1080/09588221.2021.1968915.
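The metric-human correlation analysis described in the abstract can be illustrated with a short sketch. The code below is not the study's actual pipeline: it assumes NLTK (with the wordnet resource installed) and SciPy, and the reference rendition, student renditions and human scores are invented placeholders. It shows how sentence-level BLEU and METEOR scores for student renditions might be computed against a reference interpretation and then correlated with human ratings via Pearson's r.

```python
# Minimal sketch (assumed setup): compute sentence-level BLEU and METEOR for
# each student rendition against one reference interpretation, then correlate
# the metric scores with human-assigned scores using Pearson's r.
# Requires: nltk (with the 'wordnet' resource downloaded) and scipy.

from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction
from nltk.translate.meteor_score import meteor_score
from scipy.stats import pearsonr

# Hypothetical data: one reference rendition, several student renditions of the
# same source sentence, and human-assigned quality scores (invented values).
reference = "the committee approved the proposal after a lengthy debate".split()
student_renditions = [
    "the committee approved the proposal after long debate",
    "the committee agreed to the plan after debating for a long time",
    "the proposal was accepted by the committee",
]
human_scores = [4.5, 3.5, 2.5]  # e.g., ratings on a hypothetical 1-5 scale

smooth = SmoothingFunction().method1  # smoothing avoids zero BLEU on short sentences

bleu_scores, meteor_scores = [], []
for rendition in student_renditions:
    hyp_tokens = rendition.split()
    bleu_scores.append(sentence_bleu([reference], hyp_tokens, smoothing_function=smooth))
    # meteor_score expects pre-tokenized references and hypothesis (NLTK >= 3.6).
    meteor_scores.append(meteor_score([reference], hyp_tokens))

# Degree of machine-human parity: Pearson correlation between each metric
# and the human criterion measure, computed here at the sentence level.
for name, scores in [("BLEU", bleu_scores), ("METEOR", meteor_scores)]:
    r, p = pearsonr(scores, human_scores)
    print(f"{name}: r = {r:.3f}, p = {p:.3f}")
```

Computing the metrics per sentence, as in this sketch, rather than over a whole concatenated text mirrors the sentence-level setup that the abstract reports as yielding the stronger metric-human correlations.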