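Natural language processing
Profiling (computer programming)
Computer science
Machine translation
Bridging (networking)
Artificial intelligence
Linguistics
Corpus linguistics
Authorship attribution
Attribution
Stylistics
Computational linguistics
Psychology
Philosophy
Operating system
Social psychology
Computer network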
Authors
George Mikros, Dimitris Boumparis
Source
Journal: Digital Scholarship in the Humanities
[Oxford University Press]
Date: 2024-06-05
Volume (issue), pages: 39 (3): 954-967
Abstract
This study explores the feasibility of cross-linguistic authorship attribution and author gender identification using Machine Translation (MT). Computational stylistics experiments were conducted on a Greek blog corpus translated into English using Google’s Neural MT. A Random Forest algorithm was employed for authorship and gender profiling, using different feature groups [Author’s Multilevel N-gram Profiles, quantitative linguistics (QL), and cross-lingual word embeddings (CLWE)] in both original and translated texts. Results indicate that MT is a viable method for converting a multilingual corpus into one language for authorship attribution and gender profiling research, with considerable accuracy when the training and testing datasets are in the same language. In the pure cross-linguistic scenario, CLWE and QL features yielded accuracies above the baselines.