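Computer science
Natural language processing
Artificial intelligence
Sentiment analysis
Word embedding
Embedding
Word (group theory)
German
Cluster analysis
Lexicon
Linguistics
Philosophy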
Authors
Yuelei Xu, Wanze Du, Lirong Hu
Identifier
DOI:10.1007/978-3-031-44693-1_7
Abstract
Unsupervised cross-lingual word embedding (CLWE) aligns monolingual embedding spaces without parallel corpora or bilingual dictionaries. However, it is unclear whether CLWE models tuned for aligning different corpora perform well across NLP tasks. Previous research has shown that unsupervised CLWE tends to assign close embeddings to words that are syntactically similar but opposite in sentiment polarity, leading to poor performance in cross-lingual sentiment analysis (CLSA). This work proposes an Unsupervised Cross-lingual Sentiment word Embedding (UCSentiE) model, which eliminates both the linguistic and the sentiment gaps between two languages. UCSentiE leverages prior sentiment information from the source language and integrates it into CLWE without compromising cross-lingual word semantics. We evaluate UCSentiE on two NLP tasks across six languages (English, Chinese, German, Japanese, French and Spanish). Experimental results demonstrate UCSentiE's stability in bilingual lexicon induction (BLI) and its superiority over the unsupervised VecMap and supervised MUSE models in CLSA, with average F1 score improvements of about 6.53% and 2.23%, respectively. Visualization and clustering analysis further validate the effectiveness of our approach. Code is available at https://github.com/dwzgit/UCSentiE.git.