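Computer science
Word embedding
Task (project management)
Information retrieval
Word (group theory)
Artificial intelligence
Embedding
Social media
Natural language processing
World Wide Web
Linguistics
Philosophy
Economics
Management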
Authors
Na Wei, Shengchuan Zhao, Jing Liu, Shenghui Wang
Identifier
DOI:10.1016/j.elerap.2022.101143
Abstract
Mining user-generated content on e-commerce platforms and social media is a timelier and more objective way to gain competitive intelligence than other information access channels. Identifying comparative text among large volumes of non-comparative text is an important but challenging task. On the one hand, existing methods are time-consuming and do not generalize across domains. On the other hand, datasets for the task generally suffer from severe imbalance. To address these problems, we propose a framework that uses advanced deep learning methods to learn features automatically, together with a novel textual data augmentation method named TA3S to deal with the data imbalance. Specifically, TA3S simultaneously considers the syntactic structure and semantic information of comparative text samples. Moreover, to support TA3S, we develop a novel method based on word embeddings and a label propagation algorithm to distinguish synonymous from antonymous substitute words. Experiments on two real-world datasets demonstrate the feasibility and effectiveness of our framework and show that it outperforms state-of-the-art methods in identifying comparative text from user-generated content.
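The abstract does not spell out how word embeddings and label propagation are combined, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a handful of candidate substitute words, a few seed words with known polarity (+1 for synonym, -1 for antonym), and made-up two-dimensional vectors standing in for real embeddings. The names `TOY_VECTORS`, `SEEDS`, `cosine`, and `propagate_labels` are all hypothetical.

```python
import math

# Hypothetical 2-D vectors standing in for real word embeddings (assumption).
TOY_VECTORS = {
    "great": (1.0, 0.2),
    "excellent": (0.9, 0.3),
    "bad": (0.2, 1.0),
    "terrible": (0.3, 0.9),
}

# Hypothetical seed labels relative to a target word such as "good":
# +1 marks a known synonym, -1 a known antonym.
SEEDS = {"great": 1.0, "bad": -1.0}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def propagate_labels(vectors, seeds, iterations=50):
    """Spread seed scores over a similarity graph of the words.

    Each unlabeled word repeatedly takes the similarity-weighted average
    of the other words' scores; seed words stay clamped to their labels.
    """
    words = list(vectors)
    scores = {w: float(seeds.get(w, 0.0)) for w in words}
    for _ in range(iterations):
        updated = {}
        for w in words:
            if w in seeds:
                updated[w] = float(seeds[w])  # keep seeds fixed
                continue
            # Edge weights: non-negative cosine similarity to every other word.
            weights = {
                o: max(cosine(vectors[w], vectors[o]), 0.0)
                for o in words
                if o != w
            }
            total = sum(weights.values())
            updated[w] = (
                sum(weights[o] * scores[o] for o in weights) / total
                if total
                else 0.0
            )
        scores = updated
    return scores


scores = propagate_labels(TOY_VECTORS, SEEDS)
# The sign of the propagated score decides the predicted relation.
labels = {
    w: ("synonym" if s > 0 else "antonym")
    for w, s in scores.items()
    if w not in SEEDS
}
print(labels)
```

In this toy setup the unlabeled words inherit the polarity of the seed they sit closest to. With real embeddings, antonyms often lie close to each other, so the choice of seeds and graph construction would matter far more than this sketch suggests.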