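Subjectivity
Computer science
Lexicon
Natural language processing
Argumentation theory
Linguistics
Sentiment analysis
Corpus linguistics
Artificial intelligence
Corpus
Epistemology
Philosophy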
Authors
Katharina Ehret, Maite Taboada
Identifier
DOI:10.1177/1461445620966923
Abstract
This paper brings together cutting-edge, quantitative corpus methodologies and discourse analysis to explore the relationship between text complexity and subjectivity as descriptive features of opinionated language. We are specifically interested in how text complexity and markers of subjectivity and argumentation interact in opinionated discourse. Our contributions include the marriage of quantitative approaches to text complexity with corpus linguistic methods for the study of subjectivity, in addition to large-scale analyses of evaluative discourse. As our corpus, we use the Simon Fraser Opinion and Comments Corpus (SOCC), which comprises approximately 10,000 opinion articles and the corresponding reader comments from the Canadian online newspaper The Globe and Mail, as well as a parallel corpus of hard news articles also sampled from The Globe and Mail. Methodologically, we combine conditional inference trees with the analysis of random forests, an ensemble learning technique, to investigate the interplay between text complexity and subjectivity. Text complexity is defined in terms of Kolmogorov complexity, that is, the complexity of a text is measured based on its description length. In this approach, texts which can be described more efficiently are considered to be linguistically less complex. Thus, Kolmogorov complexity is a measure of structural surface redundancy. Our take on subjectivity is inspired by research in evaluative language, stance and Appraisal and defined as the expression of evaluation and opinion in language. Drawing on a sentiment analysis lexicon and the literature on stance markers, a custom set of subjectivity and argumentation markers is created. The results show that complexity can be a powerful tool in the classification of text into different text types, and that stance adverbials serve as distinctive features of subjectivity in online news comments.
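The description-length idea in the abstract can be illustrated with a small sketch. Kolmogorov complexity itself is not computable, so a common practical stand-in is the size of a text after lossless compression: the more surface redundancy a text has, the smaller it compresses. The abstract does not say which compression program the authors used, so the use of Python's `zlib` here is an assumption, and the function name `complexity_ratio` and both sample texts are hypothetical.

```python
import zlib


def complexity_ratio(text: str) -> float:
    """Compressed size divided by raw size, in bytes.

    A lower ratio means the text can be described more efficiently,
    i.e. it has more structural surface redundancy. This is only an
    approximation of Kolmogorov complexity (an assumption of this
    sketch), not the measure defined in the paper.
    """
    raw = text.encode("utf-8")
    if not raw:
        raise ValueError("text must be non-empty")
    # Level 9 asks zlib for its strongest compression setting.
    return len(zlib.compress(raw, 9)) / len(raw)


# Hypothetical sample texts: one highly repetitive, one varied.
redundant = "the cat sat on the mat. " * 40
varied = (
    "Readers disagreed sharply about the proposed pipeline, citing "
    "jobs, emissions targets, treaty obligations and provincial "
    "budgets in roughly equal measure."
)

print(f"redundant: {complexity_ratio(redundant):.3f}")
print(f"varied:    {complexity_ratio(varied):.3f}")
```

Under this approximation the repetitive text receives the lower ratio, which matches the abstract's statement that texts which can be described more efficiently count as linguistically less complex. Ratios for very short texts are dominated by compressor overhead, so comparisons are more meaningful between texts of similar length.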