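Latent Dirichlet allocation
Computer science
Tourism
Ontology
Topic model
Information retrieval
Annotation
Recommender system
Field (mathematics)
Process (computing)
Artificial intelligence
Data science
Natural language processing
Machine learning
Law
Pure mathematics
Philosophy
Operating system
Epistemology
Mathematics
Political science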
Authors
Valentinus Roby Hananto,Uwe Serdült,Victor V. Kryssanov
Identifier
DOI:10.1145/3456172.3456211
Abstract
Ontologies and knowledge models have gained recognition because of their extensive use in recommender systems. The lack of automatic approaches to ontology engineering, however, makes it difficult to meet the growing need for such knowledge models in the field of tourism. In this study, a system for building tourism knowledge models from online reviews is proposed. The main contribution of the study is the application of topic modeling to build a knowledge model that, in turn, enables an automated labeling process for training classifiers. Given a collection of unlabeled online tourism reviews, Latent Dirichlet Allocation (LDA) is applied to automatically label each document. Each topic discovered by LDA is labeled with one specific category that represents its semantic meaning, using an existing general ontology as a reference. These automatically labeled documents are used to train classifiers, and the results are compared with those obtained from manual annotation. Experiments on Indonesian tourism datasets showed that the automatic labeling approach using LDA achieves a precision of 70%. In classification tasks, this approach can achieve performance comparable to or even better than manual labeling. The results suggest that the developed system is capable of building a tourism knowledge model and providing training data of acceptable quality for the development of tourism recommender systems.
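The labeling step described in the abstract can be illustrated with a minimal Python sketch. It assumes LDA has already produced a topic distribution for each review, and that each topic has already been mapped to one category. The function names, the category names ("Nature", "Culinary", "Culture"), and the toy numbers are hypothetical and are not taken from the paper. The abstract does not say how the precision score was computed; the sketch simply takes the fraction of automatic labels that agree with the manual annotation, which is an assumption.

```python
from typing import Dict, List, Sequence


def label_documents(doc_topic: Sequence[Sequence[float]],
                    topic_to_category: Dict[int, str]) -> List[str]:
    """Give each document the category of its highest-probability topic."""
    labels = []
    for weights in doc_topic:
        # Index of the dominant topic for this document.
        dominant = max(range(len(weights)), key=lambda k: weights[k])
        labels.append(topic_to_category[dominant])
    return labels


def agreement_with_manual(auto: Sequence[str], manual: Sequence[str]) -> float:
    """Fraction of automatic labels that match the manual annotation.

    Assumption: this stands in for the precision score in the abstract,
    whose exact definition is not stated there.
    """
    if not auto or len(auto) != len(manual):
        raise ValueError("label lists must be non-empty and of equal length")
    matches = sum(a == m for a, m in zip(auto, manual))
    return matches / len(auto)


# Hypothetical per-document topic distributions, as LDA might output them.
doc_topic = [
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.2, 0.3, 0.5],
    [0.6, 0.3, 0.1],
]

# Hypothetical topic-to-category mapping, standing in for the
# ontology-based topic labeling mentioned in the abstract.
topic_to_category = {0: "Nature", 1: "Culinary", 2: "Culture"}

# Hypothetical manual annotation of the same four reviews.
manual_labels = ["Nature", "Culinary", "Culture", "Culinary"]

auto_labels = label_documents(doc_topic, topic_to_category)
score = agreement_with_manual(auto_labels, manual_labels)
print(auto_labels, score)
```

In this toy setup the automatically assigned labels would then serve as training targets for a classifier, in place of the manual ones.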