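Word2vec
Support vector machine
Computer science
Feature extraction
Naive Bayes classifier
Social media
Artificial intelligence
Classifier (UML)
Machine learning
Pattern recognition (psychology)
Feature (linguistics)
Recall
Psychology
World Wide Web
Linguistics
Philosophy
Embedding
Cognitive psychology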
Authors
Cangnai Fang,Gracia Dianatobing,Talia Atara,Ivan Sebastian Edbert,Derwin Suhartono
Identifier
DOI:10.1109/icicos56336.2022.9930596
Abstract
The high number of people with depression, and its potentially fatal consequences, make it urgent to detect depression as early as possible. Social media, as a platform for self-expression, can help with this task: by properly extracting features from user-created content, we can detect users who are depressed. This paper compares four feature extraction methods to find the best one. TF-IDF, LIWC, Word2Vec, and weighted Word2Vec are each paired with a Naïve Bayes (NB) or linear Support Vector Machine (SVM) classifier and evaluated on the Reddit Mental Health dataset. Word2Vec paired with the SVM classifier proved to be the best combination, with 95.68% accuracy, 92.58% precision, 93.10% recall, and a 92.84% F1-score. Weighted Word2Vec, however, failed to improve on plain averaged Word2Vec, obtaining only 95.15% accuracy. Another finding is that SVM outperformed NB for classification, although it takes significantly longer to train. This experiment shows that choosing a suitable feature extraction method benefits the model's performance.
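The difference between averaged and weighted Word2Vec document features can be illustrated with a minimal sketch. The abstract does not say which weighting scheme the authors used, so the smoothed IDF weights below are an assumption, and the two-dimensional `EMBEDDINGS` table is a hypothetical stand-in for trained Word2Vec vectors. All function names are illustrative and not taken from the paper.

```python
import math
from collections import Counter

# Hypothetical toy embeddings standing in for trained Word2Vec vectors.
EMBEDDINGS = {
    "sad": [1.0, 0.0],
    "tired": [0.0, 1.0],
    "the": [0.5, 0.5],
}


def average_vector(tokens, embeddings):
    """Plain mean of the vectors of all in-vocabulary tokens."""
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        return None
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def idf_weights(corpus):
    """Smoothed inverse document frequency per token (assumed weighting)."""
    n_docs = len(corpus)
    doc_freq = Counter(t for doc in corpus for t in set(doc))
    return {
        t: math.log((1 + n_docs) / (1 + df)) + 1.0
        for t, df in doc_freq.items()
    }


def weighted_vector(tokens, embeddings, weights):
    """Weighted mean: each token vector is scaled by its token weight."""
    pairs = [
        (embeddings[t], weights.get(t, 1.0))
        for t in tokens
        if t in embeddings
    ]
    total = sum(w for _, w in pairs)
    if not pairs or total == 0:
        return None
    dim = len(pairs[0][0])
    return [sum(v[i] * w for v, w in pairs) / total for i in range(dim)]


# Three tiny tokenized "posts"; "the" appears in every one of them.
corpus = [["the", "sad"], ["the", "tired"], ["the", "sad", "tired"]]
weights = idf_weights(corpus)

plain = average_vector(corpus[0], EMBEDDINGS)
weighted = weighted_vector(corpus[0], EMBEDDINGS, weights)
```

In this toy corpus the rarer token "sad" receives a larger weight than "the", so the weighted vector leans further toward the "sad" direction than the plain average does. In a pipeline like the one the abstract describes, either document vector would then be passed to a classifier such as a linear SVM or Naïve Bayes.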