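Random forest
Computer science
Weighting
Feature selection
Machine learning
Artificial intelligence
Classifier (UML)
Feature (linguistics)
Decision tree
Data mining
Pattern recognition (psychology)
Linguistics
Medicine
Radiology
Philosophy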
Authors
Hatoon S. AlSagri,Mourad Ykhlef
Source
Journal: International Journal of Advanced Computer Science and Applications
[The Science and Information Organization]
Date: 2020-01-01
Volume (issue): 11 (5)
Citations: 11
Identifier
DOI:10.14569/ijacsa.2020.0110577
Abstract
Feature selection based on importance is a fundamental step in machine learning because it directs the use of variables toward what is most efficient and effective for a given model. In this study, an explainable machine learning model based on random forest is built to identify the depression level of Twitter users. The model achieves transparency by calculating its feature importance. There are several techniques for quantifying the importance of features. In this study, however, random forest is used both as a classifier, which outperforms many classifiers such as decision trees in several respects, and as a method for weighting the input features according to their importance. The importance of features is measured using different techniques, including random forest, and the results of these techniques are compared. Furthermore, feature importance is used to weight the input variables within a complete system that recommends solutions for people with depression. The experimental results confirm the superiority of random forest over other classifiers under three different methods for measuring feature importance. The accuracy of random forest classification reached 84.7%, and using feature importance increased the classifier accuracy to 84.9%.
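The idea of using a forest both to rank features and to weight them can be sketched in a few lines. The example below is a toy illustration only, not the authors' implementation: it assumes synthetic data, builds a small bagged ensemble of depth-1 trees (stumps) rather than full decision trees, and scores each feature by its normalized mean decrease in Gini impurity. All function names (`gini`, `best_stump`, `forest_importances`) and parameter values are hypothetical.

```python
import random

def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)

def best_stump(X, y, features):
    """Return (feature, threshold, gain) for the split with the largest impurity decrease."""
    n = len(y)
    parent = gini(y)
    best = (None, None, 0.0)
    for f in features:
        for t in sorted(set(row[f] for row in X)):
            left = [y[i] for i in range(n) if X[i][f] <= t]
            right = [y[i] for i in range(n) if X[i][f] > t]
            if not left or not right:
                continue
            child = (len(left) * gini(left) + len(right) * gini(right)) / n
            if parent - child > best[2]:
                best = (f, t, parent - child)
    return best

def forest_importances(X, y, n_trees=30, max_features=2, seed=0):
    """Toy forest of stumps: sum each feature's impurity decrease, then normalize."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    totals = [0.0] * d
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]   # bootstrap sample of rows
        Xb = [X[i] for i in idx]
        yb = [y[i] for i in idx]
        feats = rng.sample(range(d), max_features)   # random feature subset
        f, _, gain = best_stump(Xb, yb, feats)
        if f is not None:
            totals[f] += gain
    s = sum(totals)
    return [v / s for v in totals] if s > 0 else totals

# Synthetic data (an assumption for illustration): only feature 0 determines the label.
data_rng = random.Random(42)
X = [[data_rng.random(), data_rng.random(), data_rng.random()] for _ in range(120)]
y = [1 if row[0] > 0.5 else 0 for row in X]

importances = forest_importances(X, y)

# Weight each input feature by its importance before passing it to a downstream model.
weighted_X = [[v * w for v, w in zip(row, importances)] for row in X]
print(importances)
```

In this sketch the informative feature should receive most of the importance mass, and the weighted matrix scales the noise features down accordingly. A production setup would more likely rely on an established random forest library and real, labeled data.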