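Computer science
Artificial intelligence
Machine learning
Random forest
Naive Bayes classifier
Deep learning
Ensemble learning
Decision tree
Big data
Stochastic gradient descent
Preprocessor
Support vector machine
Data mining
Artificial neural network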
Authors
Sajib Kumar Das, Muhammad Anwarul Azim, Abu Nowshed Chy, Mohammad Khairul Islam, Niladree Datta
Identifier
DOI:10.1109/iceeict62016.2024.10534583
Abstract
Cybertrolling is the act of inciting and attacking someone's emotions on a social networking platform, which occurs all over the world, including Bangladesh. Many big data applications are interested in identifying trolls from tweets, which is a challenging task. It is equally crucial to ensure the safety of social networking sites against cybertrolling. Only automated identification can prevent trolling since human moderation is slow, costly, and even impractical for rapidly expanding data. Most of the previous state-of-the-art work done to overcome this problem was based on machine learning, deep learning and transformer-based models, where the authors' work did not focus much on appropriate text preprocessing techniques, which led to subpar method performance. In this paper, we investigated the performance of statistical machine learning and deep learning algorithms with extensive preprocessing techniques and statistical features to bridge the gap of earlier research work on the publicly available dataset titled 'Tweets dataset for Detection of Cyber-Trolls' to distinguish between troll tweets and non-troll tweets. For machine learning, we used random forest, decision tree, stochastic gradient descent, multinomial naive Bayes, linear SVC, and logistic regression algorithms, as well as LSTM and CNN for deep learning. Then, an ensemble classification was also implemented by combining the best three classifiers based on majority voting. The comparative analysis demonstrated that multinomial naive Bayes reached an F1-score of 95%, which gives better results compared to other models because of an ensemble of preprocessing techniques with statistical features.
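The majority-voting step mentioned in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name majority_vote and the sample prediction lists are hypothetical, and labels are assumed to be 1 for a troll tweet and 0 for a non-troll tweet.

```python
from collections import Counter
from typing import List, Sequence


def majority_vote(predictions: Sequence[Sequence[int]]) -> List[int]:
    """Combine per-classifier label lists by hard majority voting.

    `predictions` holds one list of labels per classifier, all aligned
    to the same tweets. Hypothetical helper, not from the paper.
    """
    if not predictions:
        raise ValueError("need predictions from at least one classifier")
    n_samples = len(predictions[0])
    if any(len(p) != n_samples for p in predictions):
        raise ValueError("all classifiers must label the same tweets")

    combined = []
    # zip(*predictions) yields, for each tweet, the votes of every classifier.
    for votes in zip(*predictions):
        label, _count = Counter(votes).most_common(1)[0]
        combined.append(label)
    return combined


# Hypothetical outputs of three classifiers on four tweets
# (assumed encoding: 1 = troll, 0 = non-troll).
clf_a = [1, 0, 1, 0]
clf_b = [1, 1, 0, 0]
clf_c = [0, 1, 1, 0]

ensemble = majority_vote([clf_a, clf_b, clf_c])
print(ensemble)  # [1, 1, 1, 0]
```

With three classifiers and two classes, a tie cannot occur, so each tweet always receives the label chosen by at least two of the three models.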