Abstract
The Asian social networking market has the highest consumer penetration rate in the world. Businesses and investors look for strategies that attract consumers and increase revenue from sales, advertisements, and other services offered on social media platforms. Social media engagement and online relational cohesion have often been defined within the frameworks of social psychology, and personality identification is one way in which social psychology can inform, engage with, and learn from social media. Personality profiling has many real-world applications, including preference-based recommendation systems, relationship building, and career counseling. This research puts forward KBSVE-P, a novel kernel-based soft-voting ensemble model for personality detection from natural language. The model is built by first evaluating the performance of several Support Vector Machine (SVM) kernels, namely radial basis function (RBF), linear, sigmoid, and polynomial, to find the kernel best suited to automatic personality detection in natural language text. Next, an ensemble of SVM kernels is implemented with several voting techniques: soft voting, hard voting, and weighted hard voting. The model is evaluated on the publicly available Kaggle_MBTI dataset and on a novel low-resource Hindi-language (South Asian, Indian) dataset, _MBTI (pronounced vishesh charitr, meaning personality in Hindi). On both datasets it detects a user's personality across four trait pairs: introvert/extrovert (IE), thinking/feeling (TF), sensing/intuitive (SI), and judging/perceiving (JP). The proposed kernel-based ensemble with soft voting, KBSVE-P, outperforms existing models on the English Kaggle_MBTI dataset with an average F-score of 85.677 and achieves an accuracy of 66.89 on the Hindi _MBTI dataset.
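The three voting techniques named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the example probabilities, and the weights are all hypothetical, and it assumes each kernel's SVM already outputs class probabilities for one binary trait pair (for example, class 0 = introvert, class 1 = extrovert).

```python
from collections import Counter


def soft_vote(probas):
    """Average the class probabilities across classifiers and pick the argmax."""
    n_classes = len(probas[0])
    avg = [sum(p[c] for p in probas) / len(probas) for c in range(n_classes)]
    return max(range(n_classes), key=lambda c: avg[c])


def hard_vote(probas, weights=None):
    """Each classifier votes for its own argmax; optional weights scale the votes."""
    weights = weights or [1.0] * len(probas)
    tally = Counter()
    for p, w in zip(probas, weights):
        tally[max(range(len(p)), key=lambda c: p[c])] += w
    # Ties resolve to the lowest class index.
    return max(sorted(tally), key=lambda c: tally[c])


# Hypothetical probabilities [P(class 0), P(class 1)] for one text,
# one row per kernel: RBF, linear, sigmoid, polynomial.
example = [
    [0.90, 0.10],
    [0.45, 0.55],
    [0.45, 0.55],
    [0.40, 0.60],
]

soft_label = soft_vote(example)                     # uses confidence of each kernel
hard_label = hard_vote(example)                     # one vote per kernel
weighted_label = hard_vote(example, [4, 1, 1, 1])   # hypothetical weights
```

In this made-up case three kernels weakly favor class 1 while one strongly favors class 0, so plain hard voting and soft voting disagree. That difference is the reason an ensemble might prefer soft voting when the individual kernels produce well-calibrated probabilities.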