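Computer science
Sliding window protocol
Encoder
Transformer
Artificial intelligence
Stock (firearms)
Language model
Machine learning
Window (computing)
World Wide Web
Mechanical engineering
Physics
Quantum mechanics
Voltage
Engineering
Operating system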
Authors
Feng Zhao, Xinning Li, Yating Gao, Ying Li, Zhiquan Feng, Caiming Zhang
Identifier
DOI:10.1016/j.eswa.2022.117958
Abstract
Stock comments published by experts are important references for accurate stock trend prediction. Comprehensively and accurately capturing the topic of expert stock comments is an important text classification problem. The Bidirectional Encoder Representations from Transformers (BERT) pretrained language model is widely used for text classification because of its high identification accuracy. However, BERT has some limitations. First, it only accepts fixed-length text, leading to suboptimal performance on long texts. Second, it relies only on the features extracted from the last layer, resulting in incomplete classification features. To tackle these issues, we propose a multi-layer features ablation study of the BERT model for accurate identification of stock comments’ themes. Specifically, we first divide the original text with a sliding window so that each segment meets the length requirement of the BERT model. This enlarges the sample size, which helps reduce over-fitting. At the same time, because the long text is divided into multiple short texts, all of its information can be captured by synthesizing the topic information of the short texts. In addition, we extract the output features of each layer in the BERT model and apply an ablation strategy to extract more effective information from these features. Experimental results demonstrate that, compared with unsegmented comments, topic recognition accuracy improves when stock comments are segmented with the sliding window. This indicates that segmenting text can improve text classification performance. Compared with BERT, the multi-layer features ablation study presented in this paper further improves topic recognition of stock comments and can provide a reference for investors.
Our approach offers better performance and practicality for stock trend prediction through topic recognition of stock comments.
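
The sliding-window step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the window length, the stride, or the rule for combining per-segment predictions, so everything below is an assumption made for illustration: the function names `sliding_windows` and `average_scores` are hypothetical, the window and stride are free parameters, and averaging class scores is only one plausible way to synthesize the topic information of the short texts. It is not the authors' implementation.

```python
from typing import List, Sequence


def sliding_windows(tokens: Sequence[str], window: int, stride: int) -> List[List[str]]:
    """Split a token sequence into overlapping segments of at most `window` tokens.

    `window` and `stride` are illustrative parameters, not values from the paper.
    """
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    if stride > window:
        raise ValueError("stride larger than window would skip tokens")
    if len(tokens) <= window:
        return [list(tokens)]
    segments = []
    start = 0
    while True:
        segments.append(list(tokens[start:start + window]))
        if start + window >= len(tokens):
            break  # the last segment already reaches the end of the text
        start += stride
    return segments


def average_scores(per_segment_scores: Sequence[Sequence[float]]) -> List[float]:
    """Combine per-segment class scores by averaging (an assumed aggregation rule)."""
    n = len(per_segment_scores)
    return [sum(column) / n for column in zip(*per_segment_scores)]


if __name__ == "__main__":
    tokens = list("abcdefghij")  # stand-in for a tokenized long stock comment
    for segment in sliding_windows(tokens, window=4, stride=3):
        print(segment)
```

Each segment would then be classified separately, and the per-segment scores combined into one prediction for the full comment. Because every long comment yields several segments, the number of training samples also grows, which matches the abstract's remark about enlarging the sample size.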