Multi-layer features ablation of BERT model and its application in stock trend prediction

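Computer science, Sliding window protocol, Encoder, Transformer, Artificial intelligence, Stock (firearms), Language model, Machine learning, Window (computing), World Wide Web, Mechanical engineering, Physics, Quantum mechanics, Voltage, Engineering, Operating system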
Authors
Feng Zhao, Xinning Li, Yating Gao, Ying Li, Zhiquan Feng, Caiming Zhang
Source
Journal: Expert Systems With Applications [Elsevier BV]
Volume/issue: 207: 117958-117958
Identifier
DOI:10.1016/j.eswa.2022.117958
Abstract

Stock comments published by experts are important references for accurate stock trend prediction. Comprehensively and accurately capturing the topic of expert stock comments is an important text classification problem. The Bidirectional Encoder Representations from Transformers (BERT) pretrained language model is widely used for text classification because of its high identification accuracy. However, BERT has some limitations. First, it only accepts fixed-length text, which leads to suboptimal performance when exploring information in long texts. Second, it relies only on the features extracted from the last layer, so the classification features are not comprehensive. To tackle these issues, we propose a multi-layer features ablation study of the BERT model for accurate identification of stock comments’ themes. Specifically, we first divide the original text with a sliding window so that each segment meets the length requirement of the BERT model. This enlarges the sample size, which helps reduce over-fitting. At the same time, because the long text is divided into multiple short texts, all of its information can be captured by synthesizing the subject information of those short texts. In addition, we extract the output features of each layer in the BERT model and apply the ablation strategy to extract more effective information from these features. Experimental results demonstrate that, compared with non-intercepted comments, intercepting stock comments with the sliding window improves topic recognition accuracy, which indicates that intercepting text can improve text classification performance. Compared with BERT, the multi-layer features ablation study presented in the paper further improves topic recognition of stock comments and can provide a reference for the majority of investors. Our study has better performance and practicability on stock trend prediction through stock comment topic recognition.
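
The two steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the window size, the stride, how the per-segment topics are combined, or the exact ablation procedure. The sketch assumes majority voting over segments and a leave-one-layer-out loop, and every name in it (`sliding_windows`, `majority_topic`, `concat_layers`, `leave_one_layer_out`, and the `score` callback) is hypothetical. Per-layer features are represented as plain lists so the example runs without a BERT model.

```python
from collections import Counter
from typing import Callable, Dict, List, Sequence, Tuple


def sliding_windows(tokens: Sequence, window_size: int, stride: int) -> List[list]:
    """Split tokens into overlapping windows of at most window_size items."""
    if window_size <= 0 or not 0 < stride <= window_size:
        raise ValueError("need window_size > 0 and 0 < stride <= window_size")
    windows = []
    start = 0
    while True:
        windows.append(list(tokens[start:start + window_size]))
        if start + window_size >= len(tokens):
            break  # this window already reaches the end of the text
        start += stride
    return windows


def majority_topic(window_labels: Sequence[str]) -> str:
    """Assumed aggregation rule: the most frequent per-window label wins."""
    if not window_labels:
        raise ValueError("no window labels to aggregate")
    return Counter(window_labels).most_common(1)[0][0]


def concat_layers(layer_features: Sequence[Sequence[float]],
                  keep: Sequence[int]) -> List[float]:
    """Concatenate the feature vectors of the selected layers only."""
    return [value for index in keep for value in layer_features[index]]


def leave_one_layer_out(num_layers: int,
                        score: Callable[[Tuple[int, ...]], float]) -> Dict[int, float]:
    """Assumed ablation loop: score the model with each layer dropped in turn.

    `score` is a caller-supplied function that evaluates a classifier built
    on the kept layers (for example, validation accuracy).
    """
    layers = tuple(range(num_layers))
    return {
        dropped: score(tuple(i for i in layers if i != dropped))
        for dropped in layers
    }
```

In this sketch each window would be classified separately, `majority_topic` would combine the per-window predictions for one comment, and `leave_one_layer_out` would show which layers' features matter most to the hypothetical `score` function.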
