关键词 (Keywords)
Transformer, Computer science, Quadratic equation, Visualization, Time series, Artificial intelligence, Sequence (biology), Data mining, Algorithm, Machine learning, Mathematics, Engineering, Voltage, Biology, Electrical engineering, Genetics, Geometry
作者 (Authors)
Wei Li, Xiangxu Meng, Chuhan Chen, Hailin Mi, Huiqiang Wang
标识 (Identifiers)
DOI: 10.1109/smc53992.2023.10394310
摘要 (Abstract)
Long Sequence Time-Series Forecasting (LSTF) is an important and challenging research problem with broad applications. Recent studies have shown that Transformer-based models can effectively capture correlations in time-series data, but they introduce quadratic time and memory complexity, which makes them unsuitable for LSTF problems. In response, we investigate the impact of the long-tail distribution of attention scores on prediction accuracy and propose a Bis-Attention mechanism that uses a mean measurement to bi-directionally sparsify the self-attention matrix, enhancing the differentiation of attention scores and reducing the complexity of Transformer-based models from $O(L^{2})$ to $O((\log L)^{2})$. Moreover, we reduce memory consumption and streamline the model architecture through a shared-QK method. The effectiveness of the proposed method is verified by theoretical analysis and visualization. Extensive experiments on three benchmarks demonstrate that our method outperforms other state-of-the-art methods, including an average reduction of 19.2% in MSE and 12% in MAE relative to Informer.
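To make the two ideas named in the abstract concrete, below is a minimal PyTorch sketch, not the authors' code: a self-attention layer with a shared Q/K projection and a mean-based, bi-directional sparsity mask over the score matrix. The class name, the exact thresholding rule (keep a score only if it exceeds both its row mean and its column mean), and the all-masked fallback are assumptions made for illustration; the paper's actual selection procedure is what yields the stated $O((\log L)^{2})$ complexity, which a dense mask computed over the full score matrix, as here, does not achieve by itself.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SharedQKSparseAttention(nn.Module):
    """Illustrative sketch (hypothetical, not the paper's implementation):
    self-attention with a shared Q/K projection and a mean-based
    bi-directional sparsity mask on the attention scores."""

    def __init__(self, d_model: int):
        super().__init__()
        self.d_model = d_model
        # Shared-QK: a single projection serves as both query and key,
        # halving the projection parameters and their activation memory.
        self.qk_proj = nn.Linear(d_model, d_model)
        self.v_proj = nn.Linear(d_model, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model)
        qk = self.qk_proj(x)           # queries and keys share weights
        v = self.v_proj(x)
        scores = qk @ qk.transpose(-2, -1) / self.d_model ** 0.5

        # Bi-directional mean-based sparsification (one plausible reading of
        # "Bis-Attention"): keep a score only if it exceeds both its row mean
        # (per query) and its column mean (per key), masking the long tail of
        # near-uniform, uninformative scores.
        row_mean = scores.mean(dim=-1, keepdim=True)   # mean over keys
        col_mean = scores.mean(dim=-2, keepdim=True)   # mean over queries
        keep = (scores > row_mean) & (scores > col_mean)
        scores = scores.masked_fill(~keep, float("-inf"))

        # A fully masked row would produce NaNs after softmax; fall back to
        # uniform attention for such rows (defensive choice, an assumption).
        all_masked = keep.sum(dim=-1, keepdim=True) == 0
        scores = scores.masked_fill(all_masked, 0.0)

        attn = F.softmax(scores, dim=-1)
        return attn @ v


# Usage on a toy long-sequence batch
x = torch.randn(2, 96, 64)                 # (batch, seq_len, d_model)
layer = SharedQKSparseAttention(d_model=64)
print(layer(x).shape)                      # torch.Size([2, 96, 64])
```

One design note: with shared QK, each position's score against itself ($q \cdot q$) tends to dominate its row, which is why shared-QK schemes such as Reformer's additionally down-weight or exclude the diagonal; whether this paper does the same is not stated in the abstract.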