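Computer science
Search engine indexing
Time series
SQL
Volatility (finance)
Data mining
Series (stratigraphy)
Finance
Database
Information retrieval
Machine learning
Biology
Economics
Paleontology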
Authors
Tom Bamford, Andrea Coletta, Elizabeth Fons, S. Gopalakrishnan, Svitlana Vyetrenko, Tucker Balch, Manuela Veloso
Identifier
DOI:10.1145/3604237.3626901
Abstract
Financial firms commonly process and store billions of time-series data points, generated continuously and at high frequency. To support efficient data storage and retrieval, specialized time-series databases and systems have emerged. These databases support indexing and querying of time series through a constrained, Structured Query Language (SQL)-like interface, which enables queries such as "Stocks with monthly price returns greater than 5%" but requires them to be expressed in rigid formats. However, such queries do not capture the intrinsic complexity of high-dimensional time-series data, which can often be better described by images or language (e.g., "A stock in low volatility regime"). Moreover, the storage, computational time, and retrieval complexity required to search in the time-series space are often non-trivial. In this paper, we propose and demonstrate a framework to store multi-modal data for financial time series in a lower-dimensional latent space using deep encoders, such that the latent-space projections capture not only the time-series trends but also other desirable information or properties of the financial time-series data (such as price volatility). Moreover, our approach allows user-friendly query interfaces, enabling natural-language text or sketches of time series, for which we have developed intuitive interfaces. We demonstrate the advantages of our method in terms of computational efficiency and accuracy on real historical data as well as synthetic data, and highlight the utility of latent-space projections in the storage and retrieval of financial time-series data with intuitive query modalities.
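The general idea of storing series in a lower-dimensional latent space and retrieving them by nearest-neighbor search can be sketched as follows. This is an illustration only, not the authors' method: a linear PCA projection stands in for the deep encoders, the data are synthetic random walks, and the names fit_linear_encoder, encode, and retrieve, as well as the latent dimension of 8, are hypothetical choices made for this sketch.

```python
import numpy as np

def fit_linear_encoder(series, latent_dim):
    """Fit a PCA projection. ASSUMPTION: a linear stand-in for a deep encoder."""
    mean = series.mean(axis=0)
    # Rows of vt are principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(series - mean, full_matrices=False)
    return mean, vt[:latent_dim]

def encode(series, mean, components):
    """Project series of shape (n, length) into the latent space (n, latent_dim)."""
    return (series - mean) @ components.T

def retrieve(query, index, mean, components, k=3):
    """Return the row indices of the k stored series closest to the query."""
    z = encode(query[None, :], mean, components)
    dists = np.linalg.norm(index - z, axis=1)  # Euclidean distance in latent space
    return np.argsort(dists)[:k]

rng = np.random.default_rng(0)

# Synthetic data (an assumption): 500 random-walk "price" paths of length 64.
database = rng.normal(size=(500, 64)).cumsum(axis=1)

mean, components = fit_linear_encoder(database, latent_dim=8)
index = encode(database, mean, components)  # stored representation, 8 values per series

# A noisy copy of series 42 plays the role of a user-drawn sketch.
query = database[42] + rng.normal(scale=0.05, size=64)
top = retrieve(query, index, mean, components)
print(top)
```

In this toy setup each series is stored as 8 numbers instead of 64, and the search compares the query against those short vectors only. A learned encoder could additionally be trained so that the latent coordinates reflect properties such as volatility, which a plain PCA projection does not attempt.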