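Timestamp
Computer science
Pruning
Interval (graph theory)
Series (stratigraphy)
Time series
Real-time computing
Data quality
Data mining
Algorithm
Mathematics
Engineering
Operations management
Combinatorics
Machine learning
Agronomy
Biology
Paleontology
Metric (unit)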
Authors
Chenguang Fang,Shaoxu Song,Yinan Mei
Source
Journal: Proceedings of the VLDB Endowment
[VLDB Endowment]
Date: 2022-05-01
Volume (issue): 15 (9): 1848-1860
Citations: 9
Identifier
DOI:10.14778/3538598.3538607
Abstract
Time series data often have regular time intervals: for example, IoT sensor data collected at a pre-specified frequency, air quality data recorded regularly by outdoor monitors, and GPS signals received periodically from multiple satellites. However, issues such as transmission latency, device failure, and repeated requests can make timestamps dirty and lead to irregular time intervals. Amending the irregular intervals not only improves data quality but also leads to more accurate applications, such as frequency-domain analysis, and more effective compression in storage. The timestamp repairing problem is challenging, however, because many interacting factors must be determined: the time interval, the start timestamp, the series length, and the matching between the time series before and after repairing. Our contributions in this paper are (1) formalizing the timestamp repairing problem for regular-interval time series so as to minimize the cost w.r.t. move, insert, and delete operations; (2) devising an exact approach with advanced pruning strategies based on lower bounds of repairing; and (3) proposing an approximation based on bi-directional dynamic programming. The experimental results demonstrate the superiority of our proposal in both timestamp repair accuracy and the aforesaid applications. Remarkably, the repair results can also be used to evaluate time series data quality measures. Both the repair and measure functions have been implemented in an open-source time series database, Apache IoTDB.
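To make the move, insert, and delete operations concrete, here is a minimal sketch of how a repair cost could be computed once the interval, start timestamp, and series length are fixed. It is an illustration only, not the paper's exact or approximate algorithm: the function name `repair_cost`, the absolute-difference move cost, and the constant `insert_cost` and `delete_cost` parameters are all assumptions, since the abstract does not define the cost model.

```python
from typing import Sequence


def repair_cost(observed: Sequence[float], start: float, interval: float,
                length: int, insert_cost: float, delete_cost: float) -> float:
    """Minimum cost of aligning observed timestamps to a regular grid.

    Hypothetical cost model (an assumption, not taken from the paper):
      move   - shift an observed timestamp onto a grid point, cost |t - target|
      insert - create a grid point that has no observation, cost insert_cost
      delete - drop an observed timestamp, cost delete_cost
    The alignment keeps the original order of the timestamps.
    """
    # Regular grid: start, start + interval, ..., start + (length - 1) * interval
    targets = [start + j * interval for j in range(length)]
    m, n = len(observed), length

    # dp[i][j] = cheapest way to align the first i observations
    # with the first j grid points (edit-distance style recurrence).
    dp = [[0.0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        dp[i][0] = i * delete_cost      # no grid points left: delete all
    for j in range(1, n + 1):
        dp[0][j] = j * insert_cost      # no observations left: insert all

    for i in range(1, m + 1):
        for j in range(1, n + 1):
            move = dp[i - 1][j - 1] + abs(observed[i - 1] - targets[j - 1])
            delete = dp[i - 1][j] + delete_cost
            insert = dp[i][j - 1] + insert_cost
            dp[i][j] = min(move, delete, insert)
    return dp[m][n]


# A delayed point (21 instead of 20) and a repeated point (second 30).
print(repair_cost([0, 10, 21, 30, 30, 50], start=0, interval=10,
                  length=6, insert_cost=5, delete_cost=5))
```

The sketch takes the interval, start, and length as given. As the abstract notes, the full problem must also determine those three factors, which is where the search and pruning effort lies.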