异常检测
云计算
异常(物理)
计算机科学
统计学习
数据科学
人工智能
操作系统
物理
凝聚态物理
作者
Jordan Hochenbaum,Owen Vallis,Arun Kejariwal
出处
期刊:Cornell University - arXiv
日期:2017-01-01
被引量:76
标识
DOI:10.48550/arxiv.1704.07706
摘要
Performance and high availability have become increasingly important drivers, amongst other drivers, for user retention in the context of web services such as social networks, and web search. Exogenic and/or endogenic factors often give rise to anomalies, making it very challenging to maintain high availability, while also delivering high performance. Given that service-oriented architectures (SOA) typically have a large number of services, with each service having a large set of metrics, automatic detection of anomalies is non-trivial. Although there exists a large body of prior research in anomaly detection, existing techniques are not applicable in the context of social network data, owing to the inherent seasonal and trend components in the time series data. To this end, we developed two novel statistical techniques for automatically detecting anomalies in cloud infrastructure data. Specifically, the techniques employ statistical learning to detect anomalies in both application, and system metrics. Seasonal decomposition is employed to filter the trend and seasonal components of the time series, followed by the use of robust statistical metrics -- median and median absolute deviation (MAD) -- to accurately detect anomalies, even in the presence of seasonal spikes. We demonstrate the efficacy of the proposed techniques from three different perspectives, viz., capacity planning, user behavior, and supervised learning. In particular, we used production data for evaluation, and we report Precision, Recall, and F-measure in each case.
科研通智能强力驱动
Strongly Powered by AbleSci AI