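Computer science
Data stream mining
Anomaly detection
Overhead (engineering)
Outlier
Data stream
Feature (linguistics)
Curse of dimensionality
Feature vector
Streaming algorithm
Data mining
Noise (video)
Key (lock)
Algorithm
Artificial intelligence
Mathematics
Image (mathematics)
Upper and lower bounds
Telecommunications
Operating system
Linguistics
Mathematical analysis
Philosophy
Computer security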
Authors
Emaad Manzoor, Hemank Lamba, Leman Akoglu
Identifier
DOI:10.1145/3219819.3220107
Abstract
This work addresses the outlier detection problem for feature-evolving streams, which has not been studied before. In this setting, both (1) data points may evolve, with feature values changing, and (2) the feature space may evolve, with new features emerging over time. This is notably different from row-streams, where points with fixed features arrive one at a time. We propose a density-based ensemble outlier detector, called xStream, for this more extreme streaming setting, which has the following key properties: (1) it is a constant-space and constant-time (per incoming update) algorithm, (2) it measures outlierness at multiple scales or granularities, and it can handle (3i) high dimensionality through distance-preserving projections and (3ii) non-stationarity via O(1)-time model updates as the stream progresses. In addition, xStream can address the outlier detection problem for the (less general) disk-resident static and row-streaming settings. We evaluate xStream rigorously on numerous real-life datasets in all three settings: static, row-stream, and feature-evolving stream. Experiments under static and row-streaming scenarios show that xStream is competitive with state-of-the-art detectors and particularly effective in high dimensions with noise. We also demonstrate that our solution is fast and accurate with modest space overhead for evolving streams, on which there exists no competition.
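The abstract does not spell out how the distance-preserving projections cope with a feature space that keeps growing. One way to picture it is the minimal sketch below, which assumes a hash-based sparse random projection: each projection coefficient is derived from a hash of the feature name, so no projection matrix is stored and a previously unseen feature needs no setup. The names `hash_coeff`, `StreamProjector`, and the choice `K = 8` are hypothetical and are not taken from the paper. Keeping one sketch per point id is a simplification for illustration and is not constant-space on its own.

```python
import hashlib
import math
from collections import defaultdict

K = 8  # number of projected dimensions (hypothetical choice)


def hash_coeff(feature: str, k: int) -> float:
    """Derive a sparse random-projection coefficient from a hash.

    Returns +sqrt(3) or -sqrt(3) with probability 1/6 each and 0 with
    probability 2/3 (a standard sparse random-projection distribution).
    Because the value depends only on (feature, k), it never has to be stored.
    """
    digest = hashlib.md5(f"{k}:{feature}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 6
    if bucket == 0:
        return math.sqrt(3.0)
    if bucket == 1:
        return -math.sqrt(3.0)
    return 0.0


class StreamProjector:
    """Maintains a K-dimensional sketch per point under (id, feature, delta) updates."""

    def __init__(self, k: int = K):
        self.k = k
        # Assumption: an unbounded dict for simplicity; a bounded cache
        # would be needed to keep memory constant.
        self.sketches = defaultdict(lambda: [0.0] * k)

    def update(self, point_id: str, feature: str, delta: float) -> list:
        """Apply a change of `delta` to one feature of one point in O(K) time."""
        sketch = self.sketches[point_id]
        scale = 1.0 / math.sqrt(self.k)
        for j in range(self.k):
            sketch[j] += hash_coeff(feature, j) * delta * scale
        return sketch


projector = StreamProjector()
projector.update("user42", "clicks", 1.0)      # existing feature changes value
projector.update("user42", "new_sensor", 0.5)  # a feature never seen before
```

Because the projection is linear, applying updates one at a time yields the same sketch as projecting the final feature vector in one step, which is what makes per-update maintenance possible under these assumptions.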