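Anomaly detection
Computer science
Artificial intelligence
Data mining
Deep learning
Embedding
Dataset
Point cloud
Pattern recognition (psychology)
Hyperparameter
Machine learning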
Authors
Michaela Mašková, Matěj Zorek, Tomáš Pevný, Václav Šmídl
Identifier
DOI:10.1016/j.patcog.2024.110381
Abstract
Detecting anomalous samples in set data is a problem attracting increased interest due to novel modalities, such as point-cloud data produced by lidars. Novel methods, including those based on deep neural networks, are often tuned for a single purpose, which makes it hard to judge how relevant they are to other purposes or application domains. The aim of this survey is to: (i) review elementary concepts of anomaly detection on set data, (ii) identify the building blocks of deep anomaly detectors, and (iii) analyze the impact of these blocks on performance. The impact is studied in a large experimental comparison on a variety of benchmark datasets. The results reveal that the main factor determining performance is the type of anomalies in the dataset. While deep methods that embed the whole set into a single fixed vector perform well on point-cloud data, methods that embed each feature vector independently are better for datasets from multi-instance learning. Moreover, sophisticated methods utilizing transformer blocks are frequently inferior to simple models with properly optimized hyperparameters. An independent factor in performance is the cardinality of sets, the proper treatment of which remains an open problem, as the existing analytical solution was found to be inaccurate.
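The contrast the abstract draws between embedding a whole set into one vector and embedding each feature vector independently can be illustrated with a toy sketch. This is not the survey's method or any of the detectors it compares: mean pooling, max aggregation, the Euclidean distance to a fixed `reference` point, and all function names here are assumptions chosen only to make the two strategies concrete.

```python
import math
from statistics import fmean

def mean_pool(points):
    """Embed a whole set as one fixed-length vector (per-dimension mean).
    Mean pooling is an illustrative assumption, not the survey's choice."""
    dims = len(points[0])
    return [fmean(p[d] for p in points) for d in range(dims)]

def set_level_score(points, reference):
    """Whole-set strategy: pool first, then score the single embedding."""
    return math.dist(mean_pool(points), reference)

def instance_level_score(points, reference):
    """Per-instance strategy: score each vector, then aggregate.
    Taking the maximum is an assumed aggregation rule."""
    return max(math.dist(p, reference) for p in points)

# Hypothetical 2-D sets; `reference` stands in for a learned model of normality.
reference = (0.0, 0.0)
normal_set = [(0.1, -0.1), (-0.2, 0.0), (0.0, 0.2), (0.1, -0.1)]
# Multi-instance-style anomaly: one outlying instance among normal ones.
single_outlier_set = [(0.1, -0.1), (-0.2, 0.0), (0.0, 0.2), (4.0, 4.0)]

for name, s in [("normal", normal_set), ("single outlier", single_outlier_set)]:
    print(name, set_level_score(s, reference), instance_level_score(s, reference))
```

In this toy setting the single outlying instance is diluted by pooling, so the per-instance score separates the two sets more sharply than the whole-set score. An anomaly that shifts the shape of the entire set, as in point-cloud data, would favor the pooled embedding instead. Both scores also depend on how many elements the set contains, which is the cardinality issue the abstract flags as open.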