Given the long list of anomaly detection algorithms developed over the last few decades, how do they perform with regard to (i) varying levels of supervision, (ii) different types of anomalies, and (iii) noisy and corrupted data? In this work, we answer these key questions by conducting what is (to the best of our knowledge) the most comprehensive anomaly detection benchmark, named ADBench, with 30 algorithms on 57 benchmark datasets. Our extensive experiments (98,436 in total) yield meaningful insights into the role of supervision and anomaly types, and point to future directions for researchers in algorithm selection and design. With ADBench, researchers can easily conduct comprehensive and fair evaluations of newly proposed methods on the datasets (including our contributed ones from the natural language and computer vision domains) against the existing baselines. To foster accessibility and reproducibility, we fully open-source ADBench and the corresponding results.
Author Information
Songqiao Han (Shanghai University of Finance and Economics)
Xiyang Hu (Carnegie Mellon University)
Hailiang Huang (Shanghai University of Finance and Economics)
Minqi Jiang (SUFE AI Lab)
Yue Zhao (Carnegie Mellon University)
I am pursuing a Ph.D. in Information Systems at Carnegie Mellon University, advised by Prof. Leman Akoglu. Unlike most IS researchers, I focus on data mining algorithms, systems, and applications. Research keywords: outlier and anomaly detection; ensemble learning; scalable machine learning; machine learning systems.
More from the Same Authors
- 2021: Revisiting Time Series Outlier Detection: Definitions and Benchmarks
  Kwei-Herng Lai · Daochen Zha · Junjie Xu · Yue Zhao · Guanchu Wang · Xia Hu
- 2021: Therapeutics Data Commons: Machine Learning Datasets and Tasks for Drug Discovery and Development
  Kexin Huang · Tianfan Fu · Wenhao Gao · Yue Zhao · Yusuf Roohani · Jure Leskovec · Connor Coley · Cao Xiao · Jimeng Sun · Marinka Zitnik
- 2022 Poster: BOND: Benchmarking Unsupervised Outlier Node Detection on Static Attributed Graphs
  Kay Liu · Yingtong Dou · Yue Zhao · Xueying Ding · Xiyang Hu · Ruitong Zhang · Kaize Ding · Canyu Chen · Hao Peng · Kai Shu · Lichao Sun · Jundong Li · George H Chen · Zhihao Jia · Philip S Yu
- 2021 Poster: Automatic Unsupervised Outlier Model Selection
  Yue Zhao · Ryan Rossi · Leman Akoglu
- 2019 Poster: Optimal Sparse Decision Trees
  Xiyang Hu · Cynthia Rudin · Margo Seltzer
- 2019 Spotlight: Optimal Sparse Decision Trees
  Xiyang Hu · Cynthia Rudin · Margo Seltzer