Abstract
Dutch water authorities are responsible for, among other things, the management of water levels in waterways. To perform this task properly, they need high-quality data. We compare several univariate and multivariate methods for real-time outlier detection in time series data from water sensors of the Dutch water authority “Aa en Maas”. The methods' performance is assessed by measuring how well they detect simulated spike, jump and drift outliers. This approach allows us to determine the outlier parameter values (e.g. drift or jump magnitude) at which certain detection thresholds are reached. The experiments show that the outliers are best detected by multivariate (as opposed to univariate) models, and that a multi-layer perceptron quantile regression (QR-MLP) model is best able to capture these multivariate relations. In addition to simulated outliers, the QR-MLP model is able to detect real outliers as well. Moreover, specific rules for each outlier category are not needed. In sum, QR-MLP models are well-suited to detect outliers without supervision.
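
The synthetic evaluation described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names are hypothetical, and the outlier shapes are assumptions (a spike as a single-point offset, a jump as a constant offset over a window, a drift as a linearly growing offset). The pinball loss shown is the standard objective for quantile regression in general; whether and how the QR-MLP model uses it is not stated in the abstract.

```python
import numpy as np


def inject_spike(x, idx, magnitude):
    """Assumed spike: add an offset to a single sample."""
    y = np.asarray(x, dtype=float).copy()
    y[idx] += magnitude
    return y


def inject_jump(x, start, length, magnitude):
    """Assumed jump: add a constant offset over a window."""
    y = np.asarray(x, dtype=float).copy()
    end = min(start + length, len(y))
    y[start:end] += magnitude
    return y


def inject_drift(x, start, length, magnitude):
    """Assumed drift: add an offset that grows linearly from 0 to magnitude."""
    y = np.asarray(x, dtype=float).copy()
    end = min(start + length, len(y))
    y[start:end] += np.linspace(0.0, magnitude, end - start)
    return y


def pinball_loss(y_true, y_pred, q):
    """Standard quantile (pinball) loss for quantile level q in (0, 1)."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.mean(np.maximum(q * diff, (q - 1.0) * diff)))


def flag_outliers(y, lower, upper):
    """Flag samples that fall outside a predicted quantile interval."""
    y = np.asarray(y, dtype=float)
    return (y < lower) | (y > upper)
```

In such a setup, one would inject outliers of increasing magnitude into clean sensor data, compare the flags against the known injection positions, and record the magnitude at which a chosen detection rate is reached.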
Original language | English |
---|---|
Title of host publication | Advanced Analytics and Learning on Temporal Data |
Subtitle of host publication | AALTD 2020 |
Editors | V. Lemaire, S. Malinowski, A. Bagnall, T. Guyet, R. Tavenard, G. Ifrim |
Publisher | Springer |
Pages | 155-170 |
Number of pages | 16 |
Volume | 12588 |
ISBN (Electronic) | 978-3-030-65742-0 |
ISBN (Print) | 978-3-030-65741-3 |
Publication status | Published - 2020 |
Keywords
- Outlier detection
- Time series
- Quantile regression
- Synthetic evaluation
- Machine learning