PrefixCDD: Effective Online Concept Drift Detection over Event Streams using Prefix Trees

Jesús Huete, Hakim Qahtan, Marwan Hassani

Research output: Contribution to conference › Paper › Academic

Abstract

Process mining focuses on applying data mining techniques to business process data. Recently, with improvements in the sensing, collection, and storage of event data, demand has grown for both shorter mining times and adaptive models of streaming process events. This has increased interest in streaming process mining. Some techniques within this field attempt to identify drifts (change points) in evolving process data streams. Existing work on supervised and unsupervised-learning approaches over data streams has several limitations with regard to the nature of the drifts, the excessive storage required to store and process the stream, and the performance on real-world datasets. This paper contributes PrefixCDD, a novel and efficient unsupervised-learning approach for online concept drift detection (CDD) over event streams. Our proposed approach uses a data structure in which the data stream components are stored in a set of prefix trees. It then transforms the discrete data into continuous data by applying Principal Component Analysis (PCA) over the trees. ADWIN (adaptive windowing) is then used to focus on up-to-date information, which makes it a good fit for the decaying mechanism behind our algorithm. On six artificial and three real-life datasets, PrefixCDD outperforms state-of-the-art techniques in detecting existing drifts of different natures, in discovering them shortly after they appear, and in overall execution time.
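
To make the prefix-tree idea in the abstract concrete, the following is a minimal sketch of how traces (ordered lists of activity names) from an event stream might be stored in a prefix tree with per-node counts. It is an illustration only, not the authors' implementation: the class and method names (PrefixTree, insert, prefix_count, num_nodes) are hypothetical, and the PCA transformation, the ADWIN detector, and the decaying mechanism mentioned in the abstract are not shown.

```python
class PrefixTreeNode:
    """One node per distinct trace prefix (hypothetical structure)."""

    def __init__(self):
        self.count = 0      # number of inserted traces passing through this node
        self.children = {}  # activity name -> child PrefixTreeNode


class PrefixTree:
    """Stores traces so that traces sharing a prefix share nodes."""

    def __init__(self):
        self.root = PrefixTreeNode()

    def insert(self, trace):
        """Add one trace, incrementing the count of every prefix node on its path."""
        node = self.root
        for activity in trace:
            node = node.children.setdefault(activity, PrefixTreeNode())
            node.count += 1

    def prefix_count(self, prefix):
        """Return how many inserted traces start with the given prefix."""
        node = self.root
        for activity in prefix:
            node = node.children.get(activity)
            if node is None:
                return 0
        return node.count

    def num_nodes(self):
        """Count all nodes except the root."""
        total = 0
        stack = [self.root]
        while stack:
            node = stack.pop()
            total += len(node.children)
            stack.extend(node.children.values())
        return total


# Usage: three traces that share the prefix "register"
tree = PrefixTree()
tree.insert(["register", "check", "approve"])
tree.insert(["register", "check", "reject"])
tree.insert(["register", "approve"])
print(tree.prefix_count(["register", "check"]))
```

In a sketch like this, a change in the relative frequencies of tree branches over time is the kind of signal a downstream drift detector could monitor; how PrefixCDD derives its features from the trees is described in the paper itself.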
Original language: English
Pages: 328-333
Number of pages: 6
Publication status: Published - 2023
Event: 2023 IEEE 47th Annual Computers, Software, and Applications Conference (COMPSAC)
Duration: 26 Jun 2023 to 30 Jun 2023

Conference

Conference: 2023 IEEE 47th Annual Computers, Software, and Applications Conference (COMPSAC)
Period: 26/06/23 to 30/06/23

Bibliographical note

Publisher Copyright:
© 2023 IEEE.

Keywords

  • Concept Drift Detection
  • Event Streams
  • Process Mining
  • Stream Process Mining
  • Unsupervised-Learning
