Integrated detection and localization of concept drifts in process mining with batch and stream trace clustering support

Rafael Gaspar de Sousa*, Antonio Carlos Meira Neto, Marcelo Fantinato*, Sarajane Marques Peres*, Hajo Alexander Reijers

*Corresponding author for this work

Research output: Contribution to journal › Article › Academic › peer-review

Abstract

Process mining can help organizations by extracting knowledge from event logs. However, process mining techniques often assume that business processes are stationary, whereas actual business processes are constantly subject to change because of the complexity of organizations and their external environment. Thus, addressing process changes over time, known as concept drifts, allows for a better understanding of process behavior and can provide a competitive edge for organizations, especially in an online data stream scenario. Current approaches to handling process concept drift focus primarily on detecting and locating concept drifts, often through an integrated, albeit offline, approach. However, some of these integrated approaches rely on complex data structures related to tree-based process models, usually discovered through algorithms whose results are influenced by specific heuristic rules. Moreover, most of the proposed approaches have not been tested on the public event logs with labeled true concept drifts that are commonly used as benchmarks, making comparative analysis difficult. In this article, we propose an online approach to detect and localize concept drifts in an integrated way using batch and stream trace clustering support. In our approach, cluster models provide input information for both the concept drift detection and the localization methods. Each cluster abstracts a behavior profile underlying the process and reveals descriptive information about the discovered concept drifts. Experiments with benchmark synthetic event logs with different control-flow changes, as well as with real-world event logs, showed that our approach, when relying on the same clustering model, is competitive with baseline concept drift detection methods. In addition, the experiments showed that our approach is able to correctly locate the detected concept drifts and allows the analysis of such concept drifts through different process behavior profiles.

Original language: English
Article number: 102253
Number of pages: 33
Journal: Data and Knowledge Engineering
Volume: 149
Early online date: 2 Dec 2023
Publication status: Published - Jan 2024

Bibliographical note

Publisher Copyright:
© 2023 Elsevier B.V.

Funding

The authors thank Universidade de Sao Paulo and Amazon Web Services (AWS) for supporting this research in Cloud Manufacturing Services within the scope of the USP-AWS collaboration, as well as the Sao Paulo Research Foundation (Fapesp, Brazil), process numbers 2017/26487-4 and 2017/26491-1.

Funders (funder number, where given)
Universidade de Sao Paulo
Amazon Web Services (AWS)
Sao Paulo Research Foundation - Fapesp-Brazil: 2017/26487-4, 2017/26491-1

    Keywords

    • Business processes
    • Concept drift
    • Data mining
    • Process mining
    • Trace clustering
