A Threshold-Triggered Deep Q-Network-Based Framework for Self-Healing in Autonomic Software-Defined IIoT-Edge Networks

Research output: Contribution to journal > Article > Academic > peer-review

Abstract

Stochastic disruptions such as flash events arising from benign traffic bursts and switch thermal fluctuations are major contributors to intermittent service degradation in software-defined industrial networks. These events violate IEC 61850-derived quality of service requirements and user-defined service-level agreements, hindering the reliable and timely delivery of control, monitoring, and best-effort traffic in IEC 61400-25-compliant wind power plants. Failure to maintain these requirements often results in delayed or lost control signals, reduced operational efficiency, and increased risk of wind turbine generator downtime. To address these challenges, this study proposes a threshold-triggered Deep Q-Network self-healing agent that autonomically detects, analyzes, and mitigates network disruptions while adapting routing behavior and resource allocation in real time. The proposed agent was trained, validated, and tested on an emulated tri-clustered switch network deployed in a cloud-based proof-of-concept testbed. Simulation results show that the proposed agent improves disruption recovery performance by 53.84% compared to a baseline shortest-path and load-balanced routing approach, and outperforms state-of-the-art methods, including the Adaptive Network-based Fuzzy Inference System by 13.1% and the Deep Q-Network and Traffic Prediction-based Routing Optimization method by 21.5%, in a super-spine leaf data-plane architecture. Additionally, the agent maintains switch thermal stability by proactively initiating external rack cooling when required. These findings highlight the potential of deep reinforcement learning in building resilience in software-defined industrial networks deployed in mission-critical, time-sensitive application scenarios.
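The threshold-triggered idea described in the abstract can be illustrated with a minimal sketch: the agent is consulted only when a monitored metric crosses a limit, and it then picks the action with the highest Q-value. This is not the authors' implementation. The metric names, threshold values, action set, and the `q_function` callable are all hypothetical placeholders chosen for illustration, since the abstract does not specify them.

```python
import random
from typing import Callable, Dict, List, Optional, Sequence

# Hypothetical limits; the paper's actual QoS and thermal thresholds are not given here.
THRESHOLDS: Dict[str, float] = {
    "latency_ms": 10.0,
    "loss_pct": 1.0,
    "switch_temp_c": 45.0,
}

# Hypothetical action set loosely based on the behaviours named in the abstract.
ACTIONS: List[str] = [
    "keep_route",
    "reroute",
    "reallocate_bandwidth",
    "start_rack_cooling",
]


def violated(metrics: Dict[str, float],
             thresholds: Dict[str, float] = THRESHOLDS) -> List[str]:
    """Return the sorted names of metrics that exceed their threshold."""
    return sorted(name for name, limit in thresholds.items()
                  if metrics.get(name, 0.0) > limit)


def select_action(q_values: Sequence[float], epsilon: float = 0.0) -> int:
    """Epsilon-greedy selection: explore with probability epsilon, else take argmax."""
    if random.random() < epsilon:
        return random.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda i: q_values[i])


def self_heal_step(metrics: Dict[str, float],
                   q_function: Callable[[List[float]], Sequence[float]],
                   epsilon: float = 0.0) -> Optional[str]:
    """Consult the agent only when a threshold is crossed; otherwise do nothing."""
    if not violated(metrics):
        return None  # no trigger, so the current configuration is left untouched
    # State vector: monitored metrics in a fixed (sorted) order.
    state = [metrics.get(name, 0.0) for name in sorted(THRESHOLDS)]
    return ACTIONS[select_action(q_function(state), epsilon)]
```

In a real deployment `q_function` would be a trained neural network; here any callable that maps a state vector to one Q-value per action is enough to exercise the trigger logic.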

Original language: English
Number of pages: 15
Journal: IEEE Transactions on Network and Service Management
Publication status: E-pub ahead of print - 24 Dec 2025

Bibliographical note

Publisher Copyright:
© 2004-2012 IEEE.

Keywords

  • Agentic AI
  • ASHRAE
  • Autonomic Networking
  • DQN
  • IEC 61400-25
  • IEC 61850
  • Intents
  • NFV
  • Offshore Wind
  • Quality of Service
  • Resilience
  • SDN
  • Self-healing
  • Thermal Model
