Abstract
Reinforcement learning (RL) has achieved remarkable progress in recent years, yet its application to real-world tasks is hindered by poor sample efficiency, especially in structurally complex environments. This thesis investigates how structural knowledge, including subtask composition, symbolic reasoning, communication structure, and agent influence, can be exploited to improve the efficiency of single-agent and multi-agent RL algorithms.
First, we introduce a hierarchical RL framework that automatically structures subtasks. By jointly learning high-level subtask selection and low-level subtask execution, the method achieves superior performance in sparse-reward environments. Second, we propose a neuro-symbolic RL framework that integrates probabilistic symbolic reasoning with policy learning. By introducing a probabilistic inference module to calculate action precondition masks, the framework excludes infeasible actions via symbolic knowledge, thereby improving both sample efficiency and policy safety. Third, we present a multi-agent RL framework that exploits communication structure through decentralized scheduling of sparse communication. Agents learn when to share local messages by predicting others’ messages, leading to improved performance with reduced communication overhead. Finally, we design a multi-agent RL framework that automatically identifies the state dimensions controllable by each agent. This structural insight enables focused exploration and precise credit assignment in cooperative multi-agent scenarios with sparse rewards.
Together, these contributions advance the sample efficiency of RL by systematically exploiting structural knowledge in decision-making processes. The results across diverse domains demonstrate that the proposed methods outperform state-of-the-art baselines.
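To make the idea of an action precondition mask from the second contribution more concrete, the following is a minimal sketch of masking a policy's action distribution. It is not the thesis's implementation: the function name `masked_policy`, the `precondition_probs` input (standing in for the output of a probabilistic symbolic reasoner), and the `threshold` cut-off are all assumptions made for illustration.

```python
import math


def masked_policy(logits, precondition_probs, threshold=0.5):
    """Return action probabilities with infeasible actions zeroed out.

    logits: raw policy scores, one per action.
    precondition_probs: hypothetical reasoner output, the probability that
        each action's precondition holds in the current state.
    threshold: hypothetical cut-off below which an action counts as infeasible.
    """
    # An action is kept only if its precondition is likely enough to hold.
    mask = [p >= threshold for p in precondition_probs]
    if not any(mask):
        raise ValueError("no feasible action under the given mask")

    # Subtract the largest feasible logit for numerical stability.
    top = max(score for score, keep in zip(logits, mask) if keep)
    weights = [
        math.exp(score - top) if keep else 0.0
        for score, keep in zip(logits, mask)
    ]
    total = sum(weights)
    # Renormalise over the feasible actions only.
    return [w / total for w in weights]


# Three actions; the second one is judged infeasible by the assumed reasoner.
probs = masked_policy([2.0, 1.0, 0.5], [0.9, 0.1, 0.8])
```

In this sketch the masked action receives probability exactly zero, so it can never be sampled, and the remaining probability mass is redistributed over the feasible actions. A hard threshold is only one possible design; a soft mask that scales each action's probability by its precondition probability would be an alternative.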
| Original language | English |
|---|---|
| Qualification | Doctor of Philosophy |
| Awarding Institution | |
| Supervisors/Advisors | |
| Award date | 24 Mar 2026 |
| Place of Publication | Utrecht |
| Publisher | |
| DOIs | |
| Publication status | Published - 24 Mar 2026 |
Keywords
- Reinforcement learning
- Multi-agent reinforcement learning
- sample efficiency
- subtask composition
- neuro-symbolic learning
- action mask
- communication
- exploration
- credit assignment
Fingerprint
Dive into the research topics of 'Improving sample efficiency of reinforcement learning: Exploiting structural knowledge for decision making'. Together they form a unique fingerprint.