The potential of data driven approaches for quantifying hydrological extremes

Sandra M. Hauswirth*, Marc F.P. Bierkens, Vincent Beijk, Niko Wanders

*Corresponding author for this work

Research output: Contribution to journal › Article › Academic › peer-review


Recent droughts in Europe have shown that national water systems face increasing challenges in dealing with drought impacts. The Netherlands in particular has seen an increasing need to adapt its water management to improve preparedness for future drought events. Ideally, the information needed for operational water management decisions should be readily available ahead of time and/or computed flexibly and efficiently, leaving sufficient time to evaluate the various management actions. In this study, we show that, in addition to physically based hydrological models, the emerging trend of incorporating machine learning (ML) in hydrology can provide the basis for future efforts to support national operational water management by delivering the needed information efficiently and with the required accuracy. As a precursor to their use in a forecasting system, we assessed the ability of five different data-driven methods to simulate hydrological variables at the national scale. We developed a unified workflow that uses limited information on hydro-meteorological variables and general water management policies to simulate historic time series of discharge, groundwater levels, surface water levels and surface water temperatures. We find that all ML methods, ranging from very simple to more complex ones, generally performed well for stations and target levels that are closely linked to the input data and location (e.g. stations along the main river network). For downstream stations and small rivers, the Random Forest method outperformed the other methods for both discharge and surface water levels. For surface water temperature, no location dependency was observed, and for groundwater levels all methods performed comparably, with most stations in the nRMSE range of 0.2-0.3. Generally, the best performance was achieved by the more advanced Random Forest and LSTM methods, which was also the case when simulating high and low flow events.
High flow events were captured slightly better than low flow events, but overall, simulating extreme events from a simple input data set remains challenging. Specific training sets, including event-related information and additional input variables, could likely improve future assessments. Including the feature importance of the methods in the analysis allowed us to detect how and where water management played an important role. Adding information on water management to the ML routines increased overall performance, although only to a limited extent. We conclude that ML and other data-driven approaches have potential for predicting different hydrological variables. We were able to capture and incorporate water management aspects in our analysis, creating a basis for future experiments in which scenario analysis might reveal ML-based mitigation strategies. The combination of limited input data requirements and short computation times makes this new framework suitable for forecasting purposes.

Original language: English
Article number: 104017
Pages (from-to): 1-24
Journal: Advances in Water Resources
Publication status: Published - Sept 2021


  • Droughts
  • Hydrology
  • Machine learning
  • Water management

