Mapping soil thickness by accounting for right-censored data with survival probabilities and machine learning

  • Stephan van der Westhuizen*
  • Gerard B.M. Heuvelink
  • David P. Hofmeyr
  • Laura Poggio
  • Madlene Nussbaum
  • Colby Brungard

  *Corresponding author for this work

    Research output: Contribution to journal › Article › Academic › peer-review

    Abstract

    In digital soil mapping, modelling soil thickness is challenging because the data are often right-censored: the true soil thickness exceeds the depth of sampling. Neglecting to account for this censoring can lead to poor model performance and underestimation of the true soil thickness. Survival analysis is a well-established domain of statistical modelling that can deal with censored data. The random survival forest is a notable example of a survival-related machine learning approach used to address right-censored soil property data in digital soil mapping. Previous studies that employed this model either mapped the probability of soil thickness exceeding certain depths rather than soil thickness itself, or dismissed the model because of perceived poor performance. In this study, we propose an alternative survival model for mapping soil thickness that is based on inverse probability of censoring weighting. In this approach, calibration data are weighted by the inverse of the probability that soil thickness exceeds a certain depth, that is, a survival probability. These weights can then be used with most machine learning models. We used the weights with a regular random forest and compared the resulting model with a random survival forest and with other strategies for handling right-censored data, through a comprehensive synthetic simulation study and two real-world case studies. The results suggest that the weighted random forest model produces competitive predictions, establishing it as a viable option for mapping right-censored soil property data.
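
The weighting idea can be illustrated with a minimal sketch. It uses one common textbook formulation of inverse probability of censoring weighting, which may differ in detail from the scheme used in the article: the censoring survival function G(t) = P(C > t) is estimated with a Kaplan-Meier estimator, each site where the full soil thickness was observed gets weight 1 / G(t-), and each censored site gets weight 0. The function names, the tie-handling convention and the toy depths below are assumptions made for illustration only.

```python
def censoring_survival(depths, observed):
    """Kaplan-Meier estimate of G(t) = P(C > t), the censoring survival function.

    depths   -- recorded depth per site (true thickness, or sampling depth if censored)
    observed -- True where the true soil thickness was reached (uncensored)
    Returns (depth, G(depth)) steps sorted by depth.
    """
    steps = []
    g = 1.0
    for t in sorted(set(depths)):
        # For the censoring distribution, the "events" are the censored sites.
        censored_here = sum(1 for d, o in zip(depths, observed) if d == t and not o)
        # Assumed tie convention: observed thicknesses at depth t leave the
        # risk set before censorings at the same depth are counted.
        at_risk = sum(1 for d in depths if d > t) + censored_here
        if censored_here > 0:
            g *= 1.0 - censored_here / at_risk
        steps.append((t, g))
    return steps


def g_before(steps, t):
    """Left limit G(t-): the value of G just before depth t."""
    g = 1.0
    for depth, value in steps:
        if depth >= t:
            break
        g = value
    return g


def ipcw_weights(depths, observed):
    """Weight 1 / G(t-) for uncensored sites and 0 for censored sites."""
    steps = censoring_survival(depths, observed)
    return [1.0 / g_before(steps, d) if o else 0.0
            for d, o in zip(depths, observed)]


# Hypothetical toy data: depths in cm; False marks a right-censored site.
depths = [30, 50, 50, 80, 100, 120]
observed = [True, False, True, True, False, True]

for d, o, w in zip(depths, observed, ipcw_weights(depths, observed)):
    print(d, "observed" if o else "censored", round(w, 3))
```

In this toy example the deeper uncensored sites receive larger weights, because they stand in for censored sites whose true thickness was never reached. Weights of this kind could then be supplied to any learner that accepts per-observation weights, for example through the `sample_weight` argument of scikit-learn's `RandomForestRegressor.fit`; that pairing is an assumption here and not necessarily the software used in the article.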

    Original language: English
    Article number: e13589
    Number of pages: 19
    Journal: European Journal of Soil Science
    Volume: 75
    Issue number: 5
    Publication status: Published - 1 Sept 2024

    Bibliographical note

    Publisher Copyright:
    © 2024 The Author(s). European Journal of Soil Science published by John Wiley & Sons Ltd on behalf of British Society of Soil Science.

    Keywords

    • digital soil mapping
    • inverse probability of censoring weighting
    • random survival forest
    • soil depth
    • survival analysis
