How to handle missing data: A comparison of different approaches

Margot Peeters, M. A. J. Zondervan-Zwijnenburg, Gerko Vink, Rens van de Schoot*

*Corresponding author for this work

Research output: Contribution to journal, Article, Academic, peer-review

Abstract

Many researchers face the problem of missing data in longitudinal research. High-risk samples in particular are characterized by missing data, which can complicate analyses and the interpretation of results. In the current study, our aim was to find the best method for dealing with missing data in a specific study with a large amount of missing data on the outcome variable. To that end, different techniques for handling missing data were evaluated, and a solution for efficiently handling substantial amounts of missing data was provided. A simulation study was conducted to determine the best-performing method. Results revealed that multiple imputation (MI) using predictive mean matching was the best method, yielding the lowest bias and the narrowest confidence intervals (CIs) while maintaining power. Listwise deletion and last observation carried backward also performed acceptably with respect to bias; however, CIs were much wider and the sample size was almost halved with these methods. Longitudinal research in high-risk samples could benefit from using MI to handle missing data. The paper ends with a checklist for handling missing data.

Original language: English
Pages (from-to): 377-394
Number of pages: 18
Journal: European Journal of Developmental Psychology
Volume: 12
Issue number: 4
Publication status: Published - 4 Jul 2015

Keywords

  • high risk sample
  • longitudinal research
  • missing data
  • multiple imputation
