Abstract
Missing data in scientific research go hand in hand with assumptions about the nature of the missingness. When dealing with missing values, a set of beliefs has to be formulated about the extent to which the observed data may also hold for the missing parts of the data. It is vital that the validity of these missingness assumptions is verified and tested, and that the assumptions are adjusted when necessary. In this article, we demonstrate how observed data structures could a priori indicate whether it is likely that our beliefs about the missingness can be trusted. To this end, we simulate complete data and generate missing values according to several types of MCAR, MAR, and MNAR mechanisms. We demonstrate that in scenarios where the data correlations are either low or very substantial, strictly different mechanisms yield equivalent statistical inferences. In addition, we show that the choice of the quantity of scientific interest, together with the distribution of the nonresponse, governs the validity of the missingness assumptions.
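The simulate-then-ampute idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' procedure: it assumes bivariate standard normal data, missingness in one variable only, and a logistic missingness model for MAR and MNAR. The function names (`simulate_complete`, `ampute_y`, `complete_case_mean`) and all parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np


def simulate_complete(n, rho, rng):
    """Draw n rows (x, y) from a bivariate standard normal with correlation rho."""
    cov = [[1.0, rho], [rho, 1.0]]
    return rng.multivariate_normal([0.0, 0.0], cov, size=n)


def ampute_y(data, mechanism, rng, mcar_prop=0.5):
    """Return a copy of data with values of y set to NaN (illustrative assumptions).

    MCAR: every y is missing with the same probability mcar_prop.
    MAR:  the probability that y is missing is logistic in the observed x.
    MNAR: the probability that y is missing is logistic in y itself.
    """
    x, y = data[:, 0], data[:, 1]
    if mechanism == "MCAR":
        p_missing = np.full(len(y), mcar_prop)
    elif mechanism == "MAR":
        p_missing = 1.0 / (1.0 + np.exp(-x))
    elif mechanism == "MNAR":
        p_missing = 1.0 / (1.0 + np.exp(-y))
    else:
        raise ValueError(f"unknown mechanism: {mechanism}")
    missing = rng.random(len(y)) < p_missing
    amputed = data.copy()
    amputed[missing, 1] = np.nan
    return amputed


def complete_case_mean(amputed):
    """Mean of y over the rows where y is observed."""
    return float(np.nanmean(amputed[:, 1]))


if __name__ == "__main__":
    rng = np.random.default_rng(2018)
    for rho in (0.0, 0.8):
        data = simulate_complete(20_000, rho, rng)
        for mechanism in ("MCAR", "MAR", "MNAR"):
            est = complete_case_mean(ampute_y(data, mechanism, rng))
            print(f"rho={rho:.1f} {mechanism}: complete-case mean of y = {est:.3f}")
```

Under these assumptions the true mean of y is 0. When x and y are uncorrelated, the MAR mechanism leaves the complete-case mean roughly unbiased, much like MCAR, whereas with a substantial correlation it shifts the estimate in the same direction as MNAR. This is a toy version of the point that different mechanisms can lead to similar inferences depending on the data structure.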
Original language | English |
---|---|
Pages (from-to) | 1243-1258 |
Number of pages | 16 |
Journal | Sociological Methods and Research |
Volume | 50 |
Issue number | 3 |
Early online date | 2018 |
Publication status | Published - 1 Aug 2021 |
Bibliographical note
Funding Information: The authors gratefully acknowledge Andrew Gelman for facilitating a variety of interesting research projects. This article is the outcome of an inspiring and pleasant visit to a beautiful city and a great team of researchers. The authors gratefully acknowledge Stef Van Buuren for reviewing a previous draft of this article. The author(s) received no financial support for the research, authorship, and/or publication of this article.
Publisher Copyright:
© The Author(s) 2018.
Keywords
- missing data methodology
- missingness assumptions
- multivariate amputation