Statistical inference based on randomly generated auxiliary variables

Research output: Contribution to journal › Article › Academic › peer-review

Abstract

In most real-life studies, auxiliary variables are available and are employed to explain and understand missing data patterns and to evaluate and control causal relationships with variables of interest. Usually their availability is assumed to be a fact, even if the variables are measured without the objectives of the study in mind. As a result, inference with missing data and causal inference require some assumptions that cannot easily be validated or checked. In this paper, a framework is constructed in which auxiliary variables are treated as a selection, possibly random, from the universe of variables on a population. This framework provides conditions to make statistical inference beyond the traces of bias or effects found by the auxiliary variables themselves. The utility of the framework is demonstrated for the analysis and reduction of non‐response in surveys. However, the framework may be more generally used to understand the strength of association between variables. Important roles are played by the diversity and diffusion of the population of interest, features that are defined in the paper and the estimation of which is discussed.
Original language: English
Pages (from-to): 33-56
Number of pages: 23
Journal: Journal of the Royal Statistical Society. Series B
Volume: 80
Issue number: 1
Publication status: Published - Jan 2018

Keywords

  • Causal inference
  • Independent variable
  • Missing data
  • Non-response
  • Surveys
