Evaluating the median p-value method for assessing the statistical significance of tests when using multiple imputation

Peter C. Austin*, Iris Eekhout, Stef van Buuren

*Corresponding author for this work

Research output: Contribution to journal, Article, Academic, peer-review

Abstract

Rubin’s Rules are commonly used to pool the results of statistical analyses across imputed samples when using multiple imputation. Rubin’s Rules cannot be used when the result of an analysis in an imputed dataset is not a statistic and its associated standard error, but a test statistic (e.g. Student’s t-test). While complex methods have been proposed for pooling test statistics across imputed samples, these methods have not been implemented in many popular statistical software packages. The median p-value method has been proposed for pooling test statistics. The statistical significance level of the pooled test statistic is the median of the associated p-values across the imputed samples. We evaluated the performance of this method with nine statistical tests: Student’s t-test, Wilcoxon Rank Sum test, Analysis of Variance, Kruskal-Wallis test, the test of significance for Pearson’s and Spearman’s correlation coefficient, the Chi-squared test, the test of significance for a regression coefficient from a linear regression and from a logistic regression. For each test, the empirical type I error rate was higher than the advertised rate. The magnitude of inflation increased as the prevalence of missing data increased. The median p-value method should not be used to assess statistical significance across imputed datasets.
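The pooling rule described in the abstract can be illustrated with a minimal sketch. The function name `median_p_value`, the example p-values, and the 5% significance level are all hypothetical and chosen only for illustration; they are not taken from the article. The sketch shows what the method computes and is not a recommendation, since the abstract reports that the method inflates the type I error rate.

```python
from statistics import median


def median_p_value(p_values):
    """Pool per-imputation p-values by taking their median.

    Hypothetical helper: assumes the same test was run once on each
    imputed dataset and that its p-value was recorded.
    """
    p_values = list(p_values)
    if not p_values:
        raise ValueError("need at least one p-value")
    if any(not 0.0 <= p <= 1.0 for p in p_values):
        raise ValueError("p-values must lie in [0, 1]")
    return median(p_values)


# Made-up p-values from one test applied to five imputed datasets.
example_p_values = [0.03, 0.08, 0.04, 0.06, 0.02]

pooled = median_p_value(example_p_values)
significant = pooled < 0.05  # assumed 5% significance level

print(f"pooled p-value: {pooled}, significant: {significant}")
```

With an even number of imputed datasets, `statistics.median` averages the two middle p-values; whether the evaluated method handles that case the same way is not stated in the abstract.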

Original language: English
Journal: Journal of Applied Statistics
Publication status: E-pub ahead of print - 25 Oct 2024

Bibliographical note

Publisher Copyright:
© 2024 The Author(s). Published by Informa UK Limited, trading as Taylor & Francis Group.

Funding

ICES is an independent, non-profit research institute funded by an annual grant from the Ontario Ministry of Health (MOH) and the Ministry of Long-Term Care (MLTC). As a prescribed entity under Ontario's privacy legislation, ICES is authorized to collect and use health care data for the purposes of health system analysis, evaluation and decision support. Secure access to these data is governed by policies and procedures that are approved by the Information and Privacy Commissioner of Ontario. The use of the data in this project is authorized under section 45 of Ontario's Personal Health Information Protection Act (PHIPA) and does not require review by a Research Ethics Board. This document used data adapted from the Statistics Canada Postal CodeOM Conversion File, which is based on data licensed from Canada Post Corporation, and/or data adapted from the Ontario Ministry of Health Postal Code Conversion File, which contains data copied under license from (c) Canada Post Corporation and Statistics Canada. Parts of this material are based on data and/or information compiled and provided by CIHI and the Ontario Ministry of Health. The opinions, results and conclusions reported in this paper are those of the authors and are independent from the funding sources. No endorsement by ICES or the Ontario MOH or MLTC is intended or should be inferred. This research was supported by an operating grant from the Canadian Institutes of Health Research (CIHR) (PJT 166161).

Funders (funder number, where given)
Ontario Ministry of Health (MOH)
Ministry of Long-Term Care (MLTC)
Canadian Institutes of Health Research (CIHR): PJT 166161

    Keywords

    • hypothesis testing
    • Missing data
    • multiple imputation
    • Rubin’s Rules
