Abstract
Rubin’s Rules are commonly used to pool the results of statistical analyses across imputed samples when using multiple imputation. Rubin’s Rules cannot be used when the result of an analysis in an imputed dataset is a test statistic (e.g. from Student’s t-test) rather than an estimate and its associated standard error. While complex methods have been proposed for pooling test statistics across imputed samples, these methods have not been implemented in many popular statistical software packages. The median p-value method has been proposed as a simpler way of pooling test statistics: the statistical significance level of the pooled test statistic is the median of the associated p-values across the imputed samples. We evaluated the performance of this method with nine statistical tests: Student’s t-test, the Wilcoxon Rank Sum test, Analysis of Variance, the Kruskal-Wallis test, the tests of significance for Pearson’s and Spearman’s correlation coefficients, the Chi-squared test, and the tests of significance for a regression coefficient from a linear regression and from a logistic regression. For each test, the empirical type I error rate was higher than the advertised rate. The magnitude of inflation increased as the prevalence of missing data increased. The median p-value method should not be used to assess statistical significance across imputed datasets.
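The two pooling approaches contrasted in the abstract can be sketched as follows. This is a minimal illustration, not the authors' code: the function names `pool_rubin` and `pool_median_p` and the example p-values are hypothetical, and the Rubin's Rules function uses the standard textbook form (pooled estimate = mean of the estimates; total variance = within-imputation variance plus (1 + 1/m) times the between-imputation variance).

```python
from statistics import mean, median, variance


def pool_rubin(estimates, std_errors):
    """Pool point estimates and standard errors with Rubin's Rules.

    Requires an estimate and a standard error from each imputed dataset,
    which is why it does not apply to bare test statistics.
    """
    m = len(estimates)
    pooled_estimate = mean(estimates)
    within = mean([se ** 2 for se in std_errors])  # average squared standard error
    between = variance(estimates)                  # sample variance, divisor m - 1
    total = within + (1 + 1 / m) * between
    return pooled_estimate, total ** 0.5


def pool_median_p(p_values):
    """Median p-value method: the pooled p-value is the median across imputations."""
    return median(p_values)


# Hypothetical p-values from the same test run on five imputed datasets.
p_values = [0.03, 0.08, 0.04, 0.06, 0.05]
print(pool_median_p(p_values))  # 0.05
```

The sketch only shows how simple the median p-value method is to apply. According to the abstract, that simplicity comes with an inflated type I error rate, so the function is shown for illustration rather than as a recommended procedure.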
Original language | English |
---|---|
Journal | Journal of Applied Statistics |
Publication status | E-pub ahead of print - 25 Oct 2024 |
Bibliographical note
Publisher Copyright: © 2024 The Author(s). Published by Informa UK Limited, trading as Taylor & Francis Group.
Funding
ICES is an independent, non-profit research institute funded by an annual grant from the Ontario Ministry of Health (MOH) and the Ministry of Long-Term Care (MLTC). As a prescribed entity under Ontario's privacy legislation, ICES is authorized to collect and use health care data for the purposes of health system analysis, evaluation and decision support. Secure access to these data is governed by policies and procedures that are approved by the Information and Privacy Commissioner of Ontario. The use of the data in this project is authorized under section 45 of Ontario's Personal Health Information Protection Act (PHIPA) and does not require review by a Research Ethics Board. This document used data adapted from the Statistics Canada Postal CodeOM Conversion File, which is based on data licensed from Canada Post Corporation, and/or data adapted from the Ontario Ministry of Health Postal Code Conversion File, which contains data copied under license from © Canada Post Corporation and Statistics Canada. Parts of this material are based on data and/or information compiled and provided by CIHI and the Ontario Ministry of Health. The opinions, results and conclusions reported in this paper are those of the authors and are independent from the funding sources. No endorsement by ICES or the Ontario MOH or MLTC is intended or should be inferred. This research was supported by an operating grant from the Canadian Institutes of Health Research (CIHR) (PJT 166161).
Funders | Funder number |
---|---|
Ontario Ministry of Health (MOH) | |
Ministry of Long-Term Care (MLTC) | |
Canadian Institutes of Health Research (CIHR) | PJT 166161 |
Keywords
- hypothesis testing
- missing data
- multiple imputation
- Rubin’s Rules