Bayesian models for data missing not at random in health examination surveys


In epidemiological surveys, data missing not at random (MNAR) due to survey nonresponse can bias risk factor estimates. We propose an approach based on Bayesian data augmentation and survival modelling to reduce this nonresponse bias. The approach requires additional information from follow-up data. We present a case study of smoking prevalence using FINRISK data collected between 1972 and 2007, with follow-up to the end of 2012, and compare our approach with commonly applied missing at random (MAR) imputation approaches. A simulation experiment is carried out to study the validity of the approaches. Our approach appears to reduce the nonresponse bias substantially, whereas MAR imputation did not reduce the bias.
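
As a toy illustration of why MNAR nonresponse biases a prevalence estimate, the sketch below simulates a population in which smokers are less likely to respond than non-smokers. This is not the model from the paper: the function name `simulate_mnar`, the prevalence and the response probabilities are all hypothetical values chosen for the example.

```python
import random


def simulate_mnar(n=100_000, prevalence=0.25,
                  p_respond_smoker=0.5, p_respond_nonsmoker=0.8, seed=1):
    """Return (true prevalence, prevalence among respondents only).

    All parameter values are hypothetical; they are not FINRISK estimates.
    """
    rng = random.Random(seed)
    smokers = 0
    respondents = 0
    responding_smokers = 0
    for _ in range(n):
        smokes = rng.random() < prevalence
        # MNAR mechanism: the response probability depends on the
        # (possibly unobserved) smoking status itself.
        p_respond = p_respond_smoker if smokes else p_respond_nonsmoker
        responds = rng.random() < p_respond
        smokers += smokes
        if responds:
            respondents += 1
            responding_smokers += smokes
    return smokers / n, responding_smokers / respondents


true_prev, observed_prev = simulate_mnar()
print(f"true prevalence:       {true_prev:.3f}")
print(f"respondent prevalence: {observed_prev:.3f}")
```

Under these assumed response rates the respondent-only estimate converges to 0.125 / 0.725, roughly 0.17, instead of 0.25. An estimate built only from respondents' smoking data inherits this gap, which is consistent with the abstract's point that additional follow-up information is needed.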

Statistical Modelling, 18(2), pp. 113-128


Bayesian estimation, data augmentation, follow-up data, health examination surveys, multiple imputation, survival analysis
Juho Kopra
University Lecturer of Statistics

My research interests include Bayesian statistical methods and applied statistics for problems with high societal impact.