***********************
* MD Part 3: Approximate do it yourself multiple imputation
version 12.1
webuse mheart0, clear

* Imputation model for bmi
regress bmi attack smokes age hsgrad female
predict bmihat if missing(bmi)
scalar rmse = e(rmse)

* As shown in the output, the rmse is a little over 4. To confirm,
display rmse

* Impute the values for missing cases 20 times
set seed 2232
gen e = 0 if !missing(bmi)

* Compute 20 random error terms
forval i = 1/20 {
    quietly gen e`i' = rnormal() if missing(bmi)
}

* Compute 20 imputed values for each case
* Imputed value = bmihat + random variation
forval i = 1/20 {
    quietly clonevar bmi`i' = bmi
    quietly replace bmi`i' = bmihat + rmse * e`i' if missing(bmi)
}

* Compare the imputed values, first 3 imputations
list bmihat bmi bmi1 bmi2 bmi3 e1 e2 e3 if missing(bmi)

* Compare the summary stats, first 3 imputations
sum bmihat bmi bmi1 bmi2 bmi3 e1 e2 e3, sep(5)

* Convert to mi format. We will use wide, as each case now has
* one record with several imputed variables
mi import wide, imputed(e = e1-e20 bmi = bmi1-bmi20) clear drop

* We can now convert to the more efficient mlong format
mi convert mlong, clear

* Now do the analytic model
mi estimate: logit attack smokes age bmi hsgrad female

***********************
*** Alternative coding; no need to generate the e vars
* MD Part 3: Approximate do it yourself multiple imputation
version 12.1
webuse mheart0, clear

* Imputation model for bmi
regress bmi attack smokes age hsgrad female
predict bmihat if missing(bmi)
scalar rmse = e(rmse)

* As shown in the output, the rmse is a little over 4. To confirm,
display rmse

* Impute the values for missing cases 20 times
set seed 2232

* Compute 20 imputed values for each case
* Imputed value = bmihat + random variation
forval i = 1/20 {
    quietly clonevar bmi`i' = bmi
    quietly replace bmi`i' = bmihat + rnormal(0, rmse) if missing(bmi)
}

* Compare the imputed values, first 3 imputations
list bmihat bmi bmi1 bmi2 bmi3 if missing(bmi)

* Compare the summary stats, first 3 imputations
sum bmihat bmi bmi1 bmi2 bmi3, sep(5)

* Convert to mi format. We will use wide, as each case now has
* one record with several imputed variables
mi import wide, imputed(bmi = bmi1-bmi20) clear drop

* We can now convert to the more efficient mlong format
mi convert mlong, clear

* Now do the analytic model
mi estimate: logit attack smokes age bmi hsgrad female
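
***********************
*** Optional cross-check: a minimal sketch using Stata's built-in imputation
* This block is an illustration only. It assumes the same imputation model
* and the same number of imputations (20) as the do it yourself versions.
* The rseed() value simply reuses the seed from above; the random draws
* will not match the hand-made ones.
* Unlike the approximate approach, mi impute regress also draws the
* regression coefficients and error variance from their posterior
* distribution for each imputation, so the estimates should be similar
* but not identical.
version 12.1
webuse mheart0, clear

* Declare the mi style and the variable to be imputed
mi set mlong
mi register imputed bmi

* Impute bmi 20 times with the same predictors as the regress model above
mi impute regress bmi attack smokes age hsgrad female, add(20) rseed(2232)

* Same analytic model as before
mi estimate: logit attack smokes age bmi hsgrad female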