Special Communication

Missing Data and Multiple Imputation

Peter Cummings, MD, MPH
Author Affiliation: Department of Epidemiology and Harborview Injury Prevention and Research Center, University of Washington, Seattle.
JAMA Pediatr. 2013;167(7):656-661. doi:10.1001/jamapediatrics.2013.1329.

Missing data can result in biased estimates of the association between an exposure X and an outcome Y. Even in the absence of bias, missing data can hurt precision, resulting in wider confidence intervals. Analysts should examine the missing data pattern and try to determine the causes of the missingness. Modern software has simplified multiple imputation of missing data and the analysis of multiply imputed data to the point where this method should be part of any analyst’s toolkit. Multiple imputation will often, but not always, reduce bias and increase precision compared with complete-case analysis. Some exceptions to this rule are noted in this review. When describing study results, authors should disclose the amount of missing data and other details. Investigators should consider how to minimize missing data when planning a study.
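
The abstract refers to the analysis of multiply imputed data without giving details. As a minimal sketch, assuming the analyst already has one estimate and one variance from each imputed data set, the standard combining rules (Rubin's rules) can be written as below. The function name `pool_rubin` and the numbers are hypothetical and are not taken from the article.

```python
import math
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Combine per-imputation estimates (e.g. log risk ratios) and their
    squared standard errors using Rubin's rules."""
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least 2 imputations and matching lengths")
    pooled = mean(estimates)          # pooled point estimate
    within = mean(variances)          # average within-imputation variance
    between = variance(estimates)     # between-imputation variance (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return pooled, math.sqrt(total)


# Hypothetical log risk ratios and variances from m = 5 imputed data sets.
log_rr = [-0.70, -0.65, -0.72, -0.68, -0.75]
var_log_rr = [0.010, 0.011, 0.009, 0.010, 0.012]

pooled_log_rr, pooled_se = pool_rubin(log_rr, var_log_rr)
print("pooled risk ratio:", math.exp(pooled_log_rr))
print("pooled standard error (log scale):", pooled_se)
```

Pooling is done on the log scale here because ratio estimates are closer to normally distributed after a log transform. Software that supports multiple imputation normally performs this step automatically.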

Figures in this Article

Figure 1.
Distribution of Risk Ratio Estimates From Complete-Case Analyses (Solid Line) and After Multiple Imputation (Dashed Line) in 2000 Simulations Using Data From Table 1

Data sets were created with data missing at random for 50% of speed data and 20% of seat belt use data among drivers who survived, data missing completely at random for 25% of speed data and 20% of seat belt data among all drivers, and data missing not at random for both seat belt use and death among 30 drivers who used seat belts and died. The true adjusted risk ratio for death among belted vs unbelted drivers was 0.500, shown by a vertical line. The distributions of the risk ratio estimates were smoothed by using a kernel density method.
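
The three missingness mechanisms named in this caption can be illustrated with a short sketch. It is one reading of the caption, not the author's simulation code: the synthetic drivers, the sample size, the risks of death, and the helper names `mcar_mask` and `mar_mask` are all assumptions, and Table 1 is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000  # hypothetical sample size

# Hypothetical synthetic drivers (NOT the Table 1 data).
belted = rng.random(n) < 0.5
died = rng.random(n) < np.where(belted, 0.05, 0.10)  # assumed risks, ratio 0.5
speed = rng.normal(50.0, 15.0, n)
survived = ~died


def mcar_mask(n_rows, fraction, rng):
    """Missing completely at random: unrelated to any variable."""
    return rng.random(n_rows) < fraction


def mar_mask(eligible, fraction, rng):
    """Missing at random: depends only on an observed variable (here, survival)."""
    return eligible & (rng.random(eligible.size) < fraction)


# MAR among survivors combined with MCAR among all drivers.
speed_missing = mar_mask(survived, 0.50, rng) | mcar_mask(n, 0.25, rng)
belt_missing = mar_mask(survived, 0.20, rng) | mcar_mask(n, 0.20, rng)

# Missing not at random: missingness depends on the unobserved values
# themselves (30 drivers who used seat belts and died).
candidates = np.flatnonzero(belted & died)
chosen = rng.choice(candidates, size=min(30, candidates.size), replace=False)
mnar_mask = np.zeros(n, dtype=bool)
mnar_mask[chosen] = True
belt_missing |= mnar_mask
death_missing = mnar_mask.copy()

# Observed speed, and the rows a complete-case analysis would keep.
speed_obs = np.where(speed_missing, np.nan, speed)
complete = ~(speed_missing | belt_missing | death_missing)
print("complete cases:", int(complete.sum()), "of", n)
```

A complete-case analysis would use only the rows flagged in `complete`, whereas multiple imputation would fill in the missing values several times and pool the resulting estimates.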

Figure 2.
Scatterplot From 2000 Simulations of Table 1 Data

Data for speed were missing completely at random (MCAR) in 25% of the records, and data about seat belt use were MCAR for 20%. Vertical and horizontal lines indicate the true risk ratio of 0.500. The solid diagonal line indicates identical values for both risk ratios, and the dashed diagonal line is from linear regression, with complete-case risk ratios as the explanatory variable and multiple-imputation risk ratios as the outcome. This regression line and the 2000 plotted points both show that the risk ratios produced by multiple imputation are usually closer to 0.500 than those produced by complete-case analysis.
