Objective
To examine the extent to which performance assessment methods affect the percentage of neonatal intensive care units (NICUs) and very low-birth-weight (VLBW) infants included in performance assessments, the distribution of NICU performance ratings, and the level of agreement in those ratings.
Design
Cross-sectional study based on risk-adjusted nosocomial infection rates.
Setting
NICUs belonging to the California Perinatal Quality Care Collaborative, 2007-2008.
Participants
One hundred twenty-six California NICUs and 10 487 VLBW infants.
Main Exposures
Three performance assessment choices: (1) excluding “low-volume” NICUs (those caring for <30 VLBW infants per year) vs using a criterion based on confidence intervals, (2) using Bayesian vs frequentist hierarchical models, and (3) pooling data across 1 vs 2 years.
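As an illustration of the first choice, the volume-based exclusion rule can be sketched as follows. This is a minimal sketch, not the study's code: the NICU labels, the admission counts, and the function name `apply_volume_rule` are all hypothetical, and only the threshold of 30 VLBW infants per year comes from the text above.

```python
# Hypothetical annual VLBW admission counts per NICU (illustrative, not study data).
annual_vlbw_counts = {"A": 12, "B": 45, "C": 30, "D": 8, "E": 105}

# From the text: NICUs caring for fewer than 30 VLBW infants per year are "low-volume".
LOW_VOLUME_THRESHOLD = 30


def apply_volume_rule(counts, threshold=LOW_VOLUME_THRESHOLD):
    """Return the NICUs kept under the volume rule, plus the shares of
    NICUs and of infants that remain in the assessment."""
    included = {nicu: n for nicu, n in counts.items() if n >= threshold}
    share_nicus = len(included) / len(counts)
    share_infants = sum(included.values()) / sum(counts.values())
    return included, share_nicus, share_infants


included, share_nicus, share_infants = apply_volume_rule(annual_vlbw_counts)
print(sorted(included), share_nicus, share_infants)
```

In this made-up example the rule drops two of five NICUs but only a small share of infants, because the excluded units contribute few patients.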
Main Outcome Measures
Proportion of NICUs and patients included in quality assessment, distribution of ratings for NICUs, and agreement between methods using the κ statistic.
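For readers unfamiliar with the κ statistic, the sketch below computes unweighted Cohen's κ for two sets of ratings of the same units. It assumes the unweighted form, which the text does not specify; the function name `cohen_kappa`, the rating labels other than "average," and the example ratings are hypothetical and are not study data.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two equal-length lists of categorical ratings."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    # Observed agreement: fraction of units given the same rating by both methods.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of the marginal frequencies, summed over categories.
    freq_a, freq_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / n**2
    if expected == 1.0:
        # Degenerate case: both methods assign one identical category to every unit.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical ratings of the same 10 NICUs under two assessment methods.
method_1 = ["average"] * 8 + ["better", "worse"]
method_2 = ["average"] * 7 + ["better", "better", "average"]
kappa = cohen_kappa(method_1, method_2)
print(round(kappa, 2))
```

Because most units fall in a single category, chance agreement is high, so κ can be modest even when the raw percentage of matching ratings is large.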
Results
Depending on the methods applied, 51% to 85% of NICUs and 72% to 96% of VLBW infants were included in performance assessments, 76% to 87% of NICUs were considered “average,” and the level of agreement between NICU ratings (κ) ranged from 0.23 to 0.89.
Conclusions
The percentage of NICUs included in performance assessments and their ratings can shift dramatically depending on the performance measurement method used. Physicians, payers, and policymakers should continue to closely examine which existing performance assessment methods are most appropriate for evaluating pediatric care quality.