Background. Preference-based measures of health-related quality of life all use the same scale, from dead = 0.00 to perfect health = 1.00, but the measures differ substantially. Objective. To examine agreement among these measures in classifying patients as better, stable, or worse. Methods. The EQ-5D, Health Utilities Index Mark 2 and Mark 3, Quality of Well-Being–Self-Administered scale, Short-Form 36 (Short-Form 6D), and disease-targeted measures were administered prospectively in 2 clinical cohorts. The study was conducted at 4 academic medical centers: University of California, Los Angeles; University of California, San Diego; University of Wisconsin–Madison; and University of Southern California. Patients undergoing cataract extraction surgery with lens replacement completed the 25-item National Eye Institute Visual Function Questionnaire (NEI-VFQ-25). Patients newly referred to congestive heart failure specialty clinics completed the Minnesota Living with Heart Failure Questionnaire (MLHF). In both cohorts, subjects completed surveys at baseline and at 1 and 6 months. The NEI-VFQ-25 and MLHF were used as gold standards to assign patients to categories of change. Agreement was assessed using the κ statistic. Results. A total of 376 cataract patients were recruited. Complete data for baseline and the 1-month follow-up were available on all measures for 210 cases. Using criteria specified by Altman, agreement was poor for 6 of 9 pairs of comparisons and fair for 3 pairs. A total of 160 heart failure patients were recruited. Complete data for baseline and the 6-month follow-up were available for 86 cases. Agreement was negligible for 5 pairs and fair for 1. Limitations. The study was conducted on selected patients at a few academic medical centers. Conclusions. The results underscore the lack of interchangeability among different preference-based measures.
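
Illustration. The agreement analysis summarized above can be sketched in a few lines of Python. This is a minimal sketch of unweighted Cohen's κ for two classifications of the same patients into better, stable, or worse, not the authors' actual analysis code. The function names `cohen_kappa` and `altman_band` are hypothetical, the example classifications are invented for illustration, and the band cut points are the ones commonly attributed to Altman; the abstract does not state the exact thresholds used in the study.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two equal-length lists of category labels."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)

    # Observed agreement: share of patients placed in the same category.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    # Chance agreement: product of the two marginal proportions, summed
    # over every category that appears in either classification.
    count_a = Counter(ratings_a)
    count_b = Counter(ratings_b)
    categories = set(count_a) | set(count_b)
    expected = sum(count_a[c] * count_b[c] for c in categories) / (n * n)

    if expected == 1.0:
        # Both classifications put every patient in one identical category;
        # kappa is undefined in that case.
        return float("nan")
    return (observed - expected) / (1.0 - expected)


def altman_band(kappa):
    """Map kappa to a descriptive band (assumed cut points, see lead-in)."""
    if kappa <= 0.20:
        return "poor"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "good"
    return "very good"


if __name__ == "__main__":
    # Invented change categories from two measures for ten patients.
    measure_1 = ["better", "better", "stable", "stable", "worse",
                 "worse", "better", "stable", "worse", "better"]
    measure_2 = ["better", "stable", "stable", "stable", "worse",
                 "better", "better", "worse", "worse", "better"]
    kappa = cohen_kappa(measure_1, measure_2)
    print(round(kappa, 3), altman_band(kappa))
```

In a study of this design, the same calculation would be repeated for each pair of measures within each cohort, which is how a count such as "poor for 6 of 9 pairs" arises. An unweighted κ treats a better-versus-worse disagreement the same as a better-versus-stable one; a weighted κ would be an alternative for ordered categories.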