Purpose
We sought to identify the characteristics of highly and poorly rated teachers and to assess the validity and reliability of student evaluations of teaching.

Methods
We downloaded 14 years of medicine faculty evaluations completed by 3rd and 4th year medical students. We dichotomized overall teaching effectiveness as outstanding (rated “outstanding”) or inferior (rated “unsatisfactory,” “marginal,” or “acceptable”). We analyzed these outcomes using logistic regression (STATA v 18.0). We assessed validity and reliability using factor analysis, Cronbach’s alpha, and intraclass correlation coefficients.

Results
Most (57%) of the 722 faculty members were rated as outstanding. Medical students valued faculty who took advantage of opportunities to teach (OR, 3.0; 95% CI, 2.7–3.3), who were enthusiastic (OR, 2.6; 95% CI, 2.3–2.9), and who were clear and organized (OR, 2.5; 95% CI, 2.3–2.7). Faculty were rarely rated as inferior (7.7%). Among lower-rated faculty, 91% had more than one lower evaluation. Lower-rated teachers had lower ratings on most domains of evaluation, including taking advantage of opportunities to teach (4.6 vs. 2.7, p < 0.0005), being clear and organized (4.6 vs. 3.0, p < 0.0005), enthusiasm (4.5 vs. 2.7, p < 0.0005), being supportive (4.5 vs. 2.5, p < 0.0005), providing feedback (4.4 vs. 2.6, p < 0.005), and clearly answering questions (4.6 vs. 3.1, p < 0.0005). While evaluations were highly internally consistent (Cronbach’s alpha, 0.94), agreement among raters was low, with intraclass correlation coefficients ranging from 0.09 to 0.36.

Conclusion
Most attendings received high ratings, while lower ratings were uncommon. Most teachers who received a lower rating received more than one, suggesting that lower ratings may be a better discriminator of teaching effectiveness than outstanding ones. Teaching ratings had low inter-rater reliability, suggesting either that the ratings have low validity or that learners value different characteristics in teachers.
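
The contrast between high internal consistency and low inter-rater agreement can be illustrated with a minimal Python sketch. This is not the authors' analysis, which was run in STATA. The data are invented for illustration, the function names `cronbach_alpha` and `icc_oneway` are hypothetical, and the one-way random-effects form of the intraclass correlation is an assumption, since the abstract does not state which form was used.

```python
import numpy as np


def cronbach_alpha(items):
    """Cronbach's alpha for a (n_evaluations x n_domains) score matrix."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    # Sum of the per-domain sample variances.
    item_var = items.var(axis=0, ddof=1).sum()
    # Sample variance of each evaluation's total score.
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_var / total_var)


def icc_oneway(ratings):
    """One-way random-effects ICC for a balanced (n_faculty x n_raters) matrix.

    Assumption: every faculty member is rated by the same number of students.
    """
    ratings = np.asarray(ratings, dtype=float)
    n, k = ratings.shape
    row_means = ratings.mean(axis=1)
    grand_mean = ratings.mean()
    # Between-faculty and within-faculty mean squares.
    msb = k * ((row_means - grand_mean) ** 2).sum() / (n - 1)
    msw = ((ratings - row_means[:, None]) ** 2).sum() / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)


# Invented data: 3 faculty members, each rated overall by 3 students (1-5 scale).
overall = np.array([
    [5, 3, 4],
    [4, 5, 3],
    [2, 3, 3],
])

# Invented data: the same 9 evaluations scored on 3 domains. Within one
# evaluation the domains move together, but students rating the same
# faculty member often disagree with each other.
domains = np.array([
    [5, 5, 4], [3, 3, 3], [4, 4, 5],
    [4, 5, 4], [5, 5, 5], [3, 2, 3],
    [2, 2, 2], [3, 3, 4], [3, 3, 3],
])

alpha = cronbach_alpha(domains)
icc = icc_oneway(overall)
print(f"Cronbach's alpha: {alpha:.2f}")
print(f"One-way ICC:      {icc:.2f}")
```

In this toy data set each student tends to score all domains alike for a given teacher, which drives alpha up, while different students rating the same teacher disagree, which keeps the ICC low. The two statistics answer different questions, so a high alpha does not by itself imply that raters agree.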