- High stakes postgraduate specialist certification examinations have considerable implications for the future careers of examinees. Medical colleges and professional boards have a social and professional responsibility to ensure their fitness for purpose. To date there is a paucity of published data about the reliability of specialist certification examinations and objective methods for improvement. Such data are needed to improve current assessment practices and sustain the international credibility of specialist certification processes. To determine the component and composite reliability of the Fellowship examination of the College of Physicians of South Africa, and identify strategies for further improvement, generalizability and multivariate generalizability theory were used to estimate the reliability of examination subcomponents and the overall reliability of the composite examination. Decision studies were used to identify strategies for improving the composition of the examination. Reliability coefficients of the component subtests ranged from 0.58 to 0.64. The composite reliability of the examination was 0.72. This could be increased to 0.8 by weighting all test components equally or increasing the number of patient encounters in the clinical component of the examination. Correlations between examination components were high, suggesting that similar parameters of competence were being assessed. This composite certification examination, if equally weighted, achieved an overall reliability sufficient for high stakes examination purposes. Increasing the weighting of the clinical component decreased the reliability. This could be rectified by increasing the number of patient encounters in the examination. Practical ways of achieving this are suggested.