A statistical analysis has found no evidence that systematic bias by examiners in the clinical skills assessment (CSA) explains the higher failure rates among international and black and minority ethnic candidates, directly contradicting the conclusions of a recent GMC-commissioned review of the MRCGP exam.
The study, published in the British Journal of General Practice, found significant differences between the grades awarded to international medical graduates (IMGs), UK graduates, black and ethnic minority (BME) candidates and male candidates depending on the characteristics of the examiner.
But these differences largely disappeared when the results were subjected to an analysis of variance (ANOVA); the only exception was the impact of examiner gender on candidates' results, with male examiners marking more harshly.
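For readers unfamiliar with the technique, the point is that raw differences in average marks between groups can be no larger than chance would produce. A minimal sketch in Python, using simulated data and hypothetical examiner groups (not the study's actual data or model), shows how a one-way ANOVA makes that distinction:

```python
# Hypothetical illustration: a one-way ANOVA testing whether mean case
# scores differ between three simulated examiner groups. The underlying
# means are identical, so any raw difference is noise.
import numpy as np
from scipy.stats import f_oneway

rng = np.random.default_rng(0)

# Simulated case scores for three hypothetical examiner groups.
group_a = rng.normal(loc=6.0, scale=1.5, size=200)
group_b = rng.normal(loc=6.0, scale=1.5, size=200)
group_c = rng.normal(loc=6.0, scale=1.5, size=200)

# The raw group means will differ slightly...
print([round(g.mean(), 2) for g in (group_a, group_b, group_c)])

# ...but the F-test asks whether those differences exceed what chance
# alone would produce; a large p-value means they do not.
f_stat, p_value = f_oneway(group_a, group_b, group_c)
print(f"F = {f_stat:.3f}, p = {p_value:.3f}")
```

The study itself modelled several candidate and examiner characteristics at once, so this one-factor sketch is a simplification of the approach, not a reproduction of it.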
The researchers concluded that there was no support for the suggestion that the diversity of examiners must be improved, instead recommending that each examiner be subjected to an ‘equality audit’ to ensure they are not favouring one group over another.
The findings are in direct opposition to the conclusions of GMC-commissioned research conducted by racism expert Professor Aneez Esmail, who said that increasing the diversity of the examiners to be ‘more reflective of the nature of GPs in the country’ would improve the way the CSA is run.
A different version of his research, published in the BMJ, concluded that subjective bias ‘due to racial discrimination’ may be a cause of the higher failure rates for UK-trained candidates from non-white ethnic groups and for IMGs.
This new study – led by the clinical research lead at the RCGP, Dr Mei Ling Denney – included data from almost all of the candidates taking the CSA in 2011/12, and found no consistent evidence of examiners ‘favouring their own’.
In fact, the raw data from 52,000 CSA attempts and 251 examiners showed that examiners who were IMGs gave lower marks both to UK graduates and to fellow IMGs, and that there was no significant association between white examiners and the grades they awarded to white candidates.
They showed that a ‘slightly lower mark’ was awarded to candidates of a different ethnicity from the examiner than to those of the same ethnicity; but the impact of this was minimal, accounting for 0.06% of total variance in case score.
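A figure like ‘0.06% of total variance’ is an effect size of the eta-squared kind: the share of all the spread in case scores attributable to one factor. A minimal sketch of how such a proportion is computed, with made-up numbers rather than the study's data:

```python
# Hypothetical sketch: eta-squared expresses an effect as the share of
# total variance it explains. All figures here are simulated; the study
# reported 0.06% for the examiner-candidate ethnicity effect.
import numpy as np

rng = np.random.default_rng(1)

# Simulated case scores, with a tiny shift applied to half of them by
# a hypothetical two-level grouping factor.
scores = rng.normal(loc=6.0, scale=1.5, size=1000)
group = np.repeat([0, 1], 500)
scores = scores + 0.05 * group

grand_mean = scores.mean()
ss_total = ((scores - grand_mean) ** 2).sum()

# Between-group sum of squares: variation explained by the factor.
ss_between = sum(
    (group == g).sum() * (scores[group == g].mean() - grand_mean) ** 2
    for g in (0, 1)
)

# Eta-squared: the factor's share of total variance (a tiny fraction here).
eta_squared = ss_between / ss_total
print(f"Share of total variance explained: {eta_squared:.2%}")
```

With an effect this small, even a statistically detectable difference translates into a negligible change in any individual candidate's mark, which is the point the authors go on to quantify.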
The authors concluded: ‘The effect size of the potential significant raw effects found in this study, regarding any individual candidate case, could result in, for example, male candidates receiving a 0.9% enhancement of their case score under a female examiner and any candidate receiving, irrespective of their source of degree, a 2.4% enhancement of their case score under a UK-graduate examiner as opposed to an IMG examiner.’
They added: ‘Examiners show no general tendency to “favour their own kind”. With confounding between variables, as far as the impact on candidates’ case scores, substantial effects relate to candidate and not examiner characteristics. Candidate-examiner interaction effects were inconsistent in their direction and slight in their calculated impact.’
‘This study provides no support for equating examiner representation to that of candidates from the point of view of delivering a fair assessment to all groups of candidates. Nevertheless incorporating a variety of subgroups of examiners in the examiner panel has benefits for collegiality and examination development, and incorporating approaches to practice which may themselves vary between these subgroups.’