


Babylon claims its in-app clinical advice is ‘on par’ with GPs

Private GP provider Babylon has claimed its app is able to provide clinical advice to patients that is ‘on par’ with doctors, sparking criticism from GP leaders.

Babylon, which also provides its GP at Hand service for NHS patients, said it had used diagnostic questions from trainee GP exams to test the artificial intelligence behind its app, which features a symptom checker and provides patients with medical information and triage advice.

The company said its AI scored 81% when it was tested using RCGP exam questions, whereas the average mark for real-life doctors was 72%, according to data from 2013 to 2018.

However, the RCGP said the claims were ‘dubious’. It pointed out that Babylon had used its MRCGP exam preparation questions, which were for ‘revision purposes’, and that these ‘are not necessarily representative of the full-range of questions and standard used in the actual MRCGP exam’.

The RCGP’s vice chair, Professor Martin Marshall, said: ‘To say that Babylon’s algorithm has performed better than the average MRCGP candidate is dubious.’

Babylon also carried out further tests by collaborating with the Royal College of Physicians, Stanford University’s chief of general primary care, Dr Megan Mahoney, and Yale New Haven Health’s chief population health officer, Dr Arnold DoRosario, which involved using 100 independently devised symptom sets.

According to Babylon, during this further testing, its AI scored 80% for accuracy, while the seven doctors it compared results with achieved accuracy scores ranging from 64% to 94%.

Babylon also said the app’s safety score was found to be 97%, while the GPs were assessed as having an average of 93.1%.

Babylon launched its research on the app’s ability to provide accurate health information at the Royal College of Physicians in London tonight.

Ali Parsa, Babylon’s founder and CEO, said: ‘Babylon’s latest artificial intelligence capabilities show that it is possible for anyone, irrespective of their geography, wealth or circumstances, to have free access to health advice that is on par with top-rated practising clinicians.’

‘Tonight’s results clearly illustrate how AI-augmented health services can reduce the burden on healthcare systems around the world.’

However, the RCGP’s Professor Marshall said that while the ‘potential of technology to support doctors to deliver the best possible patient care is fantastic… at the end of the day, computers are computers, and GPs are highly-trained medical professionals’.

‘The two can’t be compared and the former may support, but will never replace, the latter,’ he said.

‘No app or algorithm will be able to do what a GP does…

‘An app might be able to pass an automated clinical knowledge test but the answer to a clinical scenario isn’t always cut and dried, there are many factors to take into account, a great deal of risk to manage, and the emotional impact a diagnosis might have on a patient to consider,’ he added.

Readers' comments (21)

  • Vinci Ho

    As a fan of George, I recommend you read this article about the debate between Wells and Orwell over 75 years ago (even before the NHS was born), inspiring and enlightening:

    https://www.scientificamerican.com/article/h-g-wells-versus-george-orwell-their-debate-whether-science-is-humanitys-best-hope-continues-today/

    Please send in more comments on this platform...


  • The exam is set by a committee based upon their knowledge of recognised patterns. It is verbal choreography. To pass, you simply follow the recognised steps and change direction when the key words or phrases appear. Trainers have taught this for years without the benefit of computers.
    There was surely never any question that a computer would be able to recognise these patterns.
    Remember we are talking about a constructed exam using actors, not real life.
    It of course follows that if you reverse the program, it could set the exam and conduct it, thus dispensing with the variability of actors and markers. Nobody is going to accuse a computer of racism.
    The computer could give each candidate an analysis of his performance: where he asked the wrong question, what the right question was, how many marks he lost, what he would have to do to pass.
    Candidates could sit formative practice exams as many times as they wished before the summative one.
    Of course there is more to real practice, but there is not more to the exam.


  • Babylon has its place with the worried well: the 18-30-year-old metropolitan urban elite. For everyone else, currently, an old-fashioned face-to-face GP is the easier option.
    The two models are not mutually exclusive and each has a role; defining those roles is the tricky bit, and funding each appropriately is key.
    In rural NI, in areas without even a phone signal, let alone internet access, this looks like a debate from a science fiction era.
    Get the basics right first; then fripperies like these can be considered an option (maybe).


  • When the patients are not better or the diagnosis is wrong, we can encourage them to complain and sue Babylon. Why should we be cleaning up after them? If they caused a problem, they can jolly well fix it. NHS is NHS and private is private.


  • It's interesting that Babylon have published their own paper again. Why is it not being published in a peer-reviewed journal? Why did they claim yesterday that their AKT performance was 81% when in actual fact, according to their own paper, it appears that they counted an answer as "correct" if it appeared in their top 3 differentials? I expect the GP pass mark would be higher than 72% if that was how the AKT worked in reality. New funding round due perhaps? https://marketing-assets.babylonhealth.com/press/BabylonJune2018Paper_Version1.4.2.pdf


  • AI is more suited to hospital medicine than primary care medicine. In primary care there is often no definitive diagnosis because the patient presents early in the history with undifferentiated symptoms. Then of course the patient really comes with a hidden agenda. They are not really there for their cough or their rash; that was just an excuse to get through the door. It may be helpful as a tool for inexperienced doctors or rare diseases. The irony is they are rolling it out for hard primary care problems when it would work better in secondary care, where the problems are easier. But I guess there is no market in secondary care.


  • Guess it's a bit like 111. Look where that got us.


  • My main concerns are with NHS Chairman Malcolm Grant who chaired the event.

    He has been struggling for a while and this adds more fuel to arguments around his questionable judgment.


  • Macaque

    I like to dream that in the future there will be AI doctors (computers) and human therapists.

    Computers do the very clever logic, with access to terabytes of medical information, which the human brain cannot do.

    Human therapists will watch over the AI doctor and make the human connection with the patient.

    As of today, I would benefit as a GP if a computer could listen to the consultation, do the AI magic and show me a differential diagnosis on screen while I talk to the patient. The computer could also prompt specific questions, examinations or investigations to narrow the differential diagnosis. I think I would be able to use such a computer without interfering too much with the human-human interaction with the patient.


  • This is desperation from Babylon; they’ve been beaten by Ada Health, who have a far superior product.

