
What happened when Pulse tested symptom checker apps

We asked four experienced GPs to each test four symptom checkers. They were instructed to work through the eight scenarios listed in the table for each of the apps (eg, an anxious 50-year-old man with a cough with a bruised rib).

The GPs were told to input the following when asked for the initial symptoms:

  • chest pain in the case of the cough with bruised rib;
  • toothache in the case of the dental abscess;
  • rash in the case of shingles;
  • back pain in the case of acute pyelonephritis.

After doing so, they were asked to answer each app’s follow-up questions as the patient would, given their condition and level of concern: ‘anxious’ patients would fear or assume the worst and might maximise their symptoms; ‘non-anxious’ patients would do the opposite. There was an element of subjectivity following the initial input of symptoms, as there is with patients.

We then asked them to record the most urgent advice given in each scenario – the thinking being the patient would follow the most urgent advice to play safe. The advice varied among the four GP testers. With this in mind, we picked out the most common advice below and, where given, potential diagnoses.
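For readers who want to reproduce this kind of tally, the two steps above (take the most urgent advice each tester saw, then pick the advice recorded by the most testers) can be sketched as follows. This is a minimal illustration, not the procedure Pulse actually ran: the urgency scale, the function names, the tie-break rule and the sample inputs are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical urgency ranking (an assumption; the article does not define a scale).
URGENCY = {
    "self-care": 0,
    "see GP": 1,
    "see GP urgently": 2,
    "go to A&E": 3,
    "call 999": 4,
}


def most_urgent(advice_seen):
    """Return the most urgent advice one tester saw in one scenario."""
    return max(advice_seen, key=URGENCY.__getitem__)


def most_common(recorded):
    """Return the advice recorded by the most testers.

    Ties are broken in favour of the more urgent advice (an assumption).
    """
    counts = Counter(recorded)
    return max(counts, key=lambda advice: (counts[advice], URGENCY[advice]))


# Placeholder inputs for illustration only, not results from the test.
tester_logs = [
    ["see GP", "go to A&E"],
    ["see GP urgently"],
    ["go to A&E", "self-care"],
    ["see GP"],
]
recorded = [most_urgent(log) for log in tester_logs]
summary = most_common(recorded)
print(recorded, summary)
```

Breaking ties towards the more urgent advice mirrors the ‘play safe’ reasoning above, but a different rule would be equally defensible.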

[Images: symptom checker screengrabs]

What our testers said

• The apps were successful at spotting serious conditions, such as a heart attack. They worked quickly, and were easy to use.

• However, some apps told our ‘anxious’ patients to call an ambulance for coughing with a bruised rib, while most patients with shingles were told to seek medical help within a few hours.

• Ada was the most accurate ‘diagnostic’ app, but while it correctly diagnosed a dental abscess, it advised seeing a doctor – possibly a failure to localise it to the UK. Your.MD made a similar suggestion.

• Your.MD suggested meningitis in many shingles scenarios, which would cause anxiety. For one tester, it suggested an anxious patient with acute pyelonephritis did not need to take action. For the other testers, it suggested cystitis or back sprain but advised they go to A&E just in case. Your.MD said it will review lower and upper UTIs after Pulse’s feedback.

• In all scenarios bar the dental one, most of the apps advised seeing a GP urgently, going to A&E or calling 999. Many of the suggestions would increase patient anxiety, which would be detrimental in the long term.

App developers’ responses

NHS Digital (for NHS App)

‘111 Online is designed to direct patients to the right care provider within the right timeframe; it is not designed to provide a diagnosis.

‘111 Online algorithms follow a robust process of clinical development to ensure that they follow the latest clinical evidence, and are reviewed by the independent National Clinical Governance Group, chaired by the Royal College of General Practitioners.’


Ada

‘Over 60 expert doctors, including practising NHS clinicians, input into Ada’s platform and rigorously test outcomes against gold-standard diagnosis, and we’re happy to see that the experiment confirmed almost 90% accuracy in Ada’s results.

‘Furthermore, as regards advice levels, Ada is able to take into account the full symptom picture, differential results and other red flag features noted throughout the assessment, as a good clinician would, not the most likely suggested condition alone.

‘Based on this, and without seeing the full list of presenting symptoms or the standards applied, it’s hard to comment further other than to say our clinical team felt Ada provided safe advice in line with appropriate guidance.

‘Ultimately, Ada is designed to be applied in conjunction with healthcare professionals as part of a full care pathway and, as such, we welcome the dialogue this test enables.’


Matteo Berlucchi, CEO of Your.MD, said:

‘Your.MD is designed to triage people and, when self-care options are viable, to give more specific advice on how to deal with the situation without the need to refer to primary care or A&E, thus reducing the overall burden on the healthcare system.

‘We focus on trying to help people assess whether they need medical help or not (next steps) and not on the diagnosis. This is why our Consultation Report is presented after we explain what next steps to take. We believe that giving a diagnosis is something that belongs with GPs and not an app.

‘Our aim is to help reduce overload on primary care and we are transparent with our users in doing so. Our service is tailored to presentations as made by real patients rather than GPs. This is in line with our objective to develop and support pre-primary care.

‘We realised pretty early on in our life (we are four years old) that we needed to put in place a rigorous clinical safety policy matched by a robust clinical risk management system. For this reason we have worked with some of the leading experts in the space and we are very proud of the level of safety, transparency and explainability we have achieved today, in particular compared to other companies in our space.

‘We believe user safety to be paramount for symptom checkers and this is why we have made it the top priority in our design.’

And on the acute pyelonephritis reading…

‘For the symptoms entered by the tester, Cystitis and Pyelonephritis are both plausible differential diagnoses.

‘Each condition outcome has a “When to worry” section, to ensure safety nets are in place.

‘The “When to worry” for Cystitis directs the user to seek medical help, as does the disclaimer.

‘Nevertheless, your feedback has prompted our medical team to do a review of upper and lower urinary tract infections.

‘Any necessary recalibration will be reflected in the next couple of weeks and we welcome you to return and test these changes once they have been implemented.

‘As I am sure you can appreciate, the system is extremely complex given the enormous number of variables, so the relative calibration of the key factors which define each scenario is a very laborious and delicate task, as I have already mentioned in our previous response.

‘We are confident about the quality of our system and in particular of its safety aspects.’


Dr Keith Grimes, director of clinical innovation at Babylon, said:

‘Our Symptom Checker is a totally free piece of technology and is instantly available for anyone to use, whenever they want, giving them more information about possible causes of their symptoms and what they should do to seek treatment. It’s a tool that helps in addition to getting a diagnosis by a GP and our users love it, but we need to do a better job explaining to doctors that it’s different to what they do in a consultation as our results are based on a statistical basis of likelihood.

‘I’m a GP who has worked in urgent care since 2001 and was lead GP at a walk-in centre for 8 years. The priority was to work with patients, ruling out their illnesses in order of severity and trying to give them a diagnosis, but ensuring that I investigate the most serious potential issues first. There are pros and cons to this – I am very good at using my experience to try different routes and work out what is likely to be wrong.

‘However, I am less good at actually being aware of all the potential illnesses and working out, statistically, which is most likely. Humans simply aren’t very good at statistics – and as a result, we miss things. For example, once a GP finds out they missed something once they will always overcompensate for that in the future. Our Symptom Checker works in a different way. It assesses all the possible illnesses that could be behind the symptoms and rates their likelihood. This means we’re less likely to miss something that is rare, or unusual. But it also means that even if something could potentially be serious that a GP might flag, we remind the user that another illness is far, far more likely.

‘Our Symptom Checker is still fairly new – it has been running for a couple of years (and has never missed a serious case) but its strength is that we’re updating it every two weeks and it never forgets or regresses. We’re also learning from how people use it. For example, we have found that if the Symptom Checker reports cancer, patients are less likely to get themselves checked out; however, if we list it as “Potentially something serious” then they are far more likely to speak to their GP.

‘In your example of shingles, we think we have given the correct response, as it can be extremely painful and can lead to long-term pain, NICE says antivirals should be taken within 72 hours of onset and, of course, the patient can potentially be spreading chickenpox. Our service can actually offer GP appointments within a few hours and it can really easily be tested through video, so for us to suggest that to users makes perfect sense. Similarly, if a patient is anxious, they can press a button and speak to a GP, or ask a question, day or night.’