ChatGPT may produce more empathetic responses than doctors to questions from patients, a new study has claimed.
A team of healthcare professionals read 195 responses to patient questions from social media forum Reddit and rated the answers from the artificial intelligence chatbot more highly than those provided by doctors.
The study’s authors from the University of California concluded in JAMA Internal Medicine that further research should be done on whether AI assistants can help doctors in workflow and drafting responses, which may help reduce clinician burnout.
The healthcare professionals comparing the responses – without being told which was real and which generated by ChatGPT – were asked to look at both the quality of information and bedside manner.
They preferred the ChatGPT responses in 79% of cases and overall rated them as of significantly higher quality.
The prevalence of chatbot responses graded as ‘empathetic’ or ‘very empathetic’ was almost 10 times higher than that of the doctor-written responses, the researchers reported.
The authors said: ‘While this cross-sectional study has demonstrated promising results in the use of AI assistants for patient questions, it is crucial to note that further research is necessary before any definitive conclusions can be made regarding their potential effect in clinical settings.’
‘Studying the addition of AI assistants to patient messaging workflows holds promise with the potential to improve both clinician and patient outcomes,’ they added.
Responses from ChatGPT were on average four times longer than the doctors’, which, experts said, suggests that in practice those evaluating them would have been able to guess which was which.
It was also not a like-for-like comparison, as the clinicians were responding on a public forum, which may have affected how empathetic their responses were, Professor James Davenport, professor of information technology at the University of Bath, pointed out.
But he said it does raise legitimate questions about whether and how ChatGPT could ‘assist physicians in response generation’.
Professor Mirella Lapata, professor of natural language processing at the University of Edinburgh, said it was not surprising that healthcare professionals preferred ChatGPT to physician responses.
She said: ‘ChatGPT is more empathetic and overall chattier. Without controlling for the length of the response, we cannot know for sure whether the raters judged for style (eg, verbose and flowery discourse) rather than content.’
Professor Anthony Cohn, professor of automated reasoning at the University of Leeds, said: ‘Given ChatGPT’s well known abilities to “write in the style of”, it is not surprising that a chatbot is able to write text that is generally regarded as empathetic.’
He added that the authors are careful to note that a chatbot should only be used as a tool to draft a response to a patient query: given the propensity of this type of technology to invent ‘facts’ and hallucinate, it would be dangerous to rely on any factual information it provides.
‘It is essential that any responses are carefully checked by a medical professional. However, humans have been shown to overly trust machine responses, particularly when they are often right, and a human may not always be sufficiently vigilant to properly check a chatbot’s response; this would need guarding against.’