In brief
- Research from Oxford University points to AI chatbots giving dangerous medical advice to users.
- While chatbots score highly on standardised tests of medical knowledge, they fall down in personal scenarios, the study found.
- Researchers found that LLMs were no better than traditional methods for making medical decisions.
AI chatbots are fighting to become the next big thing in healthcare, acing standardised tests and offering advice for your medical woes. But a new study published in Nature Medicine has shown that they aren't just a long way from achieving this; they could in fact be dangerous.
The study, led by multiple teams from Oxford University, identified a noticeable gap in large language models (LLMs): while the models hold highly advanced medical knowledge, they fell short when it came to helping users with personal medical problems, researchers found.
“Despite all the hype, AI just isn’t ready to take on the role of the physician,” Dr Rebecca Payne, the lead medical practitioner on the study, said in a press release announcing its findings. She added: “Patients need to be aware that asking a large language model about their symptoms can be dangerous, giving wrong diagnoses and failing to recognise when urgent help is needed.”
The study saw 1,300 participants use AI models from OpenAI, Meta and Cohere to identify health conditions. Working through a series of scenarios developed by doctors, participants asked the AI systems what they should do next to deal with their medical issue.
The study found that the chatbots performed no better than traditional methods of self-diagnosis, such as a simple online search or even personal judgment.
The researchers also found a disconnect on the user side: participants were unsure what information the LLMs needed in order to offer accurate advice, and they received a mix of good and poor recommendations that made it hard to identify next steps.
Decrypt has reached out to OpenAI, Meta and Cohere for comment, and will update this article should they respond.
“As a physician, there is far more to reaching the right diagnosis than simply recalling facts. Medicine is an art as well as a science. Listening, probing, clarifying, checking understanding, and guiding the conversation are essential,” Payne told Decrypt.
“Doctors actively elicit relevant symptoms because patients often don’t know which details matter,” she explained, adding that the study showed LLMs are “not yet reliably able to manage that dynamic interaction with non-experts.”
The team concluded that AI is simply not fit to offer medical advice right now, and that new assessment systems are needed if it is ever to be used properly in healthcare. However, that doesn't mean LLMs have no place in the medical field as it stands.
While LLMs “definitely have a role in healthcare,” Payne said, it should be as “secretary, not physician.” The technology has benefits in terms of “summarizing and repackaging information already given to them,” with LLMs already being used in clinic rooms to “transcribe consultations and repackage that info as a letter to a specialist, information sheet for the patient or for the medical records,” she explained.
While the researchers aren't against AI in healthcare, they hope the study can be used to steer it in the right direction.