People increasingly rely on artificial intelligence (AI) for medical diagnoses because of how quickly and efficiently these tools can spot anomalies and warning signs in medical histories, X-rays and other datasets before they become obvious to the naked eye. But a new study published Dec. 20, 2024, in the BMJ raises concerns that AI technologies such as large language models (LLMs) and chatbots, like people, show signs of deteriorating cognitive abilities as they age.

“These findings challenge the assumption that artificial intelligence will soon replace human doctors,” the study’s authors wrote in the paper, “as the cognitive impairment evident in leading chatbots may affect their reliability in medical diagnostics and undermine patients’ confidence.”

Scientists tested publicly available LLM-driven chatbots including OpenAI’s ChatGPT, Anthropic’s Claude Sonnet and Alphabet’s Gemini using the Montreal Cognitive Assessment (MoCA) test — a series of tasks neurologists use to assess abilities in attention, memory, language, spatial skills and executive function.

MoCA is most commonly used to assess or test for the onset of cognitive impairment in conditions like Alzheimer’s disease or dementia. Subjects are given tasks like drawing a specific time on a clock face, starting at 100 and repeatedly subtracting seven, remembering as many words as possible from a spoken list, and so on. In humans, 26 out of 30 is considered a passing score (i.e., the subject has no cognitive impairment).


While some aspects of testing like naming, attention, language and abstraction were seemingly easy for most of the LLMs used, they all performed poorly in visual/spatial skills and executive tasks, with several doing worse than others in areas like delayed recall.

Crucially, while the most recent version of ChatGPT (version 4) scored the highest (26 out of 30), the older Gemini 1.0 LLM scored only 16 — leading to the conclusion that older LLMs show signs of cognitive decline.

The study’s authors note that their findings are observational only — critical differences between the ways in which AI and the human mind work mean the experiment cannot constitute a direct comparison. But they caution it might point to what they call a “significant area of weakness” that could put the brakes on the deployment of AI in clinical medicine. Specifically, they argued against using AI for tasks requiring visual abstraction and executive function.

It also raises the somewhat amusing notion of human neurologists taking on a whole new market — AIs themselves that present with signs of cognitive impairment.
