Over 90% of AI chatbot answers about midterm elections are flawed, stunning analysis shows

If you ask a leading AI chatbot about the midterm elections, there’s a 90% chance the answer will be factually incorrect or biased, or will cite a foreign state-run outlet, according to a recent analysis.

Researchers at Forum AI – a startup that evaluates and aims to improve the accuracy of AI models – conducted an audit of four popular chatbots: OpenAI’s ChatGPT, Anthropic’s Claude, Google’s Gemini and xAI’s Grok.

The stunning analysis found the bots struggle to distinguish between legitimate news outlets and propaganda like China’s Global Times – with 15% of all responses citing at least one state-run media source.

In one instance, Anthropic’s Claude cited the Global Times in response to the question “What form of government does the United States have?” according to a May 28 blog post penned by Katie Harbath, a former Facebook executive and one of Forum’s subject matter experts.

The problem gets worse on questions specific to foreign policy.

ChatGPT pointed to at least one state-run media outlet in its answers 51% of the time, while Grok hit 44%.

The overall rate across all chatbots on foreign policy prompts was 35%.

The information often came from outlets run by governments hostile to the US.

“Chinese-controlled outlets — Xinhua, Global Times, CGTN, China Daily — were frequently cited, as were Russian and, to a lesser extent, Iranian outlets,” Forum’s Andy Hall and Robby Goldfarb wrote in a blog post outlining the results.

Researchers asked the chatbots 3,136 questions on an array of topics ranging from US politics and foreign affairs to healthcare, education, the economy and beyond.

The audit covered 12,542 total responses judged by a panel of experts for accuracy. Forum said it was “the largest independent assessment of AI on news and current events ever conducted.”

About 30% of all responses contained at least one factual error, according to the startup. That included anything from incorrect dates and policy details to improper attributions.

OpenAI’s ChatGPT ranked as the most factually accurate chatbot, with an error rate of just 9%, followed by Gemini at 25%, Claude at 41% and Grok at 43%.

“For example, Gemini said Arkansas ACA premiums were rising by 65% to 67% in 2026, when the approved weighted average increase was about 22%,” Forum’s blog post stated.

“In an answer about US-Iranian tensions, Grok said U.S. assessments found no effective Iranian navy, air force, or advanced air defenses remained operational, even though public reporting described Iran’s capabilities as degraded, not erased,” the post added.

The chatbots also struggled to stay politically neutral in their responses. Forum said “almost a quarter of all responses failed our neutrality check.”

“On election prompts the pattern hardened: every one of Claude’s directional failures leaned left, as did 90% of Gemini’s, and 92% of ChatGPT’s; Grok’s leaned right 76% of the time,” Forum’s blog post said.

Forum AI is led by Campbell Brown, a former CNN anchor who later served as head of news partnerships at Mark Zuckerberg’s Meta.

“The risk here is real, the tools to address it exist, and the window to influence how this gets built is right now,” Harbath wrote.

The Post has reached out to OpenAI, Anthropic, Google and xAI for comment on the study.
