When artificial intelligence (AI) is allowed to behave more like a human communicator, it becomes a more effective debate partner that reaches more accurate conclusions, scientists have found.
Human communication is full of stops and starts, impassioned interruptions, unsure silences and ambiguity. AI, on the other hand, adheres to the formal communication style of computers — processing a command, formulating a response, delivering the output, and waiting patiently for the next command.
Sei and his co-workers proposed a framework where large language models (LLMs) didn’t have to adhere to the back-and-forth, wait-your-turn nature of computerized communication. Instead, an LLM could be assigned a personality that let it speak out of turn, cut off other speakers, or remain silent.
Beyond making AI communication more humanlike, the researchers found that this flexibility produced higher accuracy on complex tasks than standard LLMs achieved.
A host of personalities
The team started by giving LLMs traits based on the "big five" personality model from psychology — openness, conscientiousness, extraversion, agreeableness and neuroticism.
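As a rough illustration of how trait levels might be attached to an agent, the sketch below renders big-five scores into a system-prompt fragment. The prompt wording, the 0-to-1 scale, and the function name are assumptions for illustration, not the study's actual code.

```python
# Hypothetical sketch: encoding big-five trait levels as a system prompt
# for an LLM discussion agent. Scale and wording are illustrative only.

BIG_FIVE = ("openness", "conscientiousness", "extraversion",
            "agreeableness", "neuroticism")

def persona_prompt(levels: dict[str, float]) -> str:
    """Render trait levels (0.0 = low, 1.0 = high) into a prompt fragment.

    Traits not listed in `levels` default to a neutral 0.5.
    """
    unknown = set(levels) - set(BIG_FIVE)
    if unknown:
        raise ValueError(f"not big-five traits: {unknown}")
    lines = [f"- {trait}: {levels.get(trait, 0.5):.1f} (0 = low, 1 = high)"
             for trait in BIG_FIVE]
    return ("You are a discussion agent with this personality profile:\n"
            + "\n".join(lines))

print(persona_prompt({"extraversion": 0.9, "agreeableness": 0.2}))
```

A prompt like this would then be prepended to each agent's context, so two agents answering the same question argue in recognizably different styles.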
The next step was to reprogram text-based LLMs to process responses sentence by sentence, rather than waiting for each full response before the next began, which allowed the researchers to carefully control the flow of discussion. They also compared results across three conversational settings — fixed speaking order, dynamic speaking order, and dynamic speaking order with interruption enabled. The last setting let each model compute an "urgency score," enabling it to follow and react to the conversation in real time.
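The three settings can be sketched as different speaker-selection rules. Everything here — the function name, the urgency values, and the cut-in threshold — is a stand-in for illustration; in the study the urgency signal came from the LLMs themselves.

```python
# Illustrative sketch of the three discussion settings described above.
# Urgency values are mocked; real agents would be LLM calls.

def next_speaker(mode: str, agents: list[str], turn: int,
                 urgency: dict[str, float],
                 threshold: float = 0.8) -> str:
    """Pick the next speaker under one of three assumed rules."""
    if mode == "fixed":          # strict round-robin, wait-your-turn
        return agents[turn % len(agents)]
    most_urgent = max(agents, key=lambda a: urgency[a])
    if mode == "dynamic":        # whoever is most urgent speaks next
        return most_urgent
    if mode == "interrupt":      # round-robin, but an urgent agent may cut in
        if urgency[most_urgent] >= threshold:
            return most_urgent
        return agents[turn % len(agents)]
    raise ValueError(f"unknown mode: {mode}")
```

In the "interrupt" mode, the round-robin order still applies by default; an agent only seizes the floor when its urgency crosses the threshold.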
The urgency score shaped the conversation in several ways. If it spiked because the model spotted an error or a point it considered critical to the discussion, the model could raise the issue immediately, regardless of whose turn it was to speak. If the score was low, the model interpreted this as having nothing concrete to add and stayed silent, cutting down on conversational "clutter" — speech for its own sake.
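The spike-or-stay-silent behavior can be sketched as two thresholds applied sentence by sentence. The keyword heuristic below is purely a stand-in for the model's own judgment, and both threshold values are assumptions.

```python
# Minimal sketch of urgency-driven interruption, assuming a toy keyword
# heuristic in place of the LLM's own urgency estimate.

SPEAK_THRESHOLD = 0.8    # assumed: above this, cut in out of turn
SILENCE_THRESHOLD = 0.2  # assumed: below this, nothing concrete to add

def urgency_score(sentence: str) -> float:
    """Toy heuristic: spike on apparent errors or critical points."""
    score = 0.1
    text = sentence.lower()
    if "error" in text or "wrong" in text:
        score += 0.8
    if "critical" in text:
        score += 0.5
    return min(score, 1.0)

def decide(sentence: str, my_turn: bool) -> str:
    """Decide whether to interrupt, speak normally, or stay quiet."""
    u = urgency_score(sentence)
    if u >= SPEAK_THRESHOLD:
        return "interrupt"   # raise the point immediately
    if my_turn and u > SILENCE_THRESHOLD:
        return "speak"
    return "silent"          # reduces conversational clutter
```

Because the decision runs after every incoming sentence rather than after every full response, an agent can react mid-utterance — which is what the sentence-by-sentence reprogramming makes possible.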
Sei told Live Science that the team evaluated performance using 1,000 questions from the Massive Multitask Language Understanding (MMLU) benchmark — an AI reasoning test encompassing questions from different areas, including science and humanities.
“When one agent initially gave an incorrect answer, overall accuracy was 68.7% with fixed-order discussion, 73.8% with dynamic order, and 79.2% when interruption was allowed,” Sei said. “In a more difficult setting where two agents initially gave incorrect answers, accuracy was 37.2% with fixed order, 43.7% with dynamic order, and 49.5% with interruption enabled.”
Having shown that the personality-driven models were more accurate than traditional AI chatbots, Sei now wants to explore how the findings can be applied in practice. The team plans to test them in domains involving creative collaboration, to understand how "digital personalities" shape decision-making within a group.
“In the future, AI agents will increasingly interact with one another and with humans in collaborative settings,” said Sei. “Our findings suggest that discussions shaped by personality, including the ability to interrupt when necessary, may sometimes produce better outcomes than strictly turn-based and uniformly polite exchanges.”