New research from the Oxford Internet Institute indicates that AI chatbots trained to be extra warm, friendly, and empathetic can also become less reliable, according to the BBC.
The researchers analyzed more than 400,000 responses generated by five AI models developed by Meta, Mistral AI, Alibaba, and OpenAI. The results showed that the “kinder” versions more often gave incorrect answers, reinforced users’ misconceptions, and avoided stating uncomfortable truths.
For example, a friendlier model might respond to conspiracy theories about the moon landing with cautious equivocation rather than stating plainly that they are false.
On average, the rate of incorrect answers rose by about 7.43 percentage points when the models were tuned to sound warmer. Cooler, more direct models made fewer mistakes. According to the researchers, the models make the same trade-off humans do: they sometimes prioritize being perceived as pleasant over being direct.