Whether you use ChatGPT, Grok, Google Gemini, or even Mistral, if you rely on artificial intelligence you have almost certainly received fanciful, inventive, or even… outright false answers.
The most attentive users will have noticed the message displayed under each answer stating that “ChatGPT can make mistakes” and that it is even “recommended to check important information”.
A study explains AI hallucinations
While these AI ramblings, which artificial intelligence specialists call “hallucinations”, are no longer as frequent as they were when ChatGPT launched in 2022, they had until now remained largely unexplained to the general public. In 2024, an initial study from Stanford University had already shown that hallucinations affect around 10% of requests.
More recently, however, a study published on Hugging Face this Wednesday and conducted by Giskard, a French company specializing in AI analysis and testing, has shed more light on the reasons behind these hallucinations…
Beware of short questions!
Among the findings of this study, short questions are the first to be singled out. Because they are too imprecise, sometimes ambiguous, and stripped of any context, they can completely throw off an artificial intelligence.
According to the researchers, this is all the more problematic given that some models increasingly favor short answers in order to limit costs.
Models more prone to hallucinations
Because Giskard ran its study across several models, it also revealed that some models are more prone to hallucinations than others. And while one might assume that only the least advanced models are affected, this is unfortunately not the case.
Indeed, the study shows that GPT-4o, currently the default OpenAI model on ChatGPT, is precisely one of the models most prone to hallucinating. With this tool used by one in ten people worldwide, according to Sam Altman at a recent TED conference, the finding is all the more worrying.
Among the other models identified, Giskard also mentions Mistral Large as well as Anthropic's Claude 3.7 Sonnet.
A user experience at the expense of reality?
While the phenomenon of hallucinations is nothing new in artificial intelligence, this study highlights a fact that should not be taken lightly.
Indeed, with ever more massive usage and ever higher costs, companies like OpenAI and Anthropic face trade-offs in their user experience. While short answers make their tools simpler to use and cheaper to run, they could also encourage misinformation in the future…