OpenAI’s new reasoning AI models hallucinate more

Usama
Last updated: April 19, 2025 4:18 pm
5 Min Read

OpenAI’s newest AI models, o3 and o4-mini, are being hailed as cutting-edge advancements in artificial intelligence. Designed specifically for enhanced reasoning tasks, these models outperform their predecessors in many key areas, including math, coding, and logical problem-solving. However, beneath their impressive capabilities lies a persistent and growing problem—they hallucinate more than earlier models.

Contents
  • Hallucinations: A Growing Concern in AI
  • Stronger Performance, But At What Cost?
  • When Models Make Things Up
  • Real-World Impacts of Hallucinations
  • A Possible Solution? Web Search Integration
  • The Road Ahead

Hallucinations: A Growing Concern in AI

In the AI world, a “hallucination” refers to a model generating false or misleading information that it presents as factual. Despite years of improvements, hallucinations remain one of the most difficult challenges in AI development. Traditionally, each generation of models has seen gradual progress in reducing hallucinations. But the release of o3 and o4-mini appears to buck that trend.

Internal tests conducted by OpenAI show that both o3 and o4-mini hallucinate more frequently not only than their predecessors—o1, o1-mini, and o3-mini—but also than OpenAI’s more general-purpose models like GPT-4o. And perhaps more troubling: OpenAI isn’t entirely sure why.

Stronger Performance, But At What Cost?

In its technical documentation, OpenAI acknowledges this unexpected setback, stating that “more research is needed” to fully understand why these advanced reasoning models are hallucinating more often. The models are undoubtedly stronger in certain domains. They excel at coding tasks, complex math problems, and intricate reasoning exercises. However, their tendency to generate more output overall may be leading to a rise in both correct and incorrect claims.

For example, in one of OpenAI’s benchmarks known as PersonQA—which evaluates a model’s accuracy when answering questions about individuals—o3 hallucinated on 33% of questions. That’s more than double the hallucination rates of o1 (16%) and o3-mini (14.8%). The smaller o4-mini fared even worse, with a troubling 48% hallucination rate on the same benchmark.
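OpenAI hasn’t published PersonQA’s scoring code, but the metric itself is straightforward: the share of answers a grader flags as containing fabricated claims. The sketch below illustrates the idea in Python; `judge_is_hallucinated` is a hypothetical stand-in for whatever grader OpenAI actually uses.

```python
# Minimal sketch of how a benchmark hallucination rate is computed.
# PersonQA's real scoring pipeline is not public; `judge_is_hallucinated`
# is a hypothetical placeholder for the actual grader.

def judge_is_hallucinated(answer: str, reference: str) -> bool:
    # Placeholder: a real judge would compare the answer's claims
    # against reference facts, often using a separate grader model.
    return answer.strip().lower() != reference.strip().lower()

def hallucination_rate(answers: list[str], references: list[str]) -> float:
    """Fraction of answers flagged as hallucinated, e.g. 0.33 for o3."""
    flagged = sum(
        judge_is_hallucinated(a, r) for a, r in zip(answers, references)
    )
    return flagged / len(answers)

if __name__ == "__main__":
    answers = ["Born in 1967 in Ohio.", "She founded the company in 2003."]
    references = ["Born in 1967 in Ohio.", "She joined the company in 2010."]
    print(f"{hallucination_rate(answers, references):.0%}")  # -> 50%
```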

When Models Make Things Up

Independent researchers have also begun scrutinizing these models, uncovering further evidence of their unpredictability. In some instances, o3 has fabricated detailed processes, such as claiming it ran code on a physical 2021 MacBook Pro “outside of ChatGPT,” then copied results into its response. While these kinds of imaginative claims might make AI sound smarter or more human-like, they’re entirely false. The model simply can’t perform such actions.

Some experts speculate that the issue might stem from the reinforcement learning techniques used in training the o-series models. These techniques are designed to enhance reasoning and decision-making but may unintentionally amplify behaviors typically mitigated during post-training fine-tuning.

Real-World Impacts of Hallucinations

While hallucinations may occasionally produce creative or novel ideas, their presence can be problematic—especially in contexts where factual accuracy is critical. In legal, medical, academic, and enterprise settings, even a single incorrect statement could lead to serious consequences.

One example of practical frustration comes from coding professionals who’ve adopted o3 into their workflows. While the model is generally more capable, it has been observed to generate broken or imaginary links, which can disrupt productivity and erode trust in its reliability.
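Teams hit by this failure mode can at least catch dead links automatically before they reach users. The snippet below is an illustrative sketch using only Python’s standard library; it is not a feature of o3 or of any OpenAI tooling.

```python
# Illustrative sketch: flag broken or fabricated links in model output
# before trusting them. Uses only the Python standard library.
import re
import urllib.error
import urllib.request

URL_PATTERN = re.compile(r"https?://[^\s)\"'>]+")

def check_links(text: str, timeout: float = 5.0) -> dict[str, bool]:
    """Return {url: reachable} for every URL found in `text`."""
    results = {}
    for url in URL_PATTERN.findall(text):
        req = urllib.request.Request(url, method="HEAD")
        try:
            with urllib.request.urlopen(req, timeout=timeout) as resp:
                results[url] = resp.status < 400
        except (urllib.error.URLError, ValueError):
            # Unreachable or malformed URL: likely hallucinated.
            results[url] = False
    return results

if __name__ == "__main__":
    sample = "See https://example.com and https://not-a-real-site.invalid/docs"
    for url, ok in check_links(sample).items():
        print("OK " if ok else "BAD", url)
```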

A Possible Solution? Web Search Integration

One potential way to reduce hallucinations is to give AI models access to real-time web search capabilities. OpenAI’s GPT-4o, when paired with web browsing, achieves 90% accuracy on SimpleQA, another internal benchmark that measures factual correctness. By grounding model outputs in verified, up-to-date information from the internet, hallucination rates could be reduced—though this comes with trade-offs such as exposing user prompts to third-party services.
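The pattern at work here is simple retrieval grounding: fetch fresh sources first, then constrain the model to answer only from them. Below is a minimal sketch of that pattern, assuming a hypothetical `web_search` helper; the completion call uses OpenAI’s published Python SDK, but any chat model would do.

```python
# Sketch of retrieval-grounded answering, the pattern behind
# search-assisted factual accuracy. `web_search` is a hypothetical
# helper; the completion call uses the OpenAI v1 Python SDK.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def web_search(query: str, k: int = 3) -> list[str]:
    """Hypothetical: return text snippets from the top-k search hits."""
    raise NotImplementedError("plug in your search provider here")

def grounded_answer(question: str) -> str:
    snippets = web_search(question)
    sources = "\n\n".join(f"[{i + 1}] {s}" for i, s in enumerate(snippets))
    prompt = (
        "Answer using ONLY the numbered sources below. "
        "If they don't contain the answer, say you don't know.\n\n"
        f"Sources:\n{sources}\n\nQuestion: {question}"
    )
    resp = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content
```

Constraining the prompt to the retrieved snippets is what drives the accuracy gain: the model is asked to cite or refuse rather than improvise, at the cost of sending user queries to an external search provider.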

The Road Ahead

As the AI industry pivots increasingly toward reasoning models, the hallucination problem becomes more pressing. These models show immense promise—offering higher-level cognitive abilities and stronger performance without necessarily increasing training data or compute costs. But if enhanced reasoning consistently brings with it a spike in hallucinations, developers will need to act fast to find a balance between intelligence and accuracy.

OpenAI has confirmed that addressing hallucinations remains a top priority, and ongoing research is underway to tackle the issue across all current and future models. Until then, users—especially in high-stakes environments—must tread carefully, verifying AI-generated content where possible.
