OpenAI recently faced backlash over claims about GPT-5's math breakthroughs, after the company initially announced that the model had solved 10 previously unsolved Erdős problems and made progress on 11 others. Mathematician Thomas Bloom quickly disputed the claim, stating that GPT-5 had merely surfaced references to solutions already published elsewhere. The problems were listed as "open" on Bloom's website, but that meant only that he personally was not aware of a published solution, not that the problems were actually unsolved.
The incident drew sharp criticism from industry leaders, including Meta's Chief AI Scientist Yann LeCun, who remarked, "Hoisted by their own GPTards," and Google DeepMind CEO Demis Hassabis, who called it "embarrassing." OpenAI researcher Sébastien Bubeck later acknowledged that GPT-5 had only found solutions in the literature, but argued that searching the literature is itself a challenging task. Critics counter that retrieving existing results is not a genuine breakthrough, and the incident highlights the growing tension between hype and verification in AI research.
The controversy underscores the importance of human oversight, collaboration with domain experts, and scientific rigor in AI research. AI models like GPT-5 are highly capable at retrieving information, but they are not infallible researchers, and their output still needs to be checked by experts. The incident also raises questions about verification standards when AI companies announce mathematical discoveries, particularly given the billions of dollars at stake in the competitive AI landscape.
Despite the backlash, GPT-5's ability to navigate dense academic material and connect references scattered across different journals can still be valuable for researchers. OpenAI's mistake is a lesson in scientific humility and a reminder of the need for transparency, peer review, and reproducible evidence in AI research. As the AI industry continues to evolve, it is crucial to separate genuine breakthroughs from hype and to prioritize responsible communication.