DeepMind’s new AI system, AlphaProof, represents a major breakthrough in mathematical reasoning. Unlike typical AI models that simply predict the next word or token, AlphaProof is built to construct rigorous mathematical proofs. It operates within a formal proof assistant called Lean, where mathematical statements are expressed in a strict formal language, so that every logical step can be mechanically checked for correctness.
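To give a feel for what this looks like, here is a small toy example in Lean 4 (our own illustration, not a proof produced by AlphaProof): the statement is written in Lean’s formal language, and the proof assistant verifies each tactic step before accepting it.

```lean
-- Toy illustration of Lean proof checking (not AlphaProof's output).
-- Each step is verified by the proof assistant; a wrong step is rejected.
theorem add_zero_comm (a b : Nat) : (a + 0) + b = b + a := by
  rw [Nat.add_zero]       -- step 1: rewrite a + 0 to a using a library lemma
  exact Nat.add_comm a b  -- step 2: close the goal with the commutativity lemma
```

If either step were wrong, for instance citing a lemma that does not match the goal, Lean would reject the proof rather than accept it on faith.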
To train AlphaProof, DeepMind had to overcome a key challenge: the scarcity of formal mathematical data. Most mathematics is written in natural language, not in the formal syntax that proof assistants require. To bridge this gap, they used a Gemini-based model to automatically translate millions of natural-language math problems into formal Lean statements, building a large library of formalized problems for training.
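As a concrete, hypothetical example of such a translation (illustrative only, not drawn from DeepMind’s dataset), an informal problem like “show that the sum of two even natural numbers is even” might be auto-formalized into a Lean statement such as the one below. The translation model only needs to produce the statement; finding a proof is the prover’s job, so the body is left open here.

```lean
/- Hypothetical auto-formalization (illustrative, not from DeepMind's dataset):
   informal problem: "Show that the sum of two even natural numbers is even."
   Only the formal statement is needed for the training library;
   the proof itself is left to be discovered by the prover. -/
theorem sum_of_two_evens_is_even (m n : Nat)
    (hm : ∃ k, m = 2 * k) (hn : ∃ k, n = 2 * k) :
    ∃ k, m + n = 2 * k := by
  sorry
```

Statements like this give the system a large pool of well-posed problems on which candidate proofs can be attempted and then checked by Lean.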
AlphaProof was tested on problems from the International Mathematical Olympiad (IMO). It could not work under competition time limits like a human contestant: some problems were solved within minutes, but the hardest took days of computation on large compute clusters. Even so, it solved several of the problems, including the competition's hardest. Geometry problems were handed off to a specialized companion system, AlphaGeometry 2.
Still, AlphaProof has limitations. Running it is expensive, on the order of hundreds of TPU-days per problem, which confines its use to well-funded labs. DeepMind acknowledges this and aims to make the system efficient enough for broader, long-term use, with the eventual goal of helping researchers with advanced, research-level mathematics.