AI systems begin solving historic Erdős mathematical problems

New Capabilities
By Newzino Staff

Automated theorem provers crack open problems posed by the legendary mathematician decades ago

January 11th, 2026: AI Solutions Documented Across Multiple Problem Categories

Overview

For the first time, AI systems are independently solving mathematical problems that stumped human researchers for decades. Since Christmas 2025, 15 problems from the legendary mathematician Paul Erdős's collection have been moved from 'open' to 'solved'—and 11 of those solutions specifically credited AI models. On January 6, 2026, a combination of OpenAI's GPT-5.2 Pro and Harmonic's Aristotle theorem prover produced the first fully autonomous AI solution to an Erdős problem that hadn't already been solved in the existing literature.

The achievement marks a shift in what automated systems can contribute to mathematical research. These aren't competition problems with known solutions—they're genuine open questions that working mathematicians had failed to crack. Yet experts urge caution: Fields Medalist Terence Tao, who personally verified several of the AI-generated proofs, emphasizes these represent the 'lowest hanging fruit' from Erdős's collection—problems solvable with known techniques that simply hadn't received enough attention. The harder problems, requiring genuinely novel insights, remain out of reach.

Key Indicators

15
Erdős Problems Solved Since Christmas
Problems moved from 'open' to 'solved' on the official Erdős Problems website between December 25, 2025 and January 11, 2026
11
Solutions Crediting AI
Of the 15 recently solved problems, 11 specifically credited AI models as involved in the solution process
3
Fully Autonomous AI Solutions
Problems fully solved by AI without prior human solutions in the literature: #728, #729, and #205
5/6
IMO 2025 Problems Solved
Harmonic's Aristotle achieved gold-medal performance at the 2025 International Mathematical Olympiad with formally verified proofs

People Involved

Terence Tao
Fields Medalist, UCLA Mathematics Professor (Actively coordinating AI-assisted mathematics research through erdosproblems.com)
Kevin Barreto
Cambridge University Mathematics Student (Pioneering amateur use of AI theorem provers)
Paul Erdős
Hungarian Mathematician (1913-1996) (Deceased; legacy problems continue to drive research)

Organizations Involved

Harmonic
AI Research Company
Status: Developer of Aristotle theorem prover, central to recent breakthroughs

AI startup developing Aristotle, a theorem prover that achieved gold-medal performance at the 2025 International Mathematical Olympiad with formally verified proofs in Lean.

Google DeepMind
AI Research Laboratory
Status: Developed AlphaProof and AlphaEvolve systems contributing to mathematical discovery

Google's primary AI research division, which achieved silver-medal performance at IMO 2024 with AlphaProof and developed AlphaEvolve for mathematical exploration.

OpenAI
Artificial Intelligence Company (Public Benefit Corporation)
Status: GPT-5.2 Pro central to multiple Erdős problem solutions

Developer of the GPT model series; GPT-5.2 Pro, released in December 2025, has demonstrated strong mathematical reasoning capabilities.

Timeline

  1. AI Solutions Documented Across Multiple Problem Categories

    Documentation

    Terence Tao updates the erdosproblems.com wiki with comprehensive tracking of AI contributions across 10 categories, including solutions, partial results, literature reviews, and formalizations.

  2. GPT-5.2 Pro Solves Problems #729 and #205

    Solution

    GPT-5.2 Pro combined with Aristotle produces full Lean-verified solutions to Erdős Problems #729 and #205, extending the autonomous solution approach.

  3. AlphaProof Proves Variant of Problem #477

    Solution

    DeepMind's AlphaProof produces a Lean-verified proof of a variant of Erdős Problem #477.

  4. First Fully Autonomous AI Solution to Open Erdős Problem

    Breakthrough

    Kevin Barreto uses GPT-5.2 Pro and Aristotle to solve Erdős Problem #728 autonomously. Unlike previous solutions, no prior human proof exists in the literature. Terence Tao verifies the result.

  5. Claude Opus 4.5 Upgrades Partial Result to Full Solution

    Solution

    Anthropic's Claude Opus 4.5 and Gemini 3 Pro upgrade an existing partial result on Erdős Problem #871 to a full solution with Lean verification.

  6. Pace of AI-Assisted Solutions Accelerates

    Milestone

    Beginning Christmas Day, AI-assisted solutions to Erdős problems accelerate dramatically, with 15 problems moving to solved status over the following three weeks.

  7. OpenAI Releases GPT-5.2

    Technical

    OpenAI releases the GPT-5.2 family, including a Pro variant with enhanced mathematical reasoning that will prove central to subsequent Erdős breakthroughs.

  8. Aristotle Autonomously Solves Problem #1026

    Solution

    Boris Alexeev uses Aristotle to autonomously solve Erdős Problem #1026 in Lean. The next day, a prior human solution from 2016 is discovered in the literature.

  9. Aristotle Achieves First Erdős Partial Result

    Breakthrough

    Harmonic's Aristotle produces a partial solution to Erdős Problem #124, inspiring amateur mathematicians to systematically apply AI to the problem collection.

  10. First Human-AI Collaborative Erdős Solution

    Breakthrough

    Boris Alexeev, Wouter van Doorn, and Terence Tao collaborate with Gemini DeepThink and Aristotle to achieve a partial solution to Erdős Problem #367.

  11. AlphaEvolve Applied to Erdős Problems

    Research

    DeepMind's AlphaEvolve system is tested on multiple Erdős problems, achieving slight improvements on some constructions but not matching past results on others.

  12. Harmonic Publishes Aristotle IMO-Level Theorem Prover

    Technical

    AI startup Harmonic releases a paper on Aristotle, a theorem prover achieving gold-medal-equivalent performance on 2025 IMO problems with Lean-verified proofs.

  13. DeepMind's AlphaProof Achieves IMO Silver Medal Standard

    Milestone

    Google DeepMind announces that AlphaProof and AlphaGeometry solved 4 of 6 International Mathematical Olympiad problems, making them the first AI systems to reach silver-medal level with formally verified proofs.

  14. Paul Erdős Dies, Leaving Hundreds of Open Problems

    Background

    Hungarian mathematician Paul Erdős dies at 83, leaving a legacy of over 1,500 papers and hundreds of unsolved problems with cash bounties attached.

Scenarios

1. AI Solves Major Open Problem, Transforms Mathematical Research

Discussed by: MIT Technology Review, Nature commentary; optimistic AI researchers

Within 12-24 months, AI systems solve one of the higher-bounty Erdős problems or another famous conjecture, demonstrating an ability to generate genuinely novel mathematical insights. This would trigger massive investment in AI-for-math tools and fundamentally change how research mathematics is conducted, with AI becoming a standard collaborator on major papers.

2. Progress Plateaus at 'Low-Hanging Fruit'

Discussed by: Terence Tao, Fields Medalist Martin Hairer, Harvard mathematics faculty

AI systems continue solving problems from the accessible tail of Erdős's collection—those solvable with existing techniques that simply hadn't received attention—but make no progress on the core difficult problems requiring novel mathematical insights. The gap between solving competition problems and research problems proves larger than anticipated. AI becomes useful for routine tasks but doesn't transform discovery.

3. Verification Crisis Undermines Trust

Discussed by: Mathematics community forums, Lean proof assistant developers

A high-profile AI solution is later discovered to contain subtle errors that formal verification missed, either because the problem statement was misformalized or because the proof relied on unstated axioms. The incident causes mathematicians to question all AI-generated results and require exhaustive human review, significantly slowing the pace of AI-assisted mathematics.

4. Human-AI Collaboration Becomes Standard Practice

Discussed by: Erdős problems community, AI for Math Initiative partners

Rather than autonomous solving, the dominant mode becomes tight human-AI collaboration: AI handles literature review, numerical exploration, and proof formalization while humans provide problem selection, creative insights, and strategic direction. This hybrid approach produces more results than either alone, and becomes the norm for mathematical research within five years.
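
The verification risk described in the third scenario can be made concrete with a small Lean 4 sketch. It is illustrative only: the statements and the names `vacuousClaim`, `convenientAxiom`, and `bogus` are hypothetical and are not drawn from any of the Erdős proofs discussed here. Each item type-checks, yet none says what a casual reader might assume it says.

```lean
-- Illustrative only: none of these statements come from the Erdős proofs.

-- 1. Vacuous hypothesis. The hypothesis `n < 0` can never hold for a
--    natural number, so the "theorem" is true for an empty reason.
theorem vacuousClaim (n : Nat) (h : n < 0) : n = 42 :=
  absurd h (Nat.not_lt_zero n)

-- 2. Truncated subtraction. On `Nat`, subtraction stops at zero, so a
--    statement written with `-` may not mean what the informal text means.
example : (2 : Nat) - 5 = 0 := rfl

-- 3. Extra axioms. A declared axiom can make a false statement "provable".
axiom convenientAxiom : ∀ n : Nat, n = 0

theorem bogus : (1 : Nat) = 0 := convenientAxiom 1

-- Asking Lean which axioms a theorem depends on exposes the problem:
-- the report for `bogus` includes `convenientAxiom`.
#print axioms bogus
```

In each case the proof checker is doing its job correctly. The gap lies between the formal statement and the intended mathematics, which is why a human still has to read the statement itself and the axiom report.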

Historical Context

Four Color Theorem Computer Proof (1976)

June 1976

What Happened

Kenneth Appel and Wolfgang Haken at the University of Illinois announced they had proved the four color theorem—that any map can be colored with just four colors so no adjacent regions share a color. Their proof required over 1,000 hours of computer time to verify 1,936 special cases. It was the first major mathematical theorem proved with substantial computer assistance.

Outcome

Short Term

The mathematical community reacted with 'equal parts celebration and dismay.' Many mathematicians refused to accept a proof humans couldn't verify by hand. Mathematician William Tutte celebrated that they 'smote the kraken' while others despaired at computers 'encroaching on human ingenuity.'

Long Term

The proof gained grudging acceptance and established computer-assisted proof as legitimate, if controversial. It foreshadowed today's debates about AI-generated proofs and what constitutes mathematical understanding versus mere verification.

Why It's Relevant Today

The 1976 controversy over computer proofs mirrors today's debates about AI theorem provers. The key difference: modern formal verification in Lean provides stronger guarantees than 1976's case-checking, but questions about mathematical insight versus brute-force verification persist.

AlphaGo Defeats Lee Sedol (2016)

March 2016

What Happened

DeepMind's AlphaGo defeated world Go champion Lee Sedol 4-1 in a match watched by 200 million people. Game 2's 'Move 37', a play that seemed wrong to expert commentators but proved decisive, demonstrated AI could find strategies humans had never considered in a game with more possible positions than atoms in the observable universe.

Outcome

Short Term

Lee Sedol described feeling 'speechless' and the match triggered intense global interest in AI capabilities. Top Go players began studying AI moves to improve their own play.

Long Term

AlphaGo's success demonstrated reinforcement learning could master complex domains, directly inspiring the AlphaProof architecture now being used for theorem proving. The paradigm of AI finding solutions humans couldn't see is now being tested in mathematics.

Why It's Relevant Today

Just as AlphaGo found moves that looked wrong but proved correct, AI theorem provers are now finding proof paths mathematicians hadn't explored. The question is whether mathematical insight can emerge from pattern-matching at scale, as game-playing strategy did.

Kepler Conjecture Formal Verification (2014)

August 2014

What Happened

Thomas Hales announced completion of the Flyspeck project, a 12-year effort to formally verify his 1998 computer-assisted proof of the Kepler conjecture (about optimal sphere packing). The original proof had been accepted only with 99% confidence because referees couldn't verify all computational components; Flyspeck provided complete formal verification in the Isabelle and HOL Light proof assistants.

Outcome

Short Term

The verification resolved lingering doubts about the proof's correctness and demonstrated that major mathematical results could be machine-verified end-to-end.

Long Term

Flyspeck helped pioneer the end-to-end formal verification approach that systems such as Aristotle and AlphaProof now build on. It established that interactive proof assistants, the family of tools that includes Lean, could provide ironclad guarantees for complex mathematical arguments.

Why It's Relevant Today

The Kepler verification took 12 years of human effort. Today's AI systems can formalize proofs in hours or days. This acceleration is what makes the current moment transformative—not just that AI can find proofs, but that it can verify them at scale.
