The AI science rush

New Capabilities
By Newzino Staff

Large language models earn recognition for frontier research—and expose cracks in scientific publishing

January 22, 2026: NeurIPS Papers Contaminated by AI Hallucinations

Overview

Science magazine named large language models doing frontier science a runner-up breakthrough of 2025. Within weeks, the recognition looked prescient: OpenAI's GPT-5.2 solved previously open Erdős mathematics problems in 15 minutes of processing, reaching 40.3% accuracy on expert-level mathematics that had stumped earlier systems. DeepMind announced its first automated laboratory, opening in the UK in 2026, pairing Gemini with robotics to synthesize hundreds of materials daily. Google partnered with the U.S. Department of Energy on Genesis, a national AI-for-science platform mobilizing 17 national laboratories.

Key Indicators

40.3%
GPT-5.2 expert math accuracy
OpenAI's GPT-5.2 Thinking solved 40.3% of FrontierMath problems that had been unsolvable by earlier AI systems, and helped solve 11 open Erdős problems
21%
ICLR 2026 reviews fully AI-generated
Analysis found 15,899 of 75,800 peer reviews at major AI conference were entirely written by AI, triggering integrity crisis
129
Papers retracted (single journal, 2025)
Springer Nature's Neurosurgical Review retracted 129 AI-generated papers in 2025 alone, symptomatic of systemic collapse
$3B
Isomorphic Labs partnerships
DeepMind spin-out secured $3 billion from Eli Lilly and Novartis; first clinical trials delayed to late 2026
50%+
Research output increase for LLM users
Scientists flagged as using LLMs posted 33-50% more papers on arXiv, bioRxiv, and SSRN
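As a quick sanity check, the 21% headline figure follows directly from the raw review counts cited above (a minimal sketch; the counts are those reported in this brief):

```python
# Verify the reported AI-generated share of ICLR 2026 peer reviews
# from the raw counts cited in the Key Indicators above.
ai_reviews = 15_899      # reviews found to be entirely AI-written
total_reviews = 75_800   # total peer reviews analyzed

share = ai_reviews / total_reviews
print(f"AI-generated share: {share:.1%}")  # ~21.0%
```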

People Involved

Demis Hassabis
CEO and Co-founder, Google DeepMind (announced UK automated lab and Genesis DOE partnership)
Ekin Dogus Cubuk
Co-founder, Periodic Labs (Leading $300M AI science startup)
Gary Peltz
Liver Disease Researcher, Stanford Medicine (Testing Google's AI co-scientist for drug repurposing)
Max Spero
CEO, Pangram Labs (Exposed ICLR 2026 AI-generated peer review crisis)

Organizations Involved

Google DeepMind
AI Research Laboratory
Status: Opening first automated laboratory in UK 2026; partnering with U.S. DOE on Genesis

Alphabet's AI research lab pioneering large language models for scientific discovery across mathematics, biology, and materials science.

Periodic Labs
AI Science Startup
Status: Building automated scientific discovery platform

AI startup automating scientific discovery, initially targeting superconductor invention.

Lila Sciences
AI Science Startup
Status: Building scientific superintelligence with automated labs

Startup pairing specialized AI models with automated laboratories to accelerate biological research.

Sakana AI
AI Research Startup
Status: AI Scientist system faces critical evaluation showing quality concerns

Tokyo-based startup that created The AI Scientist system for fully automated research.

Science Magazine
Scientific Journal
Status: Recognized LLMs as 2025 Breakthrough runner-up

Peer-reviewed academic journal published by the American Association for the Advancement of Science.

Isomorphic Labs
AI Drug Discovery Startup
Status: First clinical trials expected late 2026

DeepMind spin-out using AI for drug discovery, backed by $3 billion in pharma partnerships.

Pangram Labs
AI Detection Company
Status: Exposed ICLR 2026 peer review crisis

Developer of AI detection tools for academic integrity and plagiarism detection.

U.S. Department of Energy
Federal Agency
Status: Partnering with DeepMind on Genesis AI-for-science platform

U.S. federal agency overseeing national laboratories and energy research infrastructure.

Timeline

  1. NeurIPS Papers Contaminated by AI Hallucinations

    Publication Crisis

    Analysis reveals NeurIPS papers contain hallucinated citations and sources invented by generative AI models, amid a 220% rise in submissions since 2020.

  2. Isomorphic Labs Delays Clinical Trial Timeline

    Business Development

    CEO Demis Hassabis announces first AI-designed drug trials expected by end of 2026, delayed from 2025 target despite $3B pharma partnerships.

  3. GPT-5.2 Solves Erdős Mathematics Problems

    Research Breakthrough

    OpenAI's GPT-5.2 achieves 40.3% on expert-level FrontierMath; helps solve 11 previously open Erdős problems in 15 minutes of processing.

  4. OpenAI Releases GPT-5.2 for Science and Math

    Technology Release

    GPT-5.2 Pro and Thinking achieve 93.2% on graduate-level GPQA Diamond benchmark; set new state-of-the-art for expert mathematics at 29.2% on FrontierMath Tier 4.

  5. Study Shows 50% Output Boost, Quality Concerns

    Research Publication

    Cornell study finds LLM users publish 33-50% more papers, but AI-polished work less likely to be accepted.

  6. Science Names LLMs Breakthrough Runner-Up

    Recognition

    Science magazine recognizes large language models doing frontier science as 2025 Breakthrough of the Year runner-up.

  7. GPT-5 Achieves 79x Lab Efficiency Gain

    Research Breakthrough

    OpenAI announces GPT-5 optimized a gene-editing protocol in a real wet lab, achieving a 79x efficiency gain and introducing a novel enzyme mechanism.

  8. DeepMind Announces UK Automated Laboratory

    Infrastructure

    Google DeepMind partners with UK government to open first automated materials science lab in 2026, using Gemini and robotics to synthesize hundreds of materials daily.

  9. ICLR 2026 AI-Generated Review Scandal

    Publication Crisis

    Pangram Labs analysis finds 21% of 75,800 peer reviews for major AI conference were fully AI-generated; over 50% showed AI involvement.

  10. Research Integrity Conference Hit by AI Abstracts

    Publication Crisis

    2026 World Conference on Research Integrity discovers that a substantial share of submitted abstracts show generative AI use, detected by Copyleaks plagiarism software.

  11. Lila Sciences Hits $1.3B Valuation

    Funding

    AI lab raises $115M extension from Nvidia, General Catalyst, bringing total raised to $550M for automated laboratories.

  12. Periodic Labs Raises $300M Seed

    Funding

    Former OpenAI and DeepMind researchers raise one of largest seed rounds ever from a16z, Nvidia, Bezos.

  13. AI Co-Scientist Validates Drug Candidates

    Research Validation

    Stanford publishes findings: Google's AI co-scientist identified liver fibrosis drugs in days versus years of human research.

  14. Gemini Wins Gold at Math Olympiad

    Competition Victory

    DeepMind's Gemini Deep Think scores 35/42 points, achieving gold medal standard at International Mathematical Olympiad.

  15. First AI-Generated Peer-Reviewed Paper

    Publication

    Sakana's AI Scientist v2 produces first entirely AI-generated paper accepted at ICLR workshop after peer review.

  16. Springer Journal Retracts 129 AI Papers

    Publication Crisis

    Neurosurgical Review retracts 129 papers after investigation finds articles with strong LLM-generation indicators submitted without proper disclosure.

  17. First AI-Enabled Nobel Prize

    Recognition

    Hassabis and Jumper win Nobel Prize in Chemistry for AlphaFold—first Nobel recognizing AI-enabled scientific discovery.

  18. Sakana Releases The AI Scientist

    Technology Release

    Sakana AI unveils open-source system automating entire research lifecycle from hypothesis to paper writing.

  19. AlphaFold Database Launched

    Platform Release

    DeepMind releases open-source AlphaFold code and a database of 360,000 protein structures; the database now exceeds 200 million predictions.

  20. AlphaFold2 Solves Protein Folding

    Scientific Breakthrough

    DeepMind's AlphaFold2 predicts protein structures with unprecedented accuracy, solving a 50-year grand challenge in biology.

Scenarios

1

AI Scientists Become Standard Research Partners by 2027

Discussed by: Google Research, Stanford Medicine researchers, pharmaceutical industry analysts

LLMs evolve from hypothesis generators to full research collaborators integrated into every lab. Isomorphic Labs and similar companies advance multiple AI-designed drug candidates through clinical trials. Research productivity accelerates dramatically, particularly for non-English-native scientists. The shift resembles how computational modeling became standard in the 1990s—initially controversial, ultimately indispensable. Traditional wet-lab work continues but AI handles literature review, experiment design, and optimization. The bottleneck shifts from generating ideas to validating them.

2

Publishing Crisis Forces Fundamental Restructuring

Discussed by: Nature, Science editors; peer review researchers; academic integrity watchdogs

The 50% productivity boost from LLMs, combined with paper mill fraud overwhelming peer review, triggers systemic failure. Retraction rates explode beyond 10,000 annually. Major journals implement mandatory AI detection and human verification protocols, creating multi-tier publication systems. Traditional peer review gives way to post-publication validation by specialized AI. Some fields abandon journals entirely for open preprint repositories with community verification. The transition is messy—prestigious labs see papers rejected despite solid science because reviewers can't distinguish them from AI-polished junk. A two-year credibility crisis precedes a new equilibrium.

3

AI Science Bubble Bursts After High-Profile Failures

Discussed by: MIT Technology Review, skeptical AI researchers, investors tracking 2025 hype correction

The $1.6 billion invested in AI science startups hits reality. Sakana's peer-reviewed paper is retracted after researchers discover it plagiarized established concepts. Periodic Labs and Lila Sciences struggle to translate computational predictions into real-world materials and drugs—the IBM Watson pattern repeats. OpenAI's 79x efficiency gain proves non-replicable outside controlled settings or requires human oversight that negates speed advantages. VCs who rushed in during 2025's gold rush write down valuations. The field doesn't die but resets expectations—AI becomes a useful tool rather than autonomous scientist, similar to how self-driving cars recalibrated from 'five years away' to gradual deployment.

4

Regulatory Intervention Fragments AI Science Development

Discussed by: Biosecurity researchers, EU AI Act implementers, academic integrity organizations

GPT-5's gene-editing optimization and autonomous lab systems trigger regulatory scrutiny. The EU, US, and China implement divergent rules: Europe restricts AI-designed biological experiments without oversight committees; China accelerates deployment in state labs; the US creates a patchwork of agency-specific regulations. Red Queen Bio's biosecurity focus becomes mandatory industry standard. Research bifurcates—computational discovery continues unrestricted, but autonomous wet-lab work faces lengthy approval processes. Brain drain accelerates toward permissive jurisdictions. By 2027, regulatory fragmentation creates scientific balkanization where breakthrough location depends more on legal framework than technical capability.

5

Dual-Track Science Emerges: AI-Augmented vs. AI-Generated

Discussed by: ICLR 2026 organizers, Nature editors, Pangram Labs CEO Max Spero

The ICLR 2026 scandal triggers formal bifurcation of scientific publishing. Journals create separate tracks: Track A requires full disclosure of AI use with mandatory detection screening, accepting that AI is a tool like statistical software. Track B prohibits AI entirely, requiring human-verified work with stricter review but slower publication. Prestigious institutions split—some departments mandate Track A for productivity, others ban AI to preserve 'pure' research credentials. By 2027, citation networks show minimal crossover between tracks. Scientists face career-defining choice: embrace AI and risk legitimacy concerns, or reject it and fall behind in output. The division mirrors experimental vs. theoretical physics—eventually both contribute, but operate in distinct ecosystems.

Historical Context

AlphaFold and the 2024 Nobel Prize

2020-2024

What Happened

DeepMind's AlphaFold solved the 50-year protein folding problem in 2020, predicting structures for 200 million proteins. The breakthrough earned Demis Hassabis and John Jumper the 2024 Nobel Prize in Chemistry—the first Nobel recognizing AI-enabled scientific discovery. Over 500,000 researchers adopted the open-source tool, generating thousands of papers on antibiotic resistance, drug discovery, and crop resilience.

Outcome

Short Term

Proved AI could solve grand scientific challenges previously requiring decades of human effort; validated massive investment in AI for science.

Long Term

Established template for LLMs tackling fundamental research problems; demonstrated that open-sourcing breakthrough AI accelerates global science rather than limiting it.

Why It's Relevant Today

AlphaFold's Nobel Prize validated AI's scientific capability one year before Science magazine recognized LLMs as 2025 breakthrough runners-up, creating momentum for the current AI science investment rush.

IBM Watson's Failed Drug Discovery Bet

2011-2019

What Happened

After Watson's 2011 Jeopardy triumph, IBM launched Watson for Drug Discovery, promising to revolutionize medicine by analyzing massive biomedical datasets. Major partnerships with Pfizer (2016) and MD Anderson followed. By 2019, IBM quietly discontinued the product after high-profile failures, including MD Anderson scrapping their installation. Watson made only small advances in drug discovery despite years of investment and publicity.

Outcome

Short Term

Multi-million dollar partnerships dissolved; IBM shifted focus away from drug discovery toward clinical applications with more modest claims.

Long Term

Created investor skepticism about AI drug discovery claims; established cautionary tale about over-promising AI capabilities before validation.

Why It's Relevant Today

Watson's trajectory warns that today's $1.6 billion in AI science startup funding could face similar reality checks when computational predictions must translate to real-world lab results.

The Paper Mill Crisis and Peer Review Collapse

2020-2025

What Happened

Paper retractions passed 10,000 in 2023, quadrupling over two decades, with the majority due to misconduct. In 2023, Hindawi retracted over 8,000 papers from organized paper mills selling fraudulent authorship. By 2025, AI tools like ChatGPT enabled industrial-scale generation of plagiarism-resistant junk science. Experts estimate one in 50 papers now show paper mill patterns, overwhelming traditional peer review gatekeeping.

Outcome

Short Term

Major publishers implemented AI detection tools; peer review timelines lengthened; journal credibility declined in affected fields.

Long Term

Forced fundamental questioning of peer review's viability; created opening for AI-assisted review systems despite concerns about AI reviewing AI-generated content.

Why It's Relevant Today

The existing publishing crisis means LLMs' 50% productivity boost arrives precisely when the system can least handle increased volume, potentially triggering complete restructuring of scientific validation.
