AI systems cross the creativity threshold

New Capabilities
By Newzino Staff

Large language models now outperform average humans on standardized creativity tests—but the most creative humans remain unchallenged

January 21st, 2026: Largest AI Creativity Study Published

Overview

For decades, creativity was considered AI's final frontier—the one domain where machines could never match human ingenuity. That assumption just cracked. A study published January 21, 2026, in Scientific Reports tested 100,000 humans against nine leading AI systems on standardized creativity measures. GPT-4 outscored the typical human participant. Google's Gemini Pro matched average human performance.

The research marks the largest direct comparison ever conducted between human and machine creativity. But the findings carry a crucial caveat: when researchers isolated the top 10% of human performers, every AI system fell short. The most creative humans still operate on a different level entirely—one that current language models cannot reach. This suggests AI may democratize baseline creativity while leaving exceptional creative ability as a distinctly human trait.

Key Indicators

100,000
Human Participants
Largest creativity comparison study ever conducted between humans and AI
9
AI Models Tested
Including GPT-4, GPT-3.5, Claude, Gemini, and others
72%
Humans Outperformed by GPT-4
At optimal temperature settings, GPT-4 scored higher than this share of human participants
Top 10%
Humans Still Ahead
The most creative humans outperform all AI models tested
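The "optimal temperature settings" indicator refers to a language model's sampling temperature: raw scores (logits) are divided by the temperature before being converted to probabilities, so higher temperatures flatten the distribution and produce more varied, divergent word choices, while lower temperatures make output more predictable. A minimal sketch of that mechanism (the example logits are made up for illustration):

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Convert raw model scores into sampling probabilities.

    Higher temperature flattens the distribution (more diverse choices);
    lower temperature sharpens it (more predictable choices).
    """
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for three candidate next words.
logits = [2.0, 1.0, 0.1]
p_low = softmax_with_temperature(logits, temperature=0.5)   # sharper
p_high = softmax_with_temperature(logits, temperature=2.0)  # flatter
# The top-scoring word dominates more at low temperature than at high.
```

Creativity benchmarks are sensitive to this dial, which is why the study reports GPT-4's best result at its best-performing temperature rather than a single default setting.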

Interactive


Mark Twain

(1835-1910) · Gilded Age · wit

Fictional AI pastiche — not real quote.

"I see they've finally built a machine that can be as mediocre as most of us—though I confess it took them longer than I expected. The real trick wasn't teaching silicon to have ideas, but teaching it to have the *wrong* ideas with the same confident regularity as a committee of humans, which suggests we've been training these contraptions on congressional records."


People Involved

Karim Jerbi
Lead Researcher, Professor of Psychology at Université de Montréal (Director of UNIQUE Neuro-AI Research Center)
Yoshua Bengio
Co-author, AI Pioneer and Turing Award Laureate (Founder and Scientific Advisor of Mila; Founder of LawZero)
Simone Grassini
Professor of Psychology, University of Bergen (Researcher on human-AI creativity comparisons)

Organizations Involved

Université de Montréal
Research University
Status: Lead institution for 2026 creativity study

Major Canadian research university and home to leading AI and neuroscience research programs.

Mila - Quebec Artificial Intelligence Institute
AI Research Institute
Status: Collaborative partner on creativity research

The world's largest academic research group in deep learning, with over 140 affiliated professors.

Google DeepMind
AI Research Laboratory
Status: Collaborative partner; maintains Montreal research office

Google's flagship AI research division, known for breakthrough achievements in game-playing AI, protein folding, and large language models.

Timeline

  1. Largest AI Creativity Study Published

    Research

    Université de Montréal team publishes comparison of 100,000 humans against nine AI systems. GPT-4 exceeds average human creativity; top 10% of humans still outperform all AI.

  2. University of Arkansas Study

    Research

    Researchers pit 151 humans against GPT-4 across three divergent thinking tests. GPT-4 proves more original and elaborate than human participants across all measures.

  3. First Major AI-Human Creativity Comparison

    Research

    Koivisto and Grassini compare 256 humans against ChatGPT on creativity tests. AI chatbots outperform average humans, but the best humans still exceed AI.

  4. Divergent Association Task Published

    Methodology

    Researchers publish the Divergent Association Task in PNAS, offering a faster, computationally scorable creativity test that measures semantic distance between word choices.

  5. Alternative Uses Task Developed

    Methodology

    Psychologist J.P. Guilford creates the Alternative Uses Task, asking people to generate creative uses for everyday objects. It becomes the standard test for divergent thinking.
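The Divergent Association Task entry above notes that the test is computationally scorable: participants name unrelated nouns, and the score is the average pairwise semantic distance between those words' embeddings. Real scoring uses full word embeddings; the tiny three-dimensional vectors below are made up purely to sketch the idea:

```python
# Sketch of Divergent Association Task (DAT) scoring: the score is the
# mean pairwise semantic distance between the chosen words, so lists of
# unrelated nouns score higher. The 3-d vectors are hypothetical toys,
# not real embeddings.
import itertools
import math

EMBEDDINGS = {
    "cat":    [0.9, 0.1, 0.0],
    "dog":    [0.8, 0.2, 0.1],   # near "cat": small distance
    "galaxy": [0.0, 0.1, 0.9],   # far from both: large distance
}

def cosine_distance(u, v):
    """1 minus cosine similarity; larger means more semantically distant."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm

def dat_score(words):
    """Mean pairwise cosine distance over all word pairs, scaled to 0-100."""
    pairs = list(itertools.combinations(words, 2))
    mean_dist = sum(
        cosine_distance(EMBEDDINGS[a], EMBEDDINGS[b]) for a, b in pairs
    ) / len(pairs)
    return 100 * mean_dist

# Unrelated words earn a higher divergence score than related ones:
print(dat_score(["cat", "galaxy"]))  # semantically distant pair scores high
print(dat_score(["cat", "dog"]))     # semantically close pair scores low
```

Because the score is computed mechanically from embeddings rather than rated by human judges, the same scoring pipeline can be applied identically to human and AI word lists, which is what makes large-scale comparisons like the Montreal study feasible.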

Scenarios

1

AI Becomes Standard Creative Collaborator

Discussed by: McKinsey, World Economic Forum analysts, creative industry researchers

AI tools become embedded in creative workflows the way spell-checkers became embedded in writing. Writers, designers, and artists routinely use language models to generate initial ideas, overcome creative blocks, and accelerate brainstorming. This lifts the baseline quality of average creative work while leaving exceptional creativity as a human differentiator. Entry-level creative roles shrink as AI handles routine ideation.

2

Creativity Tests Proven Inadequate for AI

Discussed by: Cognitive scientists, AI researchers examining methodology limitations

Further research reveals that current creativity tests measure only narrow dimensions of creative ability—semantic distance and divergent thinking—while missing judgment, cultural context, and meaningful novelty. AI appears creative because tests reward unusual word combinations, not because it generates genuinely useful or moving ideas. The field develops new benchmarks that capture what creativity actually means.

3

AI Creativity Ceiling Persists

Discussed by: Researchers noting persistent human advantage among top performers

Subsequent studies confirm that while AI continues improving on average creativity measures, the top tier of human creators maintains an unbridgeable gap. Something about exceptional human creativity—perhaps grounded experience, genuine intentionality, or cultural embeddedness—remains beyond algorithmic reach. This stabilizes a division of creative labor where AI augments but doesn't replace top creative talent.

4

AI Breaks the Creativity Ceiling

Discussed by: AI capability researchers, technology forecasters

Future model architectures or training approaches enable AI to match or exceed even the most creative humans on all measured dimensions. Creativity joins chess, Go, and protein folding as domains where AI surpasses human performance entirely. This triggers fundamental reconsideration of what makes human contribution valuable in creative fields.

Historical Context

Deep Blue Defeats Kasparov (1997)

May 1997

What Happened

IBM's Deep Blue chess computer defeated world champion Garry Kasparov in a six-game match—the first time a computer beat a reigning world champion under standard tournament conditions. Kasparov had won against Deep Blue the previous year.

Outcome

Short Term

The defeat sparked debate about machine intelligence and led Kasparov to accuse IBM of cheating. IBM retired Deep Blue rather than granting a rematch.

Long Term

Chess engines became standard training tools for human players. Top players now routinely use AI analysis, and computer-assisted 'centaur chess' emerged as a format combining human intuition with machine calculation.

Why It's Relevant Today

Chess was once considered proof of human intellectual superiority. Its fall to machines prompted predictions that creativity would remain humanity's last bastion—a claim now under direct empirical testing.

AlphaGo Defeats Lee Sedol (2016)

March 2016

What Happened

Google DeepMind's AlphaGo defeated Go world champion Lee Sedol 4-1 in Seoul. The game of Go, with more possible positions than atoms in the universe, was believed to require human intuition and creativity to master.

Outcome

Short Term

Move 37 in Game 2—an unconventional shoulder hit—stunned observers and became iconic as evidence of machine creativity. Lee Sedol retired from professional Go in 2019, citing AI's unbeatable dominance.

Long Term

Go players adopted AI training partners. The victory accelerated investment in deep learning and reinforced that machine learning could master domains requiring apparent creativity.

Why It's Relevant Today

AlphaGo's creative, counterintuitive moves challenged the assumption that pattern recognition can't produce genuine novelty. The Montreal creativity study tests whether this extends to open-ended creative tasks.

GPT-3 Demonstrates Creative Writing (2020)

June 2020

What Happened

OpenAI released GPT-3, a 175-billion parameter language model that generated essays, poetry, and code from brief prompts. The Guardian published an essay written entirely by GPT-3, sparking widespread discussion about AI authorship.

Outcome

Short Term

Writers and educators debated plagiarism implications. OpenAI initially limited API access due to misuse concerns.

Long Term

GPT-3 catalyzed the generative AI industry. By 2024, language models became standard tools for drafting, editing, and brainstorming across industries.

Why It's Relevant Today

GPT-3's fluent output raised the question that the Montreal study now tests empirically: can AI creativity be measured, and how does it compare to human performance?

9 Sources: