AI systems cross the creativity threshold

New Capabilities
By Newzino Staff

Large language models now outperform average humans on standardized creativity tests—but the most creative humans remain unchallenged

January 21st, 2026: Largest AI Creativity Study Published

Overview

For decades, creativity was considered AI's final frontier—the one domain where machines could never match human ingenuity. That assumption just cracked. A study published January 21, 2026, in Scientific Reports tested 100,000 humans against nine leading AI systems on standardized creativity measures. GPT-4 outscored the typical human participant. Google's Gemini Pro matched average human performance.

The research marks the largest direct comparison ever conducted between human and machine creativity. But the findings carry a crucial caveat: when researchers isolated the top 10% of human performers, every AI system fell short. The most creative humans still operate on a different level entirely—one that current language models cannot reach. This suggests AI may democratize baseline creativity while leaving exceptional creative ability as a distinctly human trait.

Key Indicators

100,000
Human Participants
Largest creativity comparison study ever conducted between humans and AI
9
AI Models Tested
Including GPT-4, GPT-3.5, Claude, Gemini, and others
72%
Humans Outperformed by GPT-4
At optimal temperature settings, GPT-4 scored higher than this share of human participants
Top 10%
Humans Still Ahead
The most creative humans outperform all AI models tested
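The "optimal temperature settings" indicator refers to a language model's sampling temperature: raw scores (logits) are divided by the temperature before being converted to probabilities, so higher temperatures flatten the distribution and produce more varied, divergent word choices, while lower temperatures make output more predictable. A minimal sketch of that mechanism (the example logits are made up for illustration):

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Convert raw model scores into sampling probabilities.

    Higher temperature flattens the distribution (more diverse choices);
    lower temperature sharpens it (more predictable choices).
    """
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for three candidate next words.
logits = [2.0, 1.0, 0.1]
p_low = softmax_with_temperature(logits, temperature=0.5)   # sharper
p_high = softmax_with_temperature(logits, temperature=2.0)  # flatter
# The top-scoring word dominates more at low temperature than at high.
```

Creativity benchmarks are sensitive to this dial, which is why the study reports GPT-4's best result at its best-performing temperature rather than a single default setting.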

Interactive


Mark Twain

(1835-1910) · Gilded Age · wit

Fictional AI pastiche — not real quote.

"I see they've finally built a machine that can be as mediocre as most of us—though I confess it took them longer than I expected. The real trick wasn't teaching silicon to have ideas, but teaching it to have the *wrong* ideas with the same confident regularity as a committee of humans, which suggests we've been training these contraptions on congressional records."


People Involved

Karim Jerbi
Lead Researcher, Professor of Psychology at Université de Montréal (Director of UNIQUE Neuro-AI Research Center)
Yoshua Bengio
Co-author, AI Pioneer and Turing Award Laureate (Founder and Scientific Advisor of Mila; Founder of LawZero)
Simone Grassini
Professor of Psychology, University of Bergen (Researcher on human-AI creativity comparisons)

Organizations Involved

Université de Montréal
Research University
Status: Lead institution for 2026 creativity study

Major Canadian research university and home to leading AI and neuroscience research programs.

Mila - Quebec Artificial Intelligence Institute
AI Research Institute
Status: Collaborative partner on creativity research

The world's largest academic research group in deep learning, with over 140 affiliated professors.

Google DeepMind
AI Research Laboratory
Status: Collaborative partner; maintains Montreal research office

Google's flagship AI research division, known for breakthrough achievements in game-playing AI, protein folding, and large language models.

Timeline

  1. Largest AI Creativity Study Published

    Research

    Université de Montréal team publishes comparison of 100,000 humans against nine AI systems. GPT-4 exceeds average human creativity; top 10% of humans still outperform all AI.

  2. University of Arkansas Study

    Research

    Researchers pit 151 humans against GPT-4 across three divergent thinking tests. GPT-4 proves more original and elaborate than human participants across all measures.

  3. First Major AI-Human Creativity Comparison

    Research

    Koivisto and Grassini compare 256 humans against ChatGPT on creativity tests. AI chatbots outperform average humans, but the best humans still exceed AI.

  4. Divergent Association Task Published

    Methodology

    Researchers publish the Divergent Association Task in PNAS, offering a faster, computationally scorable creativity test that measures semantic distance between word choices.

  5. Alternative Uses Task Developed

    Methodology

    Psychologist J.P. Guilford creates the Alternative Uses Task, asking people to generate creative uses for everyday objects. It becomes the standard test for divergent thinking.
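The Divergent Association Task entry above notes that the test is computationally scorable: participants name unrelated nouns, and the score is the average pairwise semantic distance between those words' embeddings. Real scoring uses full word embeddings; the tiny three-dimensional vectors below are made up purely to sketch the idea:

```python
# Sketch of Divergent Association Task (DAT) scoring: the score is the
# mean pairwise semantic distance between the chosen words, so lists of
# unrelated nouns score higher. The 3-d vectors are hypothetical toys,
# not real embeddings.
import itertools
import math

EMBEDDINGS = {
    "cat":    [0.9, 0.1, 0.0],
    "dog":    [0.8, 0.2, 0.1],   # near "cat": small distance
    "galaxy": [0.0, 0.1, 0.9],   # far from both: large distance
}

def cosine_distance(u, v):
    """1 minus cosine similarity; larger means more semantically distant."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm

def dat_score(words):
    """Mean pairwise cosine distance over all word pairs, scaled to 0-100."""
    pairs = list(itertools.combinations(words, 2))
    mean_dist = sum(
        cosine_distance(EMBEDDINGS[a], EMBEDDINGS[b]) for a, b in pairs
    ) / len(pairs)
    return 100 * mean_dist

# Unrelated words earn a higher divergence score than related ones:
print(dat_score(["cat", "galaxy"]))  # semantically distant pair scores high
print(dat_score(["cat", "dog"]))     # semantically close pair scores low
```

Because the score is computed mechanically from embeddings rather than rated by human judges, the same scoring pipeline can be applied identically to human and AI word lists, which is what makes large-scale comparisons like the Montreal study feasible.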

Scenarios

1

AI Becomes Standard Creative Collaborator

Discussed by: McKinsey, World Economic Forum analysts, creative industry researchers

AI tools become embedded in creative workflows the way spell-checkers became embedded in writing. Writers, designers, and artists routinely use language models to generate initial ideas, overcome creative blocks, and accelerate brainstorming. This lifts the baseline quality of average creative work while leaving exceptional creativity as a human differentiator. Entry-level creative roles shrink as AI handles routine ideation.

2

Creativity Tests Proven Inadequate for AI

Discussed by: Cognitive scientists, AI researchers examining methodology limitations

Further research reveals that current creativity tests measure only narrow dimensions of creative ability—semantic distance and divergent thinking—while missing judgment, cultural context, and meaningful novelty. AI appears creative because tests reward unusual word combinations, not because it generates genuinely useful or moving ideas. The field develops new benchmarks that capture what creativity actually means.

3

AI Creativity Ceiling Persists

Discussed by: Researchers noting persistent human advantage among top performers

Subsequent studies confirm that while AI continues improving on average creativity measures, the top tier of human creators maintains an unbridgeable gap. Something about exceptional human creativity—perhaps grounded experience, genuine intentionality, or cultural embeddedness—remains beyond algorithmic reach. This stabilizes a division of creative labor where AI augments but doesn't replace top creative talent.

4

AI Breaks the Creativity Ceiling

Discussed by: AI capability researchers, technology forecasters

Future model architectures or training approaches enable AI to match or exceed even the most creative humans on all measured dimensions. Creativity joins chess, Go, and protein folding as domains where AI surpasses human performance entirely. This triggers fundamental reconsideration of what makes human contribution valuable in creative fields.

Historical Context

Deep Blue Defeats Kasparov (1997)

May 1997

What Happened

IBM's Deep Blue chess computer defeated world champion Garry Kasparov in a six-game match—the first time a computer beat a reigning world champion under standard tournament conditions. Kasparov had won against Deep Blue the previous year.

Outcome

Short Term

The defeat sparked debate about machine intelligence and led Kasparov to accuse IBM of cheating. IBM retired Deep Blue rather than granting a rematch.

Long Term

Chess engines became standard training tools for human players. Top players now routinely use AI analysis, and computer-assisted 'centaur chess' emerged as a format combining human intuition with machine calculation.

Why It's Relevant Today

Chess was once considered proof of human intellectual superiority. Its fall to machines prompted predictions that creativity would remain humanity's last bastion—a claim now under direct empirical testing.

AlphaGo Defeats Lee Sedol (2016)

March 2016

What Happened

Google DeepMind's AlphaGo defeated Go world champion Lee Sedol 4-1 in Seoul. The game of Go, with more possible positions than atoms in the universe, was believed to require human intuition and creativity to master.

Outcome

Short Term

Move 37 in Game 2—an unconventional shoulder hit—stunned observers and became iconic as evidence of machine creativity. Lee Sedol retired from professional Go in 2019, citing AI's unbeatable dominance.

Long Term

Go players adopted AI training partners. The victory accelerated investment in deep learning and reinforced that machine learning could master domains requiring apparent creativity.

Why It's Relevant Today

AlphaGo's creative, counterintuitive moves challenged the assumption that pattern recognition can't produce genuine novelty. The Montreal creativity study tests whether this extends to open-ended creative tasks.

GPT-3 Demonstrates Creative Writing (2020)

June 2020

What Happened

OpenAI released GPT-3, a 175-billion parameter language model that generated essays, poetry, and code from brief prompts. The Guardian published an essay written entirely by GPT-3, sparking widespread discussion about AI authorship.

Outcome

Short Term

Writers and educators debated plagiarism implications. OpenAI initially limited API access due to misuse concerns.

Long Term

GPT-3 catalyzed the generative AI industry. By 2024, language models became standard tools for drafting, editing, and brainstorming across industries.

Why It's Relevant Today

GPT-3's fluent output raised the question that the Montreal study now tests empirically: can AI creativity be measured, and how does it compare to human performance?

9 Sources: