AI systems cross the creativity threshold

New Capabilities

Large language models now outperform average humans on standardized creativity tests—but the most creative humans remain unchallenged

January 21st, 2026: Largest AI Creativity Study Published

Overview

For decades, creativity was considered AI's final frontier—the one domain where machines could never match human ingenuity. That assumption just cracked.

A study published January 21, 2026 in Scientific Reports tested 100,000 humans against nine leading AI systems on standardized creativity measures. GPT-4 outscored the typical human participant, while Google's Gemini Pro matched average human performance. The research is the largest direct comparison of human and machine creativity conducted to date.

But the findings have a caveat: when researchers isolated the top 10% of human performers, every AI system fell short. The most creative humans still operate on a different level entirely—one that current language models cannot reach. This suggests AI may democratize baseline creativity while leaving exceptional creative ability as a distinctly human trait.

Key Indicators

100,000 Human Participants: Largest creativity comparison study ever conducted between humans and AI
9 AI Models Tested: Including GPT-4, GPT-3.5, Claude, Gemini, and others
72% of Humans Outperformed by GPT-4: At optimal temperature settings, GPT-4 exceeded this percentage of human participants
Top 10% of Humans Still Ahead: The most creative humans outperform all AI models tested

Voices

Curated perspectives — historical figures and your fellow readers.

Mark Twain

(1835-1910) · Gilded Age · wit

Fictional AI pastiche, not a real quote.

"I see they've finally built a machine that can be as mediocre as most of us—though I confess it took them longer than I expected. The real trick wasn't teaching silicon to have ideas, but teaching it to have the *wrong* ideas with the same confident regularity as a committee of humans, which suggests we've been training these contraptions on congressional records."

Timeline

1967 to January 2026 · 5 events, latest January 21st, 2026
  1. Largest AI Creativity Study Published

    Latest Research

    Université de Montréal team publishes comparison of 100,000 humans against nine AI systems. GPT-4 exceeds average human creativity; top 10% of humans still outperform all AI.

  2. University of Arkansas Study

    Research

    Researchers pit 151 humans against GPT-4 across three divergent thinking tests. GPT-4 proves more original and elaborate than human participants across all measures.

  3. First Major AI-Human Creativity Comparison

    Research

    Koivisto and Grassini compare 256 humans against ChatGPT on creativity tests. AI chatbots outperform average humans, but the best humans still exceed AI.

  4. Divergent Association Task Published

    Methodology

    Researchers publish the Divergent Association Task in PNAS, offering a faster, computationally scorable creativity test that measures semantic distance between word choices.

  5. Alternative Uses Task Developed

    Methodology

    Psychologist J.P. Guilford creates the Alternative Uses Task, asking people to generate creative uses for everyday objects. It becomes the standard test for divergent thinking.
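
The Divergent Association Task entry above describes scoring by semantic distance between word choices. A minimal Python sketch of that idea follows, assuming each word already has an embedding vector. The function names and the tiny hand-made vectors are illustrative assumptions, not the published scoring code; a real scorer would look the words up in pretrained word embeddings.

```python
from itertools import combinations
from math import sqrt


def cosine_distance(u, v):
    """Return 1 minus the cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def semantic_distance_score(words, embeddings):
    """Average cosine distance over all word pairs, multiplied by 100.

    `embeddings` is a hypothetical dict mapping each word to a vector.
    Higher scores mean the chosen words are further apart in meaning.
    """
    if len(words) < 2:
        raise ValueError("need at least two words to form a pair")
    pairs = list(combinations(words, 2))
    total = sum(cosine_distance(embeddings[a], embeddings[b]) for a, b in pairs)
    return 100.0 * total / len(pairs)


if __name__ == "__main__":
    # Toy 3-dimensional vectors, invented purely for illustration.
    toy_embeddings = {
        "cat": [1.0, 0.1, 0.0],
        "dog": [0.9, 0.2, 0.0],
        "galaxy": [0.0, 0.1, 1.0],
    }
    related = semantic_distance_score(["cat", "dog"], toy_embeddings)
    unrelated = semantic_distance_score(["cat", "galaxy"], toy_embeddings)
    print(f"related pair: {related:.1f}, unrelated pair: {unrelated:.1f}")
```

Because the score depends only on the distances between the words a participant or model lists, it can be computed automatically, which is what makes a test of this kind practical at the scale of 100,000 participants.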

Historical Context

3 moments from history that rhyme with this story — and how they unfolded.

May 1997

Deep Blue Defeats Kasparov (1997)

IBM's Deep Blue chess computer defeated world champion Garry Kasparov in a six-game match—the first time a computer beat a reigning world champion under standard tournament conditions. Kasparov had won against Deep Blue the previous year.

Then

The defeat sparked debate about machine intelligence and led Kasparov to accuse IBM of cheating. IBM retired Deep Blue rather than granting a rematch.

Now

Chess engines became standard training tools for human players. Top players now routinely use AI analysis, and computer-assisted 'centaur chess' emerged as a format combining human intuition with machine calculation.

Why this matters now

Chess was once considered proof of human intellectual superiority. Its fall to machines prompted predictions that creativity would remain humanity's last bastion—a claim now under direct empirical testing.

March 2016

AlphaGo Defeats Lee Sedol (2016)

Google DeepMind's AlphaGo defeated Go world champion Lee Sedol 4-1 in Seoul. The game of Go, with more possible positions than there are atoms in the observable universe, was believed to require human intuition and creativity to master.

Then

Move 37 in Game 2—an unconventional shoulder hit—stunned observers and became iconic as evidence of machine creativity. Lee Sedol retired from professional Go in 2019, citing AI's unbeatable dominance.

Now

Go players adopted AI training partners. The victory accelerated investment in deep learning and reinforced that machine learning could master domains requiring apparent creativity.

Why this matters now

AlphaGo's creative, counterintuitive moves challenged the assumption that pattern recognition can't produce genuine novelty. The Montreal creativity study tests whether this extends to open-ended creative tasks.

June 2020

GPT-3 Demonstrates Creative Writing (2020)

OpenAI released GPT-3, a 175-billion-parameter language model that generated essays, poetry, and code from brief prompts. The Guardian published an essay generated by GPT-3, which its editors assembled from several model outputs, sparking widespread discussion about AI authorship.

Then

Writers and educators debated plagiarism implications. OpenAI initially limited API access due to misuse concerns.

Now

GPT-3 catalyzed the generative AI industry. By 2024, language models had become standard tools for drafting, editing, and brainstorming across industries.

Why this matters now

GPT-3's fluent output raised the question that the Montreal study now answers empirically: can AI creativity be measured, and how does it compare to human performance?
