AI decodes the genome's dark matter

New Capabilities

DeepMind's Open-Source AlphaGenome Gives Scientists New Tools to Understand Non-Coding DNA

January 31st, 2026: Adoption Reaches 3,000 Scientists Globally

Overview

For twenty years after scientists sequenced the human genome, 98% of it remained essentially unreadable. The protein-coding genes were mapped, but the vast regulatory regions—the genome's operating system—stayed opaque. On January 28, 2026, Google DeepMind released the full source code for AlphaGenome, an artificial intelligence model that predicts how genetic variants in these non-coding regions affect gene regulation and disease.

Nearly 3,000 scientists in 160 countries have used the model since its June 2025 launch, applying it to cancer, neurodegenerative disorders, and rare genetic conditions. The open-source release, which covers code, model weights, and documentation, means any research institution can now run AlphaGenome locally on a single graphics processing unit rather than accessing it only through DeepMind's servers. For the estimated 350 million people living with rare conditions, many of them still undiagnosed, this marks a significant expansion in the tools available to find answers hidden in their DNA.

Key Indicators

98%
Non-coding genome
The portion of human DNA that doesn't encode proteins, where most genetic variation occurs and where AlphaGenome focuses its predictions.

~3,000
Scientists using AlphaGenome
Researchers across 160 countries who have adopted the model since its June 2025 launch.

1 million
Base-pairs per input
The length of DNA sequence AlphaGenome can process at once, about five times longer than its predecessor Enformer.

25 of 26
Benchmarks matched or exceeded
Evaluations of variant effect prediction where AlphaGenome matched or outperformed the best existing models.
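
The "five times longer" comparison can be checked with simple arithmetic. The sketch below is only an illustration: it assumes the rounded figures quoted in this article (1 million base-pairs for AlphaGenome, and the 196,000 base-pairs given for Enformer in the timeline), and the constant and function names are hypothetical, not part of any DeepMind library.

```python
# Rounded input lengths as quoted in the article (assumed, not exact model specs).
ALPHAGENOME_INPUT_BP = 1_000_000
ENFORMER_INPUT_BP = 196_000


def context_ratio(new_bp: int, old_bp: int) -> float:
    """Return how many times longer the new input window is than the old one."""
    if old_bp <= 0:
        raise ValueError("old_bp must be positive")
    return new_bp / old_bp


ratio = context_ratio(ALPHAGENOME_INPUT_BP, ENFORMER_INPUT_BP)
print(f"AlphaGenome's input window is about {ratio:.1f}x Enformer's")
```

With these rounded inputs the ratio comes out slightly above five, which is consistent with the indicator's wording.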

Timeline

April 2003 to January 2026: 11 events, listed newest first.
  1. Adoption Reaches 3,000 Scientists Globally

    Latest Milestone

    Nearly 3,000 scientists across 160 countries are now using AlphaGenome for research into cancer, neurodegeneration, and rare diseases.

  2. AlphaGenome Paper Published and Code Released

    Open Source Release

    Nature publishes AlphaGenome paper. DeepMind releases full source code and model weights under Apache 2.0 license for non-commercial use, enabling local deployment on a single GPU.

  3. Mayo Clinic Hosts Undiagnosed Hackathon

    Application

    First U.S. Undiagnosed Hackathon brings 130 researchers from 28 countries. Using AlphaGenome and other tools, teams diagnose 6 cases in 48 hours, with 3 more diagnosed in follow-up.

  4. AlphaGenome Preprint and API Launch

    AI Development

    DeepMind releases AlphaGenome preprint and provides API access for non-commercial research. Scientists can access predictions through DeepMind's servers.

  5. Nobel Prize for AlphaFold

    Recognition

    Demis Hassabis and John Jumper awarded Nobel Prize in Chemistry for protein structure prediction, sharing the prize with David Baker.

  6. AlphaMissense Catalogues Protein-Coding Variants

    AI Release

    DeepMind releases AlphaMissense, classifying 89% of 71 million possible missense variants as likely pathogenic or benign. Focuses on the 2% of the genome that codes for proteins.

  7. Enformer Model Published

    AI Development

    DeepMind publishes Enformer in Nature Methods, demonstrating that deep learning can predict gene expression from DNA sequence up to 196,000 base-pairs.

  8. DeepMind Open-Sources AlphaFold

    Open Source Release

    DeepMind releases AlphaFold code and launches protein structure database with EMBL-EBI, eventually reaching 2 million users.

  9. AlphaFold Solves Protein Folding Problem

    AI Breakthrough

    DeepMind's AlphaFold2 wins CASP14 with near-experimental accuracy, solving a 50-year grand challenge in biology.

  10. ENCODE Project Publishes Major Findings

    Scientific Milestone

    Encyclopedia of DNA Elements project assigns biochemical functions to 80% of the genome, revealing millions of regulatory elements outside protein-coding regions.

  11. Human Genome Project Completed

    Scientific Milestone

    International consortium announces completion of human genome sequence, covering 99% of gene-containing regions. The $2.7 billion project finished two years ahead of schedule.

Historical Context

3 moments from history that rhyme with this story — and how they unfolded.

October 1990 – April 2003

Human Genome Project Completion (2003)

The $2.7 billion international effort sequenced 99% of human gene-containing regions, completing two years ahead of schedule. The project established that humans have approximately 20,000 protein-coding genes—far fewer than expected—accounting for just 1.5% of the genome. Scientists hoped the sequence would rapidly unlock the genetic basis of disease.

Then

The sequence enabled identification of genes linked to cystic fibrosis, breast cancer, and thousands of other conditions. Genome-wide association studies became the dominant research paradigm.

Now

Although researchers mapped many disease-associated variants, most fell in non-coding regions, where their effects remained mysterious. The '98% problem' became biology's next grand challenge.

Why this matters now

AlphaGenome directly addresses the unfulfilled promise of the Human Genome Project—understanding what the non-coding 98% actually does and how it contributes to disease.

July 2021

AlphaFold Open-Source Release (2021)

After AlphaFold2 solved the protein structure prediction problem at CASP14, DeepMind open-sourced the code and partnered with EMBL-EBI to create a freely accessible database of 200 million predicted protein structures. The move represented a departure from traditional commercial AI development.

Then

Over 2 million scientists accessed the database within two years. Research accelerated on enzyme design, drug discovery, and understanding disease mechanisms.

Now

Hassabis and Jumper won the 2024 Nobel Prize in Chemistry. The open-source model became the template for DeepMind's scientific AI releases, including AlphaMissense and AlphaGenome.

Why this matters now

The AlphaGenome release follows the same playbook: publish in Nature, then open-source for non-commercial use. DeepMind is betting that democratizing access accelerates scientific progress.

September 2012

ENCODE Project Findings (2012)

The Encyclopedia of DNA Elements project assigned biochemical functions to 80% of the genome, identifying nearly 3 million regulatory sites. The findings challenged the 'junk DNA' concept but also sparked controversy about whether biochemical activity equals biological function.

Then

Researchers gained a map of potential regulatory elements but lacked tools to predict how specific variants in these regions affected gene expression or disease risk.

Now

ENCODE established that non-coding regions contain essential regulatory machinery. The project continues through phase 4, generating data that trained models like AlphaGenome.

Why this matters now

ENCODE catalogued where regulatory elements exist; AlphaGenome predicts what happens when mutations occur in them. The two represent complementary approaches to decoding the non-coding genome.
