Machine learning reveals bacteria carry far more antiviral defenses than scientists assumed

New Capabilities
By Newzino Staff

Two independent teams used AI to scan thousands of bacterial genomes, tripling estimates of how many genes bacteria dedicate to fighting viruses

Yesterday: Both studies featured in Nature's April 16 issue

Overview

Every major tool in genetic engineering — from the enzymes that cut DNA in the 1970s to CRISPR gene editing — started as a defense weapon bacteria use against viruses. Two research teams just revealed that bacteria carry roughly three times more of these weapons than anyone realized, identifying millions of antiviral proteins across tens of thousands of genomes using machine-learning models that can flag a new defense system in five minutes.

Why it matters

Every previous bacterial defense tool scientists repurposed — restriction enzymes, CRISPR — launched a multibillion-dollar industry; now there are thousands more to explore.

Key Indicators

2.39M
Predicted antiviral proteins identified
The Pasteur team's models flagged 2.39 million candidate defense proteins across more than 30,000 bacterial genomes.
3×
Increase over previous estimates
Roughly 1.5% of a bacterium's genes serve antiviral functions — three times more than previously thought.
624
Defense proteins found in E. coli alone
The MIT team's DefensePredictor tool identified 624 defense-related proteins across 69 E. coli strains, over 100 of them previously unknown.
85%
Proteins with no known link to immunity
The vast majority of the Pasteur team's predicted defense proteins had never been connected to antiviral function before.
45%
Experimental validation rate
When the MIT team cloned 94 predicted defense systems into bacteria and exposed them to phages, nearly half protected against infection.
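The indicators above imply a few rough figures that are easy to check by hand. The sketch below is illustrative arithmetic only: the constants are the numbers quoted in this article, while the function name and the derived values are assumptions made here and are not figures reported by either team.

```python
# Figures quoted in the indicators above; the derived values are rough
# estimates, not numbers reported by either team.
PASTEUR_PROTEINS = 2_390_000   # predicted antiviral proteins
PASTEUR_GENOMES = 30_000       # "more than 30,000", so treated as a floor
UNLINKED_SHARE = 0.85          # share with no known link to immunity
MIT_SYSTEMS_TESTED = 94        # predicted systems cloned and tested
MIT_VALIDATION_RATE = 0.45     # share that protected against phages


def derived_figures() -> dict[str, float]:
    """Compute rough figures implied by the quoted indicators (hypothetical helper)."""
    return {
        # Upper bound on the average, because the genome count is a floor.
        "max_proteins_per_genome": PASTEUR_PROTEINS / PASTEUR_GENOMES,
        "unlinked_proteins": PASTEUR_PROTEINS * UNLINKED_SHARE,
        "validated_systems": MIT_SYSTEMS_TESTED * MIT_VALIDATION_RATE,
    }


if __name__ == "__main__":
    for name, value in derived_figures().items():
        print(f"{name}: {value:,.1f}")
```

On these inputs, the average comes to at most about 80 predicted proteins per genome, roughly 2.03 million proteins with no prior link to immunity, and about 42 of the 94 tested systems showing protection.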

Timeline

  1. Both studies featured in Nature's April 16 issue

    Publication

    Nature's Volume 652, Issue 8110 featured coverage of the two simultaneously published Science papers, bringing the findings to the journal's broad readership and cementing the moment as a milestone in microbiology.

  2. SNIPE defense system characterized at MIT

    Discovery

    Michael Laub's lab at MIT published the characterization of SNIPE, a membrane-bound nuclease system that rapidly degrades invading phage DNA — one of the first newly predicted systems to receive detailed structural and mechanistic analysis.

  3. Nature covers both studies as a 'treasure trove'

    Publication

    Nature published a news feature highlighting both teams' findings, quoting researchers who said the field had been 'massively underestimating' the number of bacterial defense systems.

  4. Both machine-learning studies posted as preprints

    Publication

    The DeWeirdt (MIT/Broad) and Mordret (Pasteur) teams posted their studies to bioRxiv on the same day, signaling their simultaneous development of AI-powered approaches to bacterial defense discovery.

  5. Known defense system count passes 100 families

    Research

    After successive rounds of discovery by multiple labs, the catalog of experimentally validated bacterial defense system families exceeded 100 — a tenfold increase from the roughly 10 families known before 2018.

  6. Sorek discovers 10 new antiphage defense systems

    Discovery

    Rotem Sorek's lab at the Weizmann Institute published a systematic scan of 50,000 bacterial genomes using the defense-island strategy, discovering 10 previously unknown immune systems. This paper opened the current era of rapid defense system discovery.

  7. CRISPR-Cas9 repurposed as a gene-editing tool

    Discovery

    Jennifer Doudna and Emmanuelle Charpentier published their landmark paper showing CRISPR-Cas9 could be programmed to cut any DNA sequence, transforming a bacterial defense mechanism into the most powerful gene-editing tool in history.

  8. Defense islands formally defined in bacterial genomes

    Research

    Eugene Koonin's team at the National Center for Biotechnology Information published the foundational paper describing 'defense islands' — genomic regions where antiviral genes cluster — establishing the search strategy later teams would automate.

  9. CRISPR proven as a bacterial immune system

    Discovery

    Rodolphe Barrangou and Philippe Horvath at Danisco demonstrated experimentally that bacteria use CRISPR to acquire resistance to viruses, confirming it as an adaptive immune system.

  10. CRISPR sequences first noticed in E. coli

    Discovery

    Yoshizumi Ishino and colleagues at Osaka University observed unusual repeated DNA sequences in E. coli. Their function would remain unknown for two decades.

  11. Restriction enzymes discovered in bacteria

    Discovery

    Werner Arber, Hamilton Smith, and Daniel Nathans identified bacterial enzymes that cut foreign DNA at specific sequences — the first characterized bacterial defense mechanism, later foundational to all of genetic engineering.

Scenarios

1. Next-generation gene-editing tools emerge from the new protein catalog

Discussed by: Feng Zhang (Broad Institute), Singularity Hub, Nature news coverage

The pattern holds: restriction enzymes became cloning tools, CRISPR became a gene editor, and now researchers systematically screen the new catalog for programmable nucleases, RNA-guided systems, or novel enzymatic activities suitable for biotechnology. Within two to five years, one or more of the newly discovered defense proteins is adapted into a molecular tool that complements or improves on CRISPR-Cas9 — potentially smaller, more specific, or targeting previously inaccessible genome features. Zhang's lab, which already discovered Fanzor and OMEGA systems through similar mining efforts, is actively pursuing this path.

2. Defense protein databases accelerate antiviral drug development

Discussed by: Nature news, News-Medical, Phys.org coverage

Pharmaceutical researchers use the databases to identify bacterial antiviral mechanisms that could be adapted against human viral pathogens. The proteins' billions of years of evolutionary refinement against viral attack make them candidates for novel antiviral therapies, diagnostics, or prophylactics — particularly for viruses where current treatments are limited. This path is longer than tool development, likely requiring five to ten years for clinical applications, but the sheer scale of the candidate pool improves the odds of finding something therapeutically useful.

3. Machine-learning discovery becomes the default method for finding biological mechanisms

Discussed by: Science editors, computational biologists quoted in coverage

The success of DefensePredictor and the Pasteur team's models — finding in minutes what took years via traditional methods — establishes AI-powered protein function prediction as a standard first step in microbiology. Labs apply similar approaches to other poorly understood functional categories in bacterial genomes, such as signaling systems, metabolic pathways, and symbiotic mechanisms. The five-minute screening capability effectively ends the era of discovering bacterial systems one at a time.

4. Most predicted defense proteins turn out to be false positives or biochemically intractable

Discussed by: Implicit in the 45% experimental validation rate; standard caution in computational biology

The 45% validation rate for the MIT team's predictions means more than half of the tested candidates did not protect bacteria under lab conditions. Even if a similarly high false-positive rate applied to the Pasteur team's 2.39 million predicted proteins, hundreds of thousands of genuine defense proteins would remain. The practical challenge of experimentally characterizing each one, however, could slow the translation to usable tools. The field may spend years sorting signal from noise in the expanded databases.
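The scale argument in this scenario can be made concrete with a short calculation. This is a back-of-envelope sketch, not an analysis from either study: the precision values are hypothetical, the function name is invented for illustration, and the 45% rate was measured by the MIT team on 94 cloned systems, so applying it to the Pasteur team's protein count is an assumption.

```python
# Back-of-envelope only: the 45% figure is the MIT team's validation rate for
# 94 cloned systems, not a measured precision for the Pasteur predictions.
PREDICTED_PROTEINS = 2_390_000  # Pasteur team's predicted antiviral proteins


def genuine_estimate(precision: float, predicted: int = PREDICTED_PROTEINS) -> int:
    """Return how many predictions would be genuine at an assumed precision."""
    if not 0.0 <= precision <= 1.0:
        raise ValueError("precision must be between 0 and 1")
    return round(predicted * precision)


if __name__ == "__main__":
    # Hypothetical precisions, chosen for illustration.
    for precision in (0.10, 0.25, 0.45):
        print(f"{precision:.0%} genuine -> {genuine_estimate(precision):,} proteins")
```

Even at an assumed 10% precision the estimate is 239,000 proteins, and at 45% it is about 1.08 million, which is why the bottleneck in this scenario is experimental characterization and not the size of the candidate pool.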

Historical Context

Restriction enzymes launch the biotech industry (1968–1978)

What Happened

Werner Arber, Hamilton Smith, and Daniel Nathans discovered that bacteria use restriction enzymes to chop up foreign viral DNA at specific sequences while protecting their own DNA through chemical modification. Herb Boyer and Stan Cohen used these enzymes in 1973 to cut and paste DNA from different organisms — the birth of recombinant DNA technology.

Outcome

Short Term

The trio shared the 1978 Nobel Prize in Physiology or Medicine. Boyer co-founded Genentech in 1976, launching the modern biotechnology industry.

Long Term

Restriction enzymes became the foundational toolkit for molecular biology, enabling DNA cloning, sequencing, forensics, and genetically engineered medicines — a multi-hundred-billion-dollar industry built on a bacterial defense mechanism.

Why It's Relevant Today

The pattern is identical: scientists discover how bacteria fight viruses, then repurpose the mechanism as a laboratory tool. The 2026 findings represent the largest-ever expansion of that source material.

CRISPR: from curiosity to Nobel Prize (1987–2020)

What Happened

Yoshizumi Ishino noticed strange repeated DNA sequences in E. coli in 1987. It took 20 years for Rodolphe Barrangou to prove these sequences were an adaptive immune system. In 2012, Jennifer Doudna and Emmanuelle Charpentier showed the system could be programmed to edit any gene. Within a year, Feng Zhang demonstrated it worked in human cells.

Outcome

Short Term

Doudna and Charpentier won the 2020 Nobel Prize in Chemistry. CRISPR-based therapies reached patients by 2023, when the first CRISPR medicine (Casgevy for sickle cell disease) was approved.

Long Term

CRISPR gene editing is now used across agriculture, medicine, and basic research worldwide. The $30 billion gene-editing market traces directly to an obscure bacterial defense mechanism that sat uncharacterized for two decades.

Why It's Relevant Today

CRISPR proves that a single bacterial defense system can reshape an entire industry. The new studies suggest there are thousands of undiscovered systems of comparable biochemical sophistication — any one of which could be the next CRISPR.

AlphaFold transforms protein science (2020–2022)

What Happened

DeepMind's AlphaFold2 solved the protein-folding problem in 2020, and in 2022 the company released predicted structures for nearly every known protein, more than 200 million in all. The achievement earned Demis Hassabis and John Jumper a share of the 2024 Nobel Prize in Chemistry.

Outcome

Short Term

Structural biologists gained instant access to protein shapes that previously required years of laboratory work to determine.

Long Term

AlphaFold established machine learning as a first-class tool in biology, creating the infrastructure and expectations that made protein language models like ESM2 — the backbone of DefensePredictor — possible.

Why It's Relevant Today

DefensePredictor and the Pasteur team's models are direct descendants of the AI-for-biology revolution that AlphaFold ignited. The same protein language models that predict structure now predict function, enabling bacterial defense discovery at scale.
