Helix builds North America's largest linked clinico-genomic dataset

New Capabilities

Precision health company crosses 500,000 records and launches its first AI discovery tool

Overview

Helix said its GenoSphere platform now holds more than 500,000 linked clinico-genomic records. That makes it the largest such dataset in North America. The San Mateo company also released the first of a planned set of artificial intelligence tools built to mine that data for drug targets.

A linked record pairs one patient's full genetic sequence with their long-term electronic health record. At half a million records, researchers can spot rare variants tied to disease and watch how patients with specific genotypes respond to treatment. That scale is what pharmaceutical companies pay for.
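
To make the scale argument concrete, here is a rough back-of-the-envelope sketch of how cohort size changes the number of people expected to carry a rare variant. The function name, the 1-in-10,000 allele frequency, and the Hardy-Weinberg assumption are illustrative choices, not figures from Helix.

```python
def expected_carriers(cohort_size: int, allele_frequency: float) -> float:
    """Expected number of people carrying at least one copy of a variant.

    Assumes Hardy-Weinberg proportions: each person has two independent
    copies of the gene, so P(carrier) = 1 - (1 - f)^2.
    """
    carrier_probability = 1.0 - (1.0 - allele_frequency) ** 2
    return cohort_size * carrier_probability


# Hypothetical variant found on 1 in 10,000 chromosomes (illustrative only).
ALLELE_FREQUENCY = 1e-4

for cohort_size in (100_000, 500_000):
    carriers = expected_carriers(cohort_size, ALLELE_FREQUENCY)
    print(f"{cohort_size:>7,} records -> about {carriers:.0f} expected carriers")
```

Under these assumptions, a 100,000-record cohort would contain roughly 20 carriers and a 500,000-record cohort roughly 100, which is the difference between an anecdote and a group large enough to compare outcomes.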

Why it matters

Drug discovery moves faster when companies can match genetic variants to real patient outcomes, and this dataset gives pharma half a million linked patient records to study.

Key Indicators

500K+ linked clinico-genomic records
Each record pairs full genetic sequencing with longitudinal electronic health record data.

4 disease areas in initial scope
Cardiovascular, metabolic, immunological, and inflammatory disorders.

1st AI tool in planned suite
Helix released its first AI discovery tool, with more queued for launch.

500K UK Biobank participants, for comparison
The UK Biobank, which began recruiting in 2006, is the global benchmark for linked population biobanks.

Timeline

July 2015 to June 2026

5 events, latest today
  1. Helix hits 500,000 records and launches AI tools

    Today Milestone

    GenoSphere crosses 500,000 linked records, the largest such dataset in North America. Helix releases the first of a planned suite of AI discovery tools.

  2. Helix crosses 100,000 linked records

    Milestone

    The GenoSphere dataset passes 100,000 linked clinico-genomic records, an early scale benchmark.

  3. Mayo Clinic Tapestry study begins

    Partnership

    Mayo Clinic and Helix announce Tapestry, a plan to sequence the exomes of 100,000 Mayo patients.

  4. Helix pivots from consumer to clinical

    Strategy

    Helix winds down its consumer DNA marketplace and refocuses on clinical sequencing through health-system partners.

  5. Helix launches as Illumina spinoff

    Founding

    Illumina spins out Helix with more than $100 million in initial funding to build a consumer DNA platform.

Historical Context

3 moments from history that rhyme with this story — and how they unfolded.

2006–2010

UK Biobank reaches 500,000 participants (2010)

The UK Biobank, funded chiefly by the Wellcome Trust, the Medical Research Council, and the UK Department of Health, enrolled 500,000 British adults between 2006 and 2010. Each gave blood and urine samples, agreed to genotyping, and consented to long-term linkage with their NHS health records.

Then

Researchers gained access to a population-scale resource linking genotype to clinical outcomes, available to academic users for modest access fees.

Now

UK Biobank has powered thousands of published studies and is the data source behind major pharma deals, including Regeneron's 2017 exome-sequencing partnership.

Why this matters now

Helix is building the equivalent for North America. The 500,000 threshold matters because UK Biobank set it as the scale needed for statistical power on rare variants.

December 2012

Amgen acquires deCODE genetics (2012)

Amgen paid $415 million for Iceland-based deCODE, which held genetic and health data on a large share of Iceland's population. The acquisition was driven by deCODE's ability to validate drug targets using its linked dataset.

Then

The deal set a market price for population-scale linked genetic databases.

Now

Amgen used deCODE data to support development of evolocumab, a PCSK9 inhibitor that became a multi-billion-dollar cardiovascular drug.

Why this matters now

This is the playbook Helix is selling: a linked dataset that lets a pharma buyer skip years of target validation. The deCODE precedent gives pharma a known return on investment.

July 2018

GSK invests $300 million in 23andMe for data access (2018)

GlaxoSmithKline invested $300 million in 23andMe as part of a four-year exclusive collaboration to use the company's database of consenting customers' genotypes for drug discovery.

Then

The deal validated consumer genetic data as a pharma research input and lifted 23andMe's pre-IPO valuation.

Now

GSK reportedly identified roughly 50 drug targets using the data, but 23andMe later struggled financially as consumer DNA demand softened.

Why this matters now

This shows both the upside Helix is chasing and the risk: pharma will pay for linked genetic data, but the dataset operator still needs a sustainable underlying business to keep growing it.
