The race to build non-Nvidia AI inference chips

Money Moves

A wave of startups raises billions on bets that custom silicon can run frontier AI models faster and cheaper than GPUs

Yesterday: Fractile closes $220M Series B

Overview

Nvidia sells roughly four out of every five chips running today's large AI models. Investors are now writing nine-figure checks on the bet that the workload coming next, running those models rather than training them, will move to different silicon.

Why it matters

If these chips work, the cost of running frontier AI drops sharply, and Nvidia's near-monopoly on the most valuable workload in computing weakens.


Key Indicators

$220M
Fractile Series B
Round led by Accel, Factorial Funds and Founders Fund, with participation from Conviction, Gigascale, Felicis and 8VC.
25x
Claimed inference speedup
Fractile says its first chip will run frontier models 25 times faster than current GPUs.
~90%
Claimed cost reduction
Fractile targets running frontier inference at roughly one-tenth the cost of GPU-based systems.
2027
Fractile launch year
Capital funds chip tape-out, software stack and early customer integrations ahead of a 2027 product launch.
~80%
Nvidia data-center AI share
Industry estimate of Nvidia's share of AI accelerator revenue — the share challengers are trying to chip away at.
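The headline figures above make two distinct claims that are easy to conflate: a ~90% cost reduction is the same statement as "one-tenth the cost," while the 25x speedup is a separate axis. A minimal back-of-envelope sketch, using an assumed (not reported) GPU baseline price:

```python
# Sanity check on the claimed figures. The GPU baseline below is an
# illustrative assumption, not a number from Fractile or Nvidia.

gpu_cost_per_m_tokens = 10.00   # assumed GPU serving cost, $ per 1M tokens
claimed_cost_reduction = 0.90   # Fractile's ~90% cost-reduction claim
claimed_speedup = 25            # Fractile's 25x throughput claim

# A 90% reduction is equivalent to running at 1/10th of the baseline cost.
fractile_cost = gpu_cost_per_m_tokens * (1 - claimed_cost_reduction)
print(f"implied cost: ${fractile_cost:.2f} per 1M tokens")

# Note the claims are independent: 25x per-chip throughput does not by
# itself imply 10x cheaper serving -- chip price, power draw and yield
# all enter the cost side separately.
```

The point of separating the two numbers is that either claim could hold in production without the other.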

Ambrose Bierce

(1842-1914) · Gilded Age · wit

Fictional AI pastiche; not a real quote.

"Nine-figure fortunes staked on the certainty that the king of a hill will presently be displaced — this is not investment but rather the ancient ritual of ambitious men paying to watch other ambitious men fail. That four in five chips bear one maker's mark is called a monopoly by the envious and an ecosystem by the beneficiary; that investors now wager two hundred millions on the word 'inference' suggests they have mastered the vocabulary of the future without troubling themselves to understand it."

Timeline

  1. Fractile closes $220M Series B

    Funding

    London-based Fractile raises $220M led by Accel, Factorial Funds and Founders Fund. Capital funds chip tape-out and software ahead of a 2027 commercial launch targeting 25x faster, ~90% cheaper frontier inference.

  2. Anthropic, Amazon deepen Trainium partnership

    Customer deal

    Amazon and Anthropic announce an expanded deal that puts more Claude inference on Amazon's Trainium chips. It is the clearest signal yet that frontier labs are willing to move workloads off Nvidia.

  3. Cerebras files for IPO

    Corporate filing

    Cerebras Systems files S-1 paperwork to go public, the first major Nvidia challenger to attempt the public markets. The offering is later delayed by regulatory review.

  4. Groq raises $640M at $2.8B valuation

    Funding

    Groq closes a major growth round led by BlackRock to scale its inference cloud and Language Processing Unit chips.

  5. Etched raises $120M for transformer-only chip

    Funding

Etched closes a $120M round to build Sohu, a chip designed to run only transformer-architecture models. The bet: specializing in a single architecture buys huge efficiency gains.

  6. Fractile founded in London

    Company formation

    Walid Mehri co-founds Fractile to build chips designed for AI inference, drawing on neural-network hardware research from Oxford.

  7. Google reveals the TPU at I/O

    Technology

    Google announces it has been running custom AI silicon, the Tensor Processing Unit, in production. It is the first public proof that purpose-built chips can outperform GPUs on AI workloads.

Scenarios

1

Fractile ships its first commercial chip on schedule

Fractile completes tape-out in 2026, manufactures silicon in 2027, and ships its first chip to at least one paying customer before year-end 2027. The full 25x speed and ~90% cost claims may not survive contact with production, but a credible launch with real customers keeps the company on the trajectory its investors are paying for.

Resolves by: 2027-12-31
Source: Fractile official announcement, Bloomberg or Reuters reporting
Discussed by: Bloomberg, Tech.eu, DatacenterDynamics coverage of the Series B
2

Major frontier lab signs a nine-figure non-Nvidia inference deal

Anthropic, OpenAI, Google, Meta or another frontier lab signs a publicly disclosed multi-year inference deal worth $100M or more with a non-Nvidia chip company, beyond existing hyperscaler-internal silicon. The Amazon-Anthropic Trainium expansion shows the path; the question is whether a comparable deal lands with one of the startup challengers.

Resolves by: 2026-12-31
Source: Public press release from a named frontier lab or Reuters/Bloomberg confirmation
Discussed by: The Information, Bloomberg, Reuters reporting on AI compute procurement
3

Hyperscaler acquires a major inference chip startup

AWS, Google, Microsoft, Meta or Oracle buys one of the well-funded inference chip startups to bring its design in-house. Acquirers get a finished design team and IP without paying the IPO premium; the startup gets an exit without proving its commercial model. This would repeat the pattern of the prior wave of AI hardware deals.

Resolves by: 2026-12-31
Source: Acquirer SEC filings or official press releases
Discussed by: The Information, Bloomberg M&A reporting
4

Nvidia data-center share holds above 75% through 2026

Despite the funding wave, Nvidia retains more than three-quarters of AI accelerator revenue through calendar 2026. Most challengers are still pre-product or sub-scale, software lock-in slows migration, and Blackwell-generation supply meets the marginal demand. The challenger thesis remains a 2027-2028 story rather than a 2026 one.

Resolves by: 2027-04-30
Source: IDC or Gartner AI accelerator market share report covering calendar year 2026
Discussed by: Gartner, IDC analyst notes; Bernstein, Morgan Stanley research
5

At least one well-funded inference startup folds or fire-sells

One of the inference chip companies that has raised $100M or more shuts down, sells for less than total capital raised, or returns capital to investors. Inference silicon is capital-intensive and timing-sensitive; not every funded entrant will reach a commercial chip. A high-profile failure would reset valuations across the category.

Resolves by: 2027-12-31
Source: Company announcement, Reuters/Bloomberg/FT reporting, regulatory filings
Discussed by: The Information, Financial Times venture coverage

Historical Context

Google launches the TPU (2016)

May 2016

What Happened

At Google I/O, Google revealed it had been running a custom AI chip called the Tensor Processing Unit in its data centers since 2015. It was the first time a major operator publicly claimed that purpose-built silicon could beat Nvidia GPUs on AI workloads at scale.

Outcome

Short Term

The TPU opened a credible alternative path for AI compute and validated the thesis that workload-specific chips could compete with general-purpose GPUs.

Long Term

TPUs became central to Google's AI infrastructure and inspired a generation of custom-silicon efforts at Amazon (Trainium, Inferentia), Microsoft (Maia) and Meta (MTIA). It also made the inference chip startup category investable.

Why It's Relevant Today

Every Fractile, Groq and Etched pitch deck traces back to the TPU's central claim: GPUs are not the right shape for AI, and a purpose-built chip can win. The 2026 funding race is the venture-backed extension of that 2016 idea.

The 1990s graphics chip wars

1995-2002

What Happened

Through the late 1990s, a crowded field of graphics chip makers, including 3dfx, ATI, Matrox, S3, Trident and a young Nvidia, fought to define the PC 3D graphics market. Each company pitched a different architecture and a different bet on what gamers and developers would adopt.

Outcome

Short Term

Pricing collapsed, marginal players failed, and the market consolidated faster than investors expected. 3dfx, the early leader, went bankrupt by 2002.

Long Term

Two winners, Nvidia and ATI (later AMD), emerged with durable share. The lesson: in chip categories with high R&D costs and software lock-in, late-cycle consolidation is brutal and most well-funded entrants do not survive.

Why It's Relevant Today

Today's inference chip field looks structurally similar: many funded entrants, competing architectures, no clear winner, and a software moat held by the incumbent. The 1990s suggest the next five years will end with two or three survivors, not ten.

Amazon launches AWS Inferentia (2019)

November 2019

What Happened

Amazon unveiled Inferentia, a custom inference chip built in-house for AWS, and later added Trainium for training. The chips were aimed at lowering Amazon's own compute costs and offering customers a cheaper alternative to Nvidia inside AWS.

Outcome

Short Term

Inferentia gained limited adoption initially as customers stuck with familiar GPU tooling.

Long Term

By the mid-2020s, Trainium and Inferentia were central to AWS's AI pitch and underpinned the deeper Amazon-Anthropic partnership announced in 2024. Custom hyperscaler silicon proved viable at scale.

Why It's Relevant Today

Hyperscalers can and do build their own inference chips. That sets a ceiling on how much of the inference market the independent startups can address: the biggest buyers may bring the workload in-house rather than buy from Fractile or Groq.
