The race to build non-Nvidia AI inference chips

Money Moves

A wave of startups raises billions on bets that custom silicon can run frontier AI models faster and cheaper than GPUs

Yesterday: Fractile closes $220M Series B

Overview

Nvidia sells roughly four out of every five chips running today's large AI models. Investors are now writing nine-figure checks on the bet that the workload coming next, running those models rather than training them, will move to different silicon.

Why it matters

If these chips work, the cost of running frontier AI drops sharply, and Nvidia's near-monopoly on the most valuable workload in computing weakens.


Key Indicators

$220M
Fractile Series B
Round led by Accel, Factorial Funds and Founders Fund, with participation from Conviction, Gigascale, Felicis and 8VC.
25x
Claimed inference speedup
Fractile says its first chip will run frontier models 25 times faster than current GPUs.
~90%
Claimed cost reduction
Fractile targets running frontier inference at roughly one-tenth the cost of GPU-based systems.
2027
Fractile launch year
Capital funds chip tape-out, software stack and early customer integrations ahead of a 2027 product launch.
~80%
Nvidia data-center AI share
Industry estimate of Nvidia's share of AI accelerator revenue — the share challengers are trying to chip away at.
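The headline figures above make two distinct claims that are easy to conflate: a ~90% cost reduction is the same statement as "one-tenth the cost," while the 25x speedup is a separate axis. A minimal back-of-envelope sketch, using an assumed (not reported) GPU baseline price:

```python
# Sanity check on the claimed figures. The GPU baseline below is an
# illustrative assumption, not a number from Fractile or Nvidia.

gpu_cost_per_m_tokens = 10.00   # assumed GPU serving cost, $ per 1M tokens
claimed_cost_reduction = 0.90   # Fractile's ~90% cost-reduction claim
claimed_speedup = 25            # Fractile's 25x throughput claim

# A 90% reduction is equivalent to running at 1/10th of the baseline cost.
fractile_cost = gpu_cost_per_m_tokens * (1 - claimed_cost_reduction)
print(f"implied cost: ${fractile_cost:.2f} per 1M tokens")

# Note the claims are independent: 25x per-chip throughput does not by
# itself imply 10x cheaper serving -- chip price, power draw and yield
# all enter the cost side separately.
```

The point of separating the two numbers is that either claim could hold in production without the other.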

Ambrose Bierce

(1842-1914) · Gilded Age · wit

Fictional AI pastiche; not a real quote.

"Nine-figure fortunes staked on the certainty that the king of a hill will presently be displaced — this is not investment but rather the ancient ritual of ambitious men paying to watch other ambitious men fail. That four in five chips bear one maker's mark is called a monopoly by the envious and an ecosystem by the beneficiary; that investors now wager two hundred millions on the word 'inference' suggests they have mastered the vocabulary of the future without troubling themselves to understand it."

Timeline

  1. Fractile closes $220M Series B

    Funding

    London-based Fractile raises $220M led by Accel, Factorial Funds and Founders Fund. Capital funds chip tape-out and software ahead of a 2027 commercial launch targeting 25x faster, ~90% cheaper frontier inference.

  2. Anthropic, Amazon deepen Trainium partnership

    Customer deal

    Amazon and Anthropic announce an expanded deal that puts more Claude inference on Amazon's Trainium chips. It is the clearest signal yet that frontier labs are willing to move workloads off Nvidia.

  3. Cerebras files for IPO

    Corporate filing

    Cerebras Systems files S-1 paperwork to go public, the first major Nvidia challenger to attempt the public markets. The offering is later delayed by regulatory review.

  4. Groq raises $640M at $2.8B valuation

    Funding

    Groq closes a major growth round led by BlackRock to scale its inference cloud and Language Processing Unit chips.

  5. Etched raises $120M for transformer-only chip

    Funding

Etched closes a $120M round to build Sohu, a chip designed to run only transformer-architecture models. The bet: specializing in a single architecture buys huge efficiency gains.

  6. Fractile founded in London

    Company formation

    Walid Mehri co-founds Fractile to build chips designed for AI inference, drawing on neural-network hardware research from Oxford.

  7. Google reveals the TPU at I/O

    Technology

    Google announces it has been running custom AI silicon, the Tensor Processing Unit, in production. It is the first public proof that purpose-built chips can outperform GPUs on AI workloads.

Scenarios

1

Fractile ships its first commercial chip on schedule

Fractile completes tape-out in 2026, manufactures silicon in 2027, and ships its first chip to at least one paying customer before year-end 2027. The full 25x speed and ~90% cost claims may not survive contact with production, but a credible launch with real customers keeps the company on the trajectory its investors are paying for.

Resolves by: 2027-12-31
Source: Fractile official announcement, Bloomberg or Reuters reporting
Discussed by: Bloomberg, Tech.eu, DatacenterDynamics coverage of the Series B
2

Major frontier lab signs a nine-figure non-Nvidia inference deal

Anthropic, OpenAI, Google, Meta or another frontier lab signs a publicly disclosed multi-year inference deal worth $100M or more with a non-Nvidia chip company, beyond existing hyperscaler-internal silicon. The Amazon-Anthropic Trainium expansion shows the path; the question is whether a comparable deal lands with one of the startup challengers.

Resolves by: 2026-12-31
Source: Public press release from a named frontier lab or Reuters/Bloomberg confirmation
Discussed by: The Information, Bloomberg, Reuters reporting on AI compute procurement
3

Hyperscaler acquires a major inference chip startup

AWS, Google, Microsoft, Meta or Oracle buys one of the well-funded inference chip startups to bring its design in-house. Acquirers get a finished design team and IP without paying the IPO premium; the startup gets an exit without proving its commercial model. This would repeat the pattern of the prior wave of AI hardware deals.

Resolves by: 2026-12-31
Source: Acquirer SEC filings or official press releases
Discussed by: The Information, Bloomberg M&A reporting
4

Nvidia data-center share holds above 75% through 2026

Despite the funding wave, Nvidia retains more than three-quarters of AI accelerator revenue through calendar 2026. Most challengers are still pre-product or sub-scale, software lock-in slows migration, and Blackwell-generation supply meets the marginal demand. The challenger thesis remains a 2027-2028 story rather than a 2026 one.

Resolves by: 2027-04-30
Source: IDC or Gartner AI accelerator market share report covering calendar year 2026
Discussed by: Gartner, IDC analyst notes; Bernstein, Morgan Stanley research
5

At least one well-funded inference startup folds or fire-sells

One of the inference chip companies that has raised $100M or more shuts down, sells for less than total capital raised, or returns capital to investors. Inference silicon is capital-intensive and timing-sensitive; not every funded entrant will reach a commercial chip. A high-profile failure would reset valuations across the category.

Resolves by: 2027-12-31
Source: Company announcement, Reuters/Bloomberg/FT reporting, regulatory filings
Discussed by: The Information, Financial Times venture coverage

Historical Context

Google launches the TPU (2016)

May 2016

What Happened

At Google I/O, Google revealed it had been running a custom AI chip called the Tensor Processing Unit in its data centers since 2015. It was the first time a major operator publicly claimed that purpose-built silicon could beat Nvidia GPUs on AI workloads at scale.

Outcome

Short Term

The TPU opened a credible alternative path for AI compute and validated the thesis that workload-specific chips could compete with general-purpose GPUs.

Long Term

TPUs became central to Google's AI infrastructure and inspired a generation of custom-silicon efforts at Amazon (Trainium, Inferentia), Microsoft (Maia) and Meta (MTIA). It also made the inference chip startup category investable.

Why It's Relevant Today

Every Fractile, Groq and Etched pitch deck traces back to the TPU's central claim: GPUs are not the right shape for AI, and a purpose-built chip can win. The 2026 funding race is the venture-backed extension of that 2016 idea.

The 1990s graphics chip wars

1995-2002

What Happened

Through the late 1990s, a crowded field of graphics chip makers, including 3dfx, ATI, Matrox, S3, Trident and a young Nvidia, fought to define the PC 3D graphics market. Each company pitched a different architecture and a different bet on what gamers and developers would adopt.

Outcome

Short Term

Pricing collapsed, marginal players failed, and the market consolidated faster than investors expected. 3dfx, the early leader, went bankrupt by 2002.

Long Term

Two winners, Nvidia and ATI (later AMD), emerged with durable share. The lesson: in chip categories with high R&D costs and software lock-in, late-cycle consolidation is brutal and most well-funded entrants do not survive.

Why It's Relevant Today

Today's inference chip field looks structurally similar: many funded entrants, competing architectures, no clear winner, and a software moat held by the incumbent. The 1990s suggest the next five years will end with two or three survivors, not ten.

Amazon launches AWS Inferentia (2019)

November 2019

What Happened

Amazon unveiled Inferentia, a custom inference chip built in-house for AWS, and later added Trainium for training. The chips were aimed at lowering Amazon's own compute costs and offering customers a cheaper alternative to Nvidia inside AWS.

Outcome

Short Term

Inferentia gained limited adoption initially as customers stuck with familiar GPU tooling.

Long Term

By the mid-2020s, Trainium and Inferentia were central to AWS's AI pitch and underpinned the deeper Amazon-Anthropic partnership announced in 2024. Custom hyperscaler silicon proved viable at scale.

Why It's Relevant Today

Hyperscalers can and do build their own inference chips. That sets a ceiling on how much of the inference market the independent startups can address: the biggest buyers may bring the workload in-house rather than buy from Fractile or Groq.
