Nvidia's generational GPU leaps reshape who controls AI infrastructure

New Capabilities

From Hopper to Vera Rubin, each architecture cycle compresses timelines, raises performance floors, and forces competitors to redesign their strategies

March 16th, 2026: Vera Rubin, Rubin CPX, and NemoClaw launched at GTC 2026

Overview

Nvidia has spent four years on an annual architecture cadence that no semiconductor company has sustained before. At GTC 2026, Jensen Huang unveiled Vera Rubin, a platform whose Rubin GPU delivers 50 petaflops of inference (up to five times Blackwell's performance) at a claimed one-tenth the cost per token, and NemoClaw, an open-source platform letting companies deploy autonomous AI agents without cloud-provider lock-in.

The announcements show a strategic shift: as Google, Amazon, Microsoft, and OpenAI design custom chips to reduce Nvidia dependence, Huang is layering software platforms on top of hardware to make Nvidia's ecosystem harder to leave. For three years, Nvidia has dominated AI hardware with roughly 85 percent of the accelerator market. Vera Rubin ships in the second half of 2026, with a gigawatt-scale deployment deal with Mira Murati's Thinking Machines Lab signaling where the first systems will land.

Key Indicators

50 PFLOPS: Rubin GPU inference performance
Each Rubin GPU delivers 50 petaflops of NVFP4 inference, a 3.3x to 5x improvement over Blackwell Ultra.

10x: Inference cost reduction claimed
Nvidia claims a tenfold reduction in per-token inference cost for mixture-of-experts models compared to Blackwell.

~85%: Nvidia's AI accelerator market share
Nvidia holds approximately 85 percent of the AI GPU market by revenue, though custom chips from hyperscalers are growing faster.

$215.9B: Nvidia fiscal 2026 revenue
Full-year revenue for fiscal year ending January 2026, up 65 percent year-over-year, driven almost entirely by data center sales.

1 GW: Thinking Machines Lab deployment
Nvidia committed at least one gigawatt of Vera Rubin systems to Mira Murati's AI startup under a multiyear partnership.
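
The multiples quoted in these indicators imply baseline figures that the article does not state directly. The sketch below is a rough back-of-envelope check, not an official calculation: it takes the reported numbers at face value and assumes each multiple applies to a like-for-like metric. The helper name `implied_baseline` and the variable names are illustrative only.

```python
# Back-of-envelope arithmetic on the key indicators above.
# Inputs are the article's reported or claimed figures; every derived
# value is an implied estimate, not a number published by Nvidia.

RUBIN_PFLOPS = 50.0          # NVFP4 inference per Rubin GPU (reported)
SPEEDUP_RANGE = (3.3, 5.0)   # claimed improvement over Blackwell Ultra
FY2026_REVENUE_B = 215.9     # fiscal 2026 revenue in $ billions (reported)
YOY_GROWTH = 0.65            # 65 percent year-over-year growth (reported)
COST_REDUCTION = 10.0        # claimed per-token cost reduction, MoE models


def implied_baseline(new_value: float, multiple: float) -> float:
    """Return the baseline that, scaled by `multiple`, yields `new_value`."""
    if multiple <= 0:
        raise ValueError("multiple must be positive")
    return new_value / multiple


# A 5x speedup implies the lower baseline, a 3.3x speedup the higher one.
blackwell_low = implied_baseline(RUBIN_PFLOPS, SPEEDUP_RANGE[1])
blackwell_high = implied_baseline(RUBIN_PFLOPS, SPEEDUP_RANGE[0])

# Growth of 65 percent means this year's revenue is 1.65x last year's.
prior_revenue_b = implied_baseline(FY2026_REVENUE_B, 1 + YOY_GROWTH)

# A tenfold cost reduction leaves one-tenth of the original per-token cost.
relative_cost = 1 / COST_REDUCTION

print(f"Implied Blackwell Ultra range: {blackwell_low:.1f} to {blackwell_high:.1f} PFLOPS")
print(f"Implied fiscal 2025 revenue: ${prior_revenue_b:.1f}B")
print(f"Per-token cost relative to Blackwell: {relative_cost:.0%}")
```

Under those assumptions the claimed range puts Blackwell Ultra at roughly 10 to 15 petaflops on the same metric, and the growth rate implies prior-year revenue of about $131 billion.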

Voices

Curated perspectives — historical figures and your fellow readers.

Benjamin Franklin

(1706-1790) · Enlightenment · wit

Fictional AI pastiche — not real quote.

"A man who sells both the lightning rod and the thunder hath discovered the surest path to prosperity; and I observe that Mr. Huang, having first made himself indispensable to those who would harness the electric mind, now wisely ensures that even those who build their own rods must still buy his clouds."

Timeline

11 events, March 2022 to March 2026. Latest: March 16th, 2026
  1. Vera Rubin, Rubin CPX, and NemoClaw launched at GTC 2026

    Product Launch

    Jensen Huang's keynote to 39,000 attendees officially debuted six new chips, including the Rubin GPU (50 petaflops inference) and Rubin CPX for massive-context processing, along with NemoClaw, an open-source enterprise AI agent platform that works across any hardware vendor.

  2. Nvidia and Thinking Machines Lab announce gigawatt partnership

    Partnership

    Nvidia committed at least one gigawatt of Vera Rubin systems to Thinking Machines Lab in a multiyear deal, alongside a significant equity investment in the startup.

  3. Nvidia reports $215.9 billion in fiscal 2026 revenue

    Financial

    Data center revenue reached $62.3 billion in the fourth quarter alone, up 75 percent year-over-year. The company guided for $78 billion in Q1 fiscal 2027.

  4. Thinking Machines Lab loses co-founders to OpenAI

    Personnel

    Co-founders Barret Zoph and Luke Metz returned to OpenAI, along with other senior researchers, creating a talent retention challenge for Murati's startup.

  5. Huang confirms Vera Rubin in full production at CES

    Production Milestone

    Nvidia announced that Vera Rubin NVL72 systems had entered full production, with specs upgraded to 288GB of HBM4 per GPU and 22 TB/s memory bandwidth—a 70 percent leap from earlier figures.

  6. Thinking Machines Lab raises $2 billion seed round

    Funding

    The startup closed one of the largest seed rounds in history at a $12 billion valuation, with Andreessen Horowitz, Accel, Nvidia, and AMD participating.

  7. Nvidia reveals Vera Rubin and Feynman roadmap at GTC 2025

    Roadmap

    Huang disclosed the Vera Rubin platform for 2026 and the Feynman architecture for 2028, committing to an annual cadence of generational GPU leaps.

  8. Thinking Machines Lab publicly launches

    Company Launch

    Murati announced a public benefit corporation focused on collaborative AI, quickly attracting investor interest from major venture firms and chipmakers.

  9. Mira Murati departs OpenAI

    Personnel

    OpenAI's chief technology officer left to pursue independent work, later founding Thinking Machines Lab.

  10. Blackwell architecture debuts at GTC 2024

    Product Launch

    Jensen Huang launched the Blackwell GPU with 208 billion transistors across two chiplets, doubling NVLink bandwidth and adding native support for sub-8-bit data types. It was Nvidia's first multi-chip GPU design.

  11. Nvidia announces Hopper architecture

    Product Launch

    Nvidia unveiled the H100 GPU based on the Hopper architecture, introducing FP8 precision and NVLink 4.0. Demand for H100s quickly outstripped supply, with lead times stretching past a year.

Historical Context

3 moments from history that rhyme with this story — and how they unfolded.

Intel's Itanium and the x86 disruption that wasn't (2001-2012)

Intel launched Itanium in 2001 as a clean-break replacement for x86 processors, expecting enterprise customers to migrate to the new architecture. AMD responded with x86-64, extending the existing architecture to handle 64-bit workloads without requiring customers to rewrite their software. Itanium ultimately sold fewer than one million units while x86-64 became the industry standard.

Then

Intel spent billions developing Itanium while AMD captured server market share by maintaining backward compatibility.

Now

The episode demonstrated that software ecosystem lock-in matters more than raw hardware performance—customers chose the architecture that preserved their existing code investment.

Why this matters now

Nvidia's CUDA ecosystem and now NemoClaw follow the same logic: performance leadership matters, but the software layer that keeps developers from switching may be the more durable competitive advantage against custom ASICs.

Qualcomm's baseband modem dominance and Apple's breakaway attempt (2017-present)

Apple spent years trying to replace Qualcomm's modem chips with in-house designs, partnering with Intel and later developing its own cellular modems. Despite multi-billion-dollar investment, Apple repeatedly delayed its custom modem rollout due to the difficulty of matching Qualcomm's integrated performance across cellular standards. Qualcomm maintained over 50 percent market share throughout.

Then

Apple's Intel-sourced modems underperformed, and the company settled a bitter patent dispute with Qualcomm in 2019.

Now

The case showed that even the world's most valuable company found it extremely difficult to replicate a dominant chipmaker's performance—though Apple eventually shipped its first in-house modem in 2025.

Why this matters now

Hyperscalers building custom AI chips face a similar challenge: Nvidia's integrated hardware-software stack is difficult to replicate, but well-resourced companies with enough volume may eventually succeed at narrower workloads like inference.

Amazon Web Services creates the cloud computing market (2006-2010)

Amazon launched Elastic Compute Cloud in 2006, offering commodity server access on demand. Traditional hardware vendors like Sun Microsystems and HP initially dismissed the model, arguing enterprises would always want to own their own iron. By 2010, AWS had established a platform ecosystem—storage, databases, networking—that made switching costs high even though the underlying hardware was generic.

Then

AWS grew from zero to $1 billion in revenue in roughly four years while incumbents scrambled to respond.

Now

The platform layer, not the hardware, became the defensible business. Sun Microsystems was acquired by Oracle in 2010 for a fraction of its peak value.

Why this matters now

NemoClaw signals that Nvidia is learning from the cloud playbook: hardware cycles come and go, but the platform layer that enterprises build their workflows on creates switching costs that outlast any single chip generation.
