Cloudflare’s 2025 outages expose fragility in internet infrastructure

Built World

Back-to-back global disruptions from 2025 through early 2026—tied to configuration errors, maintenance, routing failures, and latent bugs—raise questions about the web's centralized backbone

February 4th, 2026: Cloudflare edge instability outage from latent bot mitigation bug

Overview

From November 2025 through February 2026, Cloudflare suffered four major global outages: a three-hour disruption on November 18, 2025; a roughly 25-minute WAF misconfiguration incident on December 5, 2025; a BGP routing failure on January 22, 2026; and edge network instability on February 4, 2026. These events temporarily took down services such as LinkedIn, Zoom, Shopify, Coinbase, ChatGPT, X, and Spotify, which rely on Cloudflare's edge network, content delivery, and security services, and at their peaks disrupted a significant portion of global HTTP traffic.

The incidents trace to internal causes: configuration changes made to mitigate React2Shell (CVE-2025-55182), datacenter maintenance, misconfigured routing policies, and latent bugs exposed by routine updates. With four major outages in four months, customers, investors, and analysts are scrutinizing Cloudflare's change management and redundancy, and the pattern underscores the internet's dependence on a few large providers.

Key Indicators

4
Major global Cloudflare outages since Nov. 2025
Incidents on Nov. 18 and Dec. 5, 2025, and on Jan. 22 and Feb. 4, 2026, highlight recurring reliability challenges across the proxy, WAF, routing, and edge layers.
≈28%
Share of Cloudflare HTTP traffic disrupted on Dec. 5, 2025
Cloudflare and independent reporting estimate that roughly 25–30% of HTTP traffic on its network failed during the WAF misconfiguration incident, impacting millions of users worldwide.
A few hours each
Duration of Jan. 22 and Feb. 4, 2026 outages
A BGP route leak on Jan. 22 and edge instability on Feb. 4 caused widespread access failures for dependent platforms; in each case service recovered after several hours.
81M/s
Cloudflare’s global HTTP request volume
Cloudflare handles on the order of 81 million HTTP requests per second, meaning failures at its edge affect a significant fraction of all web traffic.

Timeline

November 2025 to February 2026 · 10 events, newest first
  1. Cloudflare edge instability outage from latent bot mitigation bug

    Latest Outage

    A routine configuration change exposed a latent bug in Cloudflare’s bot mitigation service, causing cascading network degradation, 500 errors, and access failures for platforms including ChatGPT, X, and Spotify. Services were restored after several hours; CTO Dane Knecht apologized and confirmed that no cyberattack was involved.

  2. Cloudflare status page confirms resolution of edge outage

    Public Statement

    Cloudflare announces that a fix has been implemented for the network issue affecting its Dashboard, APIs, and traffic; the incident coincides with scheduled maintenance in US datacenters (Detroit, Chicago). The company commits to a detailed incident report.

  3. Cloudflare BGP routing failure disrupts global access

    Outage

    A misconfigured automated policy caused a BGP route leak, sending traffic along incorrect paths and making applications unreachable even though they were online. The leak affected DNS, routing, and security gateways for millions of services.

  4. Post-incident analyses highlight systemic risk from centralized infrastructure

    Analysis

    Media and security analysts compare Cloudflare’s twin outages to recent disruptions tied to AWS and the 2024 CrowdStrike configuration disaster, arguing that the concentration of traffic and security controls in a small number of providers has created "too big to fail" single points of failure for the global internet. Cloudflare faces renewed scrutiny from customers and investors over whether its internal safeguards and testing practices are adequate for a system of its scale.

  5. Cloudflare outage triggered by WAF body-parsing change

    Outage

    While rolling out changes to how its Web Application Firewall parses HTTP request bodies—intended to better detect React2Shell exploit payloads—Cloudflare introduces a configuration that breaks request processing on older FL1 proxies using its Managed Ruleset. The misconfiguration causes a flood of HTTP 500 errors and briefly knocks out around 25–30% of HTTP traffic on its network, affecting platforms such as Zoom, LinkedIn, Shopify, Coinbase, Claude AI, Fortnite, and others. Engineers quickly identify the issue and roll back the change, restoring service within about 25 minutes.

  6. Cloudflare issues status update and confirms no cyberattack

    Public Statement

    Cloudflare’s status page and social media channels confirm that the incident has been resolved and state that a WAF configuration change—deployed to mitigate an industry-wide React Server Components vulnerability—made its network unavailable for several minutes. The company stresses that the outage was not caused by a cyberattack and promises a detailed blog post and a review of its global configuration systems.

  7. Security community reports active exploitation of React2Shell

    Security Incident

    Within a day of public disclosure, security researchers and cloud providers report that Chinese state-linked threat groups are actively exploiting React2Shell in the wild. Proof-of-concept exploits circulate publicly, pushing vendors and infrastructure providers, including Cloudflare, to accelerate deployment of mitigations.

  8. Emergency WAF rule released for React2Shell vulnerability

    Security Mitigation

    Cloudflare publishes an emergency Web Application Firewall rule update tied to CVE-2025-55182, a critical remote code execution vulnerability in React Server Components known as React2Shell. The rules are designed to block unsafe deserialization patterns linked to the exploit.

  9. Cloudflare issues detailed postmortem and promises reforms

    Public Statement

    In a technical blog post, Cloudflare explains that the November 18 outage was caused by a bug in new proxy code that resulted in unhandled errors and 5xx responses. The company apologizes, acknowledging that it "failed" its customers and the broader internet, and pledges to improve testing, staged rollouts, and safeguards around its configuration systems.

  10. First major 2025 Cloudflare outage disrupts large swath of the web

    Outage

    Cloudflare suffers a widespread outage attributed to a configuration error and unhandled panic in its FL2 Rust-based proxy. The incident lasts several hours and impacts major platforms including X, Spotify, ChatGPT, and popular online games. Cloudflare’s dashboard, login systems, and some internal services are also affected.
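
The safeguards named in the timeline above (staged rollouts, health checks, and fast rollback) can be illustrated with a minimal Python sketch. This is not Cloudflare's deployment system: every name here (`staged_rollout`, `RolloutResult`, the node labels, the 1% error threshold) is a hypothetical chosen for illustration. The idea is that a change reaches a small group of machines first and is reverted everywhere if any group starts returning errors.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class RolloutResult:
    completed: bool      # True if every stage passed its health check
    stages_applied: int  # number of stages that kept the new config
    rolled_back: bool    # True if a failing stage triggered a rollback


def staged_rollout(
    stages: Sequence[Sequence[str]],
    apply: Callable[[str], None],
    revert: Callable[[str], None],
    error_rate: Callable[[str], float],
    max_error_rate: float = 0.01,  # hypothetical threshold
) -> RolloutResult:
    """Apply a change stage by stage; revert all touched nodes on the first unhealthy stage."""
    touched = []
    for index, stage in enumerate(stages):
        for node in stage:
            apply(node)
            touched.append(node)
        # Health gate: check the nodes changed in this stage before widening the rollout.
        if any(error_rate(node) > max_error_rate for node in stage):
            for node in reversed(touched):
                revert(node)
            return RolloutResult(False, index, True)
    return RolloutResult(True, len(stages), False)


# In-memory stand-in for a fleet; "legacy-1" is a hypothetical node that
# fails only once it receives the new configuration.
config = {}


def apply(node):
    config[node] = "new"


def revert(node):
    config[node] = "old"


def error_rate(node):
    return 0.5 if node == "legacy-1" and config.get(node) == "new" else 0.0


stages = [["canary-1"], ["edge-1", "legacy-1"], ["edge-2", "edge-3"]]
result = staged_rollout(stages, apply, revert, error_rate)
print(result)
```

In this toy run the canary stage passes, the second stage fails its health check, and the change is reverted before it reaches the final stage. A real system would also need to decide how long to observe each stage and which signals count as unhealthy.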

Historical Context

3 moments from history that rhyme with this story — and how they unfolded.

2024-07-19

2024 CrowdStrike Configuration Update Outage

On July 19, 2024, cybersecurity firm CrowdStrike pushed a faulty configuration update to its Falcon sensor software for Windows systems. The update caused Windows machines to enter bootloops or crash with "blue screens of death," disrupting more than eight million computers worldwide and affecting airlines, banks, hospitals, broadcasters, and government agencies before a fix was deployed.

Then

Flights were grounded, services interrupted, and CrowdStrike’s stock price fell sharply. The company worked with customers to manually remediate affected systems, a process that took days for some organizations.

Now

CrowdStrike faced lawsuits (including from Delta Air Lines) and regulatory scrutiny but retained much of its customer base. The incident became a case study in how a single misconfigured security update at a dominant provider can have global consequences.

Why this matters now

Like the 2025 Cloudflare outages, the CrowdStrike incident shows how flawed updates from centralized security infrastructure can ripple across critical services worldwide. Both cases highlight the need for stricter testing, staged deployment, and architectural safeguards around high-privilege security components.

2021-06-08

Fastly CDN Outage of June 2021

In June 2021, a configuration issue at content delivery network Fastly triggered a major outage that briefly took down a wide array of high-profile websites, including government portals, news outlets like the BBC, and major e-commerce and streaming services. A single customer configuration change exposed a latent bug in Fastly’s CDN software, causing widespread 503 errors until the change was rolled back.

Then

Fastly restored service within about an hour and issued a public explanation and apology. Customers experienced temporarily inaccessible sites and lost transactions.

Now

The outage underscored the extent to which CDNs had become critical infrastructure and drove more discussion about multi-CDN strategies and the risks of single points of failure.

Why this matters now

Fastly’s outage is an early, smaller-scale analogue to Cloudflare’s 2025 incidents: both involve edge networks where a configuration change propagates rapidly and unexpectedly breaks large portions of the web. It underlines how subtle bugs in shared infrastructure can have outsized impact.

2021-10-04

Facebook Global Outage of October 2021

On October 4, 2021, Facebook and its services—including Instagram, WhatsApp, and Messenger—went offline globally for six to seven hours due to a misconfiguration in its backbone network that withdrew its BGP routes, effectively removing its DNS and services from the internet.

Then

Billions of users lost access to communication and social platforms; some businesses reliant on WhatsApp and Facebook tools were unable to operate normally during the outage.

Now

Facebook revamped aspects of its network change procedures and incident response. The event became a prominent example of how a single configuration error at a large platform can have global social and economic effects.

Why this matters now

The Facebook outage illustrates the systemic risks of centralized control planes and global configuration updates, structural dynamics similar to those behind Cloudflare’s outages even though the technologies differ. It reinforces that operational discipline and safeguarded rollout mechanisms are critical at internet scale.
