Cloudflare’s 2025 outages expose fragility in internet infrastructure

Built World
By Newzino Staff

Back-to-back global disruptions from 2025 through early 2026—tied to configuration errors, maintenance, routing failures, and latent bugs—raise escalating questions about the web’s centralized backbone

February 4th, 2026: Cloudflare edge instability outage from latent bot mitigation bug

Overview

From November 2025 through February 2026, Cloudflare suffered multiple major global outages: a roughly three-hour disruption on November 18, 2025, a 25–30 minute WAF misconfiguration incident on December 5, 2025, a BGP routing failure on January 22, 2026, and edge network instability on February 4, 2026. These events temporarily took down services such as LinkedIn, Zoom, Shopify, Coinbase, ChatGPT, X, and Spotify that rely on Cloudflare’s edge network, content delivery, and security services, disrupting a substantial share of global HTTP traffic at the peak of each incident.

The incidents trace to internal issues including configuration changes made to mitigate React2Shell (CVE-2025-55182), datacenter maintenance, misconfigured routing policies, and latent bugs exposed by routine updates. With Cloudflare facing four major outages in under four months, the pattern underscores the systemic risk of deep centralization and has intensified scrutiny from customers, investors, and analysts of the company's change management and redundancy, and of the internet's reliance on a small number of providers.

Key Indicators

4
Major global Cloudflare outages since Nov. 2025
Incidents on Nov. 18 and Dec. 5, 2025; Jan. 22 and Feb. 4, 2026, highlight recurring reliability challenges across proxy, WAF, routing, and edge layers.
≈28%
Share of Cloudflare HTTP traffic disrupted on Dec. 5, 2025
Cloudflare and independent reporting estimate that roughly 25–30% of HTTP traffic on its network failed during the WAF misconfiguration incident, impacting millions of users worldwide.
A few hours each
Duration of Jan. 22 and Feb. 4, 2026 outages
BGP route leak on Jan. 22 and edge instability on Feb. 4 caused widespread access failures for dependent platforms, with recovery after several hours.
81M/s
Cloudflare’s global HTTP request volume
Cloudflare handles on the order of 81 million HTTP requests per second, meaning failures at its edge affect a significant fraction of all web traffic.
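
Combining the traffic-share and request-volume figures above gives a rough sense of scale for the December 5 incident (a back-of-the-envelope estimate, assuming volume stayed near the usual ~81 million requests per second during the outage window):

    0.25 × 81,000,000 ≈ 20 million failed requests per second
    0.30 × 81,000,000 ≈ 24 million failed requests per second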

People Involved

Matthew Prince
Co-founder and CEO, Cloudflare (Publicly accountable for repeated outages through early 2026 and ongoing reliability reforms)
Dane Knecht
Chief Technology Officer, Cloudflare (Technical leader who has publicly apologized for and explained the recurrent outages)

Organizations Involved

Cloudflare, Inc.
Internet Infrastructure Company
Status: Core infrastructure provider facing recurrent outages into 2026

Cloudflare is a U.S.-based web infrastructure and website security company that provides content delivery, DDoS protection, DNS, and Web Application Firewall services for millions of websites.

React & React Server Components Ecosystem
Open-source software ecosystem
Status: Origin of critical React2Shell vulnerability prompting emergency mitigations

React is a widely used JavaScript library for building web user interfaces. Its Server Components and related frameworks like Next.js underpin a large share of modern web applications.

Major Platforms Dependent on Cloudflare
Customer group
Status: Intermittently disrupted by Cloudflare outages

A wide range of large web platforms—including videoconferencing, social networking, e-commerce, cryptocurrency, gaming, and AI services—rely on Cloudflare’s CDN and security products and were affected by the 2025 outages.

Timeline

  1. Cloudflare edge instability outage from a latent bot mitigation bug (February 4, 2026)

    Outage

    A routine configuration change exposes a latent bug in Cloudflare’s bot mitigation service, causing cascading network degradation, HTTP 500 errors, and access failures for platforms including ChatGPT, X, and Spotify. Services are restored after several hours; CTO Dane Knecht apologizes and confirms the incident was not a cyberattack.

  2. Cloudflare status page confirms resolution of edge outage

    Public Statement

    Cloudflare announces that a fix has been implemented for the network issue affecting its Dashboard, APIs, and traffic; the incident coincides with scheduled maintenance at U.S. datacenters in Detroit and Chicago. The company commits to publishing a detailed incident report.

  3. Cloudflare BGP routing failure disrupts global access (January 22, 2026)

    Outage

    A misconfigured automated policy causes a BGP route leak, sending traffic along incorrect paths and making applications unreachable even though their origin servers remain online. The failure affects DNS, routing, and security gateways for millions of services.

  4. Post-incident analyses highlight systemic risk from centralized infrastructure

    Analysis

    Media and security analysts compare Cloudflare’s November and December outages to recent disruptions tied to AWS and the 2024 CrowdStrike configuration disaster, arguing that the concentration of traffic and security controls in a small number of providers has created "too big to fail" single points of failure for the global internet. Cloudflare faces renewed scrutiny from customers and investors over whether its internal safeguards and testing practices are adequate for a system of its scale.

  5. Cloudflare outage triggered by WAF body-parsing change (December 5, 2025)

    Outage

    While rolling out changes to how its Web Application Firewall parses HTTP request bodies—intended to better detect React2Shell exploit payloads—Cloudflare introduces a configuration that breaks request processing on older FL1 proxies using its Managed Ruleset. The misconfiguration causes a flood of HTTP 500 errors and briefly knocks out around 25–30% of HTTP traffic on its network, affecting platforms such as Zoom, LinkedIn, Shopify, Coinbase, Claude AI, Fortnite, and others. Engineers quickly identify the issue and roll back the change, restoring service within about 25 minutes.

  6. Cloudflare issues status update and confirms no cyberattack

    Public Statement

    Cloudflare’s status page and social media channels confirm that the incident has been resolved and state that a WAF configuration change—deployed to mitigate an industry-wide React Server Components vulnerability—made its network unavailable for several minutes. The company stresses that the outage was not caused by a cyberattack and promises a detailed blog post and a review of its global configuration systems.

  7. Security community reports active exploitation of React2Shell

    Security Incident

    Within a day of public disclosure, security researchers and cloud providers report that Chinese state-linked threat groups are actively exploiting React2Shell in the wild. Proof-of-concept exploits circulate publicly, pushing vendors and infrastructure providers, including Cloudflare, to accelerate deployment of mitigations.

  8. Emergency WAF rule released for React2Shell vulnerability

    Security Mitigation

    Cloudflare publishes an emergency Web Application Firewall rule update tied to CVE-2025-55182, a critical remote code execution vulnerability in React Server Components known as React2Shell. The rules are designed to block unsafe deserialization patterns linked to the exploit.

  9. Cloudflare issues detailed postmortem and promises reforms

    Public Statement

    In a technical blog post, Cloudflare explains that the November 18 outage was caused by a bug in new proxy code that resulted in unhandled errors and 5xx responses. The company apologizes, acknowledging that it "failed" its customers and the broader internet, and pledges to improve testing, staged rollouts, and safeguards around its configuration systems.

  10. First major 2025 Cloudflare outage disrupts a large swath of the web (November 18, 2025)

    Outage

    Cloudflare suffers a widespread outage attributed to a configuration error and unhandled panic in its FL2 Rust-based proxy. The incident lasts several hours and impacts major platforms including X, Spotify, ChatGPT, and popular online games. Cloudflare’s dashboard, login systems, and some internal services are also affected.
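
Cloudflare has not published the FL2 code at fault, but the failure mode described in its postmortem, an unexpected condition that panics inside the request path instead of degrading gracefully, can be sketched in a few lines of Rust. Everything below (the BotFeatures type, the rule limit, the function names) is hypothetical and illustrates only the general pattern, not Cloudflare’s implementation.

    use std::panic;

    // Hypothetical per-request data produced by a configuration pipeline.
    struct BotFeatures {
        rules: Vec<String>,
    }

    const MAX_RULES: usize = 200;

    // Brittle version: an oversized or malformed input panics in the hot path.
    // Inside a proxy worker, that panic surfaces to clients as an HTTP 5xx.
    fn score_request_brittle(features: &BotFeatures) -> u32 {
        assert!(features.rules.len() <= MAX_RULES, "rule set too large");
        features.rules.len() as u32 // stand-in for real scoring logic
    }

    // Defensive version: the same bad input becomes an explicit error, so the
    // caller can decide to fail open (skip scoring) rather than crash.
    fn score_request(features: &BotFeatures) -> Result<u32, String> {
        if features.rules.len() > MAX_RULES {
            return Err(format!("rule set too large: {}", features.rules.len()));
        }
        Ok(features.rules.len() as u32)
    }

    fn main() {
        let oversized = BotFeatures {
            rules: vec![String::new(); 500],
        };

        // The brittle path panics on unexpected input.
        let panicked = panic::catch_unwind(|| score_request_brittle(&oversized)).is_err();
        println!("brittle path panicked: {panicked}");

        // The defensive path reports the problem and lets traffic keep flowing.
        match score_request(&oversized) {
            Ok(score) => println!("bot score: {score}"),
            Err(e) => println!("bot scoring skipped, request served without it: {e}"),
        }
    }

The point of the sketch is the shape of the fix Cloudflare itself describes: treating oversized or malformed configuration as expected input to be rejected, not as an impossible state.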

Scenarios

1

Cloudflare successfully hardens its change management and regains trust

Discussed by: Technical press, infrastructure engineers, some investors

In this scenario, Cloudflare treats the outages of late 2025 and early 2026 as existential warning shots and undertakes a comprehensive overhaul of its change-management processes. It expands canary and staged rollouts for all configuration systems, introduces stronger isolation between components on edge servers, and reviews any mechanism (like global configuration killswitches) that can propagate changes fleetwide in seconds. Transparency via detailed postmortems and independent audits reassures large customers, who may still implement multi-CDN strategies but continue to rely heavily on Cloudflare. Outage frequency declines and the events are remembered as catalysts for a more mature operational culture.

2

Large customers diversify away, pressuring Cloudflare’s growth and margins

Discussed by: Enterprise IT leaders, risk consultants, some market analysts

Frustrated by repeated outages and mindful of parallels like the 2024 CrowdStrike update debacle, major enterprises take aggressive steps to reduce single-provider dependence. They adopt multi-CDN architectures, terminate some Cloudflare services in favor of competitors, and build internal fallbacks for DNS and security functions. While Cloudflare remains a major player, its growth slows, sales cycles lengthen due to heightened due diligence on reliability, and margins come under pressure as it must invest more in resilience and offer contractual assurances (such as stricter SLAs or financial guarantees) to retain key accounts.

3

Regulators treat CDNs and security clouds as critical infrastructure

Discussed by: Policy commentators, cybersecurity regulators, some economists

Drawing on the cumulative impact of Cloudflare outages, AWS disruptions, and the global CrowdStrike configuration failure, governments move toward classifying large CDNs and security clouds as critical infrastructure subject to tighter oversight. New rules could require independent reliability audits, mandatory staged rollouts for high-risk changes, incident reporting obligations, and contingency planning to ensure continuity of essential services. Cloudflare faces additional compliance costs but benefits from clearer industry-wide standards that may lock in incumbents and raise barriers to entry.

4

A future emergency patch causes both an outage and a major breach

Discussed by: Security researchers and worst-case risk modelers

A more severe but less likely outcome is that a future critical vulnerability forces another emergency response. In the rush to deploy mitigations, a provider either triggers another large-scale outage or, worse, deploys a flawed rule set that attackers evade, leading to a major breach while stability is also compromised. Such a dual failure could trigger lawsuits, regulatory sanctions, and a rapid loss of market share for the provider involved, and might accelerate a wider industry shift toward fundamentally different architectures for patching and isolation.

5

Industry-wide shift toward safer emergency update patterns

Discussed by: Cloud vendors, security product teams, DevOps/SRE communities

React2Shell, the Cloudflare outages, and the 2024 CrowdStrike incident collectively push the industry to rethink how emergency security updates and configuration changes are tested and deployed. Providers adopt standardized patterns for pre-production fuzzing and chaos experiments, more granular feature flags, and mandatory "slow rollout" paths even under pressure. Open-source ecosystems like React coordinate more closely with infrastructure providers to model exploit traffic and mitigation side effects in advance. The result is fewer internet-scale disruptions when the next critical zero-day hits.
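
Several of these scenarios turn on the same mechanisms: canary populations, percentage-based staged rollouts, and killswitches scoped to one change rather than global. A minimal sketch of such a rollout gate follows; the names and the bucketing scheme are illustrative assumptions, not any vendor’s actual feature-flag system.

    use std::collections::hash_map::DefaultHasher;
    use std::hash::{Hash, Hasher};

    // Hypothetical rollout configuration for one risky change.
    struct Rollout {
        killswitch: bool,               // scoped off-switch for this change only
        canary_pops: Vec<&'static str>, // datacenters that always receive it first
        percent: u8,                    // gradual ramp for everyone else, 0..=100
    }

    impl Rollout {
        fn is_enabled(&self, pop: &str, customer_id: u64) -> bool {
            if self.killswitch {
                return false;
            }
            if self.canary_pops.iter().any(|&p| p == pop) {
                return true;
            }
            // Deterministic bucketing: a customer stays in or out of the ramp as
            // the percentage grows, instead of flapping between code paths.
            let mut h = DefaultHasher::new();
            customer_id.hash(&mut h);
            (h.finish() % 100) < u64::from(self.percent)
        }
    }

    fn main() {
        let waf_change = Rollout {
            killswitch: false,
            canary_pops: vec!["ORD", "DTW"],
            percent: 5, // 5% ramp outside the canary datacenters
        };
        for (pop, id) in [("ORD", 1_u64), ("LHR", 42), ("LHR", 7)] {
            println!("{pop} / customer {id}: {}", waf_change.is_enabled(pop, id));
        }
    }

The interesting design questions are in what the sketch omits: how fast the percentage may ramp, who can flip the killswitch, and whether the gate itself is distributed through the same configuration system it is meant to protect.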

Historical Context

2024 CrowdStrike Configuration Update Outage

2024-07-19

What Happened

On July 19, 2024, cybersecurity firm CrowdStrike pushed a faulty configuration update to its Falcon sensor software for Windows systems. The update caused Windows machines to enter bootloops or crash with "blue screens of death," disrupting more than eight million computers worldwide and affecting airlines, banks, hospitals, broadcasters, and government agencies before a fix was deployed.

Outcome

Short Term

Flights were grounded, services interrupted, and CrowdStrike’s stock price fell sharply. The company worked with customers to manually remediate affected systems, a process that took days for some organizations.

Long Term

CrowdStrike faced lawsuits (including from Delta Air Lines) and regulatory scrutiny but retained much of its customer base. The incident became a case study in how a single misconfigured security update at a dominant provider can have global consequences.

Why It's Relevant Today

Like the 2025 Cloudflare outages, the CrowdStrike incident shows how flawed updates from centralized security infrastructure can ripple across critical services worldwide. Both cases highlight the need for stricter testing, staged deployment, and architectural safeguards around high-privilege security components.

Fastly CDN Outage of June 2021

2021-06-08

What Happened

In June 2021, a configuration issue at content delivery network Fastly triggered a major outage that briefly took down a wide array of high-profile websites, including government portals, news outlets like the BBC, and major e-commerce and streaming services. A single customer configuration change exposed a latent bug in Fastly’s CDN software, causing widespread 503 errors until the change was rolled back.

Outcome

Short Term

Fastly restored service within about an hour and issued a public explanation and apology. Customers experienced temporarily inaccessible sites and lost transactions.

Long Term

The outage underscored the extent to which CDNs had become critical infrastructure and drove more discussion about multi-CDN strategies and the risks of single points of failure.

Why It's Relevant Today

Fastly’s outage is an early, smaller-scale analogue to Cloudflare’s 2025 incidents: both involve edge networks where a configuration change propagates rapidly and unexpectedly breaks large portions of the web. It underlines how subtle bugs in shared infrastructure can have outsized impact.

Facebook Global Outage of October 2021

2021-10-04

What Happened

On October 4, 2021, Facebook and its services—including Instagram, WhatsApp, and Messenger—went offline globally for six to seven hours due to a misconfiguration in its backbone network that withdrew its BGP routes, effectively removing its DNS and services from the internet.

Outcome

Short Term

Billions of users lost access to communication and social platforms; some businesses reliant on WhatsApp and Facebook tools were unable to operate normally during the outage.

Long Term

Facebook revamped aspects of its network change procedures and incident response. The event became a prominent example of how a single configuration error at a large platform can have global social and economic effects.

Why It's Relevant Today

The Facebook outage illustrates the systemic risks of centralized control planes and global configuration updates, structural dynamics similar to Cloudflare’s outages even though the technologies differ. It reinforces that operational discipline and safeguarded rollout mechanisms are critical at internet scale.
