
Google announced last week at KubeCon 2025 that it had successfully built and run a 130,000-node Kubernetes cluster in experimental mode, double the official GKE limit of 65,000 nodes. The announcement, published November 21, describes the largest known Kubernetes cluster in existence. The timing is striking: just two days later, ByteIota covered companies abandoning Kubernetes due to operational complexity. While some organizations ditch K8s, Google just doubled its limits.
Engineering Extreme Scale: Three Critical Breakthroughs
Google had to overcome three major technical challenges to reach 130,000 nodes. First, extreme read amplification: every kubelet constantly hitting the API server creates overwhelming database load at scale. Second, storage backend scalability: standard etcd fundamentally cannot scale horizontally. Third, workload-level scheduling complexity: pod-by-pod scheduling is insufficient for AI jobs that require gang scheduling.
The solutions showcase impressive engineering. Consistent Reads from Cache (KEP-2340) enables the API server to serve strongly consistent data from memory, reducing database load by 3x and cutting CPU usage by 30%. For storage, Google deployed a proprietary Spanner-based distributed system with 20+ shards that achieves 40% better latency than monolithic etcd while handling 13,000 QPS and managing over 1 million objects.
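From a client's perspective, KEP-2340 doesn't change the API at all; it changes where consistent reads are answered from. Here is a minimal client-go sketch (assuming a reachable cluster and the default kubeconfig; the code is illustrative, not from Google's announcement) showing the two list modes the KEP distinguishes:

```go
package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Illustrative only: load the default kubeconfig (~/.kube/config).
	cfg, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	clientset := kubernetes.NewForConfigOrDie(cfg)
	ctx := context.Background()

	// An empty ResourceVersion asks for a consistent (most recent) read.
	// Before KEP-2340 this was always served from etcd; with consistent reads
	// from cache enabled, the API server can answer it from its watch cache
	// once etcd progress notifications confirm the cache has caught up.
	consistent, err := clientset.CoreV1().Pods("default").List(ctx, metav1.ListOptions{})
	if err != nil {
		panic(err)
	}

	// ResourceVersion "0" has always allowed a (possibly stale) cached read.
	cached, err := clientset.CoreV1().Pods("default").List(ctx, metav1.ListOptions{ResourceVersion: "0"})
	if err != nil {
		panic(err)
	}

	fmt.Printf("consistent list: %d pods, cached list: %d pods\n",
		len(consistent.Items), len(cached.Items))
}
```

Because the optimization lives entirely in the API server, every client that issues consistent lists benefits without code changes.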
Kueue handles gang scheduling at scale. During Google’s benchmark, it preempted 39,000 pods in 93 seconds to make room for higher-priority workloads, achieving “almost instantaneous” workload switching. The cluster sustained a throughput of 1,000 pods per second, with pod startup latency under 5 seconds cluster-wide.
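The idea behind gang scheduling is all-or-nothing admission: either every pod in a job is placed, or none are, with lower-priority work preempted to make room. The toy Go sketch below illustrates only that admission rule; it is a conceptual simplification, not Kueue's actual implementation, and the workload names and capacities are made up.

```go
package main

import "fmt"

// A toy model of gang (all-or-nothing) admission: a workload is admitted only
// if every one of its pods fits at once; otherwise lower-priority workloads
// are preempted to make room, or the whole workload stays queued.

type workload struct {
	name     string
	pods     int // pods the gang needs, all at once
	priority int
}

type cluster struct {
	capacity int
	admitted []workload
}

func (c *cluster) used() int {
	total := 0
	for _, w := range c.admitted {
		total += w.pods
	}
	return total
}

// admit places the whole gang if it fits, preempting every strictly
// lower-priority workload when that frees enough capacity.
// It never admits a partial gang.
func (c *cluster) admit(w workload) bool {
	free := c.capacity - c.used()
	if free >= w.pods {
		c.admitted = append(c.admitted, w)
		return true
	}
	// How much capacity would preempting lower-priority workloads free?
	reclaimable := 0
	for _, v := range c.admitted {
		if v.priority < w.priority {
			reclaimable += v.pods
		}
	}
	if free+reclaimable < w.pods {
		return false // even with preemption the gang cannot fit; keep it queued
	}
	// Preempt all lower-priority workloads, then admit the gang.
	kept := c.admitted[:0]
	for _, v := range c.admitted {
		if v.priority >= w.priority {
			kept = append(kept, v)
		} else {
			fmt.Printf("preempting %s (%d pods)\n", v.name, v.pods)
		}
	}
	c.admitted = append(kept, w)
	return true
}

func main() {
	c := &cluster{capacity: 100}
	c.admit(workload{"batch-a", 60, 1})
	c.admit(workload{"batch-b", 30, 1})
	fmt.Println("training admitted:", c.admit(workload{"training", 80, 10}))
}
```

At Google's scale the same principle plays out across tens of thousands of pods at once, which is why the 39,000-pod preemption figure matters more than any single pod's scheduling latency.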
Here’s what matters: these innovations aren’t just for mega-clusters. Google’s engineering team notes that “the innovations we developed to reach 130,000 nodes also harden GKE’s core systems for extreme usage, which creates substantial headroom for average clusters.” If you’re running a 1,000-node cluster, you benefit from KEP-2340’s cache improvements without changing anything.
Why Extreme Scale Stops at 100,000 Nodes
Google predicts demand for large clusters will stabilize around 100,000 nodes. The limiting factor has fundamentally shifted: power constraints have replaced chip supply as the primary bottleneck. A single NVIDIA GB200 superchip draws up to 2,700 watts. An entire GB200 NVL72 rack requires around 120 kilowatts. Scale that to 100,000+ nodes and you need hundreds of megawatts distributed across multiple data centers.
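The arithmetic behind “hundreds of megawatts” is easy to check. A quick sketch, assuming for simplicity one GB200 superchip per node (an assumption of this example, not something the announcement spells out):

```go
package main

import "fmt"

// Back-of-the-envelope power math using the figures above.
func main() {
	const (
		nodes        = 100_000
		wattsPerNode = 2_700.0   // GB200 superchip, per the article
		wattsPerRack = 120_000.0 // GB200 NVL72 rack, per the article
	)

	totalMW := nodes * wattsPerNode / 1e6
	racks := nodes * wattsPerNode / wattsPerRack

	fmt.Printf("%d nodes x %.0f W = %.0f MW of compute alone\n", nodes, wattsPerNode, totalMW)
	fmt.Printf("roughly %.0f NVL72-class racks' worth of power\n", racks)
}
```

That is around 270 megawatts before cooling, networking, and storage overhead, which is why the conversation shifts from GPUs to substations.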
This is what Google calls the “Gigawatt AI era”—where foundational model creators drive demand for unprecedented computational power, but power infrastructure becomes the real constraint. It’s not about how many GPUs you can buy; it’s about how much power you can provision. This explains why Google is investing in MultiKueue for cross-cluster orchestration rather than pushing single-cluster scale to 200,000 or 500,000 nodes. Physics imposes practical limits.
While Companies Ditch Kubernetes, Google Doubles Down
ByteIota published “Kubernetes Exodus: Why Companies Ditch K8s in 2025” on November 23—just two days after Google’s announcement. The tension is fascinating: while many organizations abandon Kubernetes due to operational complexity and overhead, Google demonstrates it can scale to unprecedented levels. Both narratives are true.
The key insight: extreme scale requires massive custom engineering. Google’s 130,000-node cluster runs on proprietary Spanner-based storage, custom API server optimizations, and specialized control plane components that most organizations cannot and will not build. The official GKE limit remains 65,000 nodes. Google’s achievement is experimental, not generally available. Standard Kubernetes hits limits around 5,000-15,000 nodes without custom engineering.
The lesson: assess your actual needs. Don’t build a 130,000-node cluster because Google did. Both things are true: Kubernetes complexity is a real problem for most organizations (hence the exodus), and Kubernetes can scale to extreme levels with hyperscaler-level engineering investment. If your workloads fit comfortably in a 5,000-node cluster, you don’t need Google’s custom solutions. The innovations (KEP-2340, Kueue) are open source and beneficial across all cluster sizes, but the extreme scale itself isn’t relevant to typical organizations.
Key Takeaways
- Kubernetes can scale to 130,000 nodes, but requires massive custom engineering (proprietary storage, API server optimizations) that 99% of organizations won’t undertake.
- The innovations developed benefit all cluster sizes—cache improvements and gang scheduling capabilities aren’t exclusive to mega-clusters.
- Power constraints now limit AI infrastructure scale more than chip supply, fundamentally changing how organizations plan large deployments.
- Most organizations don’t need more than 10,000 nodes—assess your actual requirements rather than following hyperscaler patterns.
- The “Kubernetes Exodus” and Google’s achievement are both valid perspectives on the same trade-off: complexity versus capability.