Go’s Green Tea GC: Real-World Results Are Mixed


Four months after Go 1.26 shipped Green Tea as the default garbage collector, community benchmarks are telling a different story than the launch headlines promised. The Go team’s 10–40% GC overhead reduction is real — but only for the right workloads. Teams running production traffic are finding that some services got faster, some stayed the same, and some got slower. A BigGo report published today aggregates those field findings, and the picture is more complicated than the February release notes suggested.

Green Tea’s Hidden Trade-off: Memory for CPU

Green Tea changes how the mark phase walks the heap. The existing algorithm chases pointers object by object, a pattern that spends more than 35% of its CPU cycles stalled on cache misses. Green Tea instead processes memory in contiguous spans, dramatically improving locality. The result: less total GC CPU time on workloads that fit the pattern. The Go team’s official write-up explains the algorithm in detail.

What the launch coverage underplayed: when Green Tea does run a cycle, it runs longer and uses more CPU per cycle. The net effect shows up in memory. Teams upgrading to Go 1.26 are reporting average RSS increases of 8–15%. Some workloads report a roughly 4x jump; one commonly cited example goes from 7.8 MB to 33 MB. For containerized deployments that run close to their memory limit and have not set GOMEMLIMIT, this can translate directly to OOM events after what looked like a routine runtime upgrade.

The fix is straightforward but easy to miss: set GOMEMLIMIT. The Go runtime does not read your container’s memory limit on its own, so without an explicit soft ceiling the collector paces itself by heap growth (GOGC) alone and has no reason to keep memory inside your container’s constraints. If you upgraded to Go 1.26 without setting this, check your RSS trends before they become a 3 AM incident.

The DoltHub Regression: A Case Study in Algorithm Mismatch

DoltHub — makers of Dolt, a version-controlled SQL database written in Go — ran Green Tea experimentally and documented a clear regression. Their detailed case study found noticeably elevated mark CPU time with no corresponding improvement in latency. The conclusion was blunt: “The Green Tea collector doesn’t make any difference in real-world performance numbers” for their workload. They chose not to enable it in production.

The reason explains the limits of the algorithm. Green Tea’s locality improvement depends on related objects being allocated near each other in memory. Database-style workloads — with low-fanout object graphs that mutate frequently — don’t have that locality. Green Tea still processes the heap span-by-span, but when there are few related objects per span, it does extra bookkeeping work without finding many objects to batch. The algorithm’s strength becomes overhead.

The same issue surfaces in other patterns: single-core or dual-core deployments where concurrent marking parallelism doesn’t matter, low-allocation batch jobs where GC overhead was already minimal, and memory-constrained containers where the RSS increase causes more harm than the CPU savings provide.

Who Gets the Wins

The 10–40% headline is achievable — with the right service profile. REST API handlers allocating per-request structs, streaming pipelines processing high-volume small messages, and high-throughput brokers with continuous short-lived allocations are where Green Tea performs as advertised. These workloads create many small objects in tight loops, and that locality is exactly what Green Tea’s span-based marking was built for. The gains also compound with core count: Green Tea scales better across parallel cores, so a 16-core production server benefits more than a 2-core CI runner.

If your service’s GC overhead is already below 5% of total CPU time, the gains will be negligible regardless of workload type.

What to Do Before Go 1.27 Ships

The opt-out, building with GOEXPERIMENT=nogreenteagc, is expected to be removed in Go 1.27, due in August 2026. That leaves roughly two months to make any necessary changes. Three things are worth doing now.

First, benchmark your own workload. Do not rely on the headline numbers:

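# Baseline: classic GC (assumes the main package is in the current directory;
# -o with a file name fails if ./... matches more than one package)
GOEXPERIMENT=nogreenteagc go build -o app-classic .

# Default Go 1.26: Green Tea
go build -o app-greentea .

# Compare GC behavior at runtime (gctrace lines are written to stderr)
GODEBUG=gctrace=1 ./app-classic
GODEBUG=gctrace=1 ./app-greentea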

Watch the GCCPUFraction field of the runtime.MemStats struct filled in by runtime.ReadMemStats, and compare pause duration histograms between the two builds.

Second, if you find a regression, report it to the Go team. GitHub issue #73581 is still open. The window to influence what lands in Go 1.27 is closing.

Third, set GOMEMLIMIT if you haven’t. This applies regardless of whether you’re seeing obvious regressions — it’s now a hygiene requirement when running Go 1.26 in containerized environments.

The Bottom Line

The February launch narrative — “Green Tea makes Go 40% faster in GC” — was never wrong, exactly. It was just incomplete. The 40% is real for services that match the algorithm’s assumptions. The problem is that a lot of production Go code doesn’t. Don’t assume upgrading to 1.26 made your service faster. Run the benchmark. Check your RSS. If you’re running a database, a low-allocation service, or anything in a memory-constrained container, the default is now working against you until you set GOMEMLIMIT — or until the Go team tightens the algorithm further before 1.27.

ByteBot

