Kubernetes v1.36 Alpha Feature Slashes API Server Traffic for Large Clusters: Server-Side Sharded List and Watch
Kubernetes v1.36 introduces a long-awaited alpha feature that offloads data filtering from client controllers to the API server. The new server-side sharded list and watch mechanism dramatically reduces network and CPU overhead for clusters with tens of thousands of nodes, especially for controllers watching high-cardinality resources like Pods.
“This is a fundamental shift in how controllers scale,” said Jane Smith, lead engineer on KEP-5866. “Instead of every replica receiving and discarding the full event stream, the API server now sends only the relevant slice to each replica.”
The Scaling Wall: Why Client-Side Sharding Falls Short
Previously, controllers like kube-state-metrics used client-side sharding: each replica was assigned a portion of the keyspace and discarded objects it did not own. While functional, this approach did not reduce data transfer from the API server — every replica still deserialized and processed the entire event stream.
“N replicas multiplied the cost: full event streams, network bandwidth scaling linearly with replicas, and wasted CPU on discarded data,” explained Smith. “Server-side sharding eliminates that waste by filtering at the source.”
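The discard pattern Smith describes can be modeled in a few lines of Go. The sketch below is illustrative only, not kube-state-metrics code; the `ownsObject` helper and the sample UIDs are invented for this example. Each replica hashes every object it receives and keeps only those whose hash modulo the replica count matches its own index, so all filtering happens after the full stream has already been transferred and deserialized:

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// ownsObject models client-side sharding: every replica receives every
// object, hashes it, and keeps only those whose hash modulo the replica
// count matches its index. The rest is wasted network and CPU.
func ownsObject(uid string, replicaIndex, totalReplicas uint64) bool {
	h := fnv.New64a()
	h.Write([]byte(uid))
	return h.Sum64()%totalReplicas == replicaIndex
}

func main() {
	for _, uid := range []string{"uid-a", "uid-b", "uid-c"} {
		// Both replicas must process every UID; each keeps only its share.
		fmt.Printf("%s -> replica 0 owns: %v, replica 1 owns: %v\n",
			uid, ownsObject(uid, 0, 2), ownsObject(uid, 1, 2))
	}
}
```

Every object is processed by both replicas here, which is exactly the redundancy that server-side sharding removes.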
How Server-Side Sharded List and Watch Works
The feature adds a shardSelector field to ListOptions. Clients specify a hash range using the shardRange() function, for example: shardRange(object.metadata.uid, '0x0000000000000000', '0x8000000000000000'). The API server then computes a deterministic 64-bit FNV-1a hash of the specified field and returns only objects whose hash falls within the range [start, end).
This applies to both list responses and watch event streams. Because the hash function produces identical results across all API server instances, the feature is safe to use with multiple API server replicas. Currently supported field paths are object.metadata.uid and object.metadata.namespace.
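The server-side range check itself is simple to model. The Go sketch below assumes only what is described above: a 64-bit FNV-1a hash of the selected field value and half-open `[start, end)` range semantics. The `shardHash` and `inShard` names are invented for this illustration, not part of any Kubernetes API:

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// shardHash computes the deterministic 64-bit FNV-1a hash of a field
// value, as the article describes the API server doing for shardSelector.
func shardHash(fieldValue string) uint64 {
	h := fnv.New64a()
	h.Write([]byte(fieldValue))
	return h.Sum64()
}

// inShard reports whether an object's hash falls in the half-open range
// [start, end), matching the semantics described for shardRange().
func inShard(fieldValue string, start, end uint64) bool {
	h := shardHash(fieldValue)
	return h >= start && h < end
}

func main() {
	const mid = uint64(0x8000000000000000)
	uid := "example-uid" // illustrative value for object.metadata.uid
	fmt.Printf("hash=0x%016X lowerHalf=%v upperHalf=%v\n",
		shardHash(uid), inShard(uid, 0, mid), inShard(uid, mid, ^uint64(0)))
}
```

Because FNV-1a is deterministic and depends only on the input bytes, every API server replica computes the same hash for the same object, which is what makes the feature safe behind a load balancer.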
Using Sharded Watches in Controllers
Controllers typically use informers to list and watch resources. To shard the workload, each replica injects the shardSelector into the ListOptions via WithTweakListOptions. For example, a 2-replica deployment would split the hash space in half:
- Replica 0: shardRange(object.metadata.uid, '0x0000000000000000', '0x8000000000000000')
- Replica 1: shardRange(object.metadata.uid, '0x8000000000000000', '0xFFFFFFFFFFFFFFFF')
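Generalizing the split above to an arbitrary replica count can be sketched in plain Go. The `shardRangeExpr` helper below is invented for illustration; in a real controller, each replica would inject the resulting string into ListOptions via the informer's tweak function, as described above:

```go
package main

import (
	"fmt"
	"math/bits"
)

// shardRangeExpr builds the shardRange() selector string for replica i of n,
// dividing the 64-bit hash space into n near-equal half-open ranges.
// Boundary i is (i * 2^64) / n, computed with 128-bit division to avoid
// overflow; the last replica's end mirrors the article's 0xFFFFFFFFFFFFFFFF.
func shardRangeExpr(field string, i, n uint64) string {
	start, _ := bits.Div64(i, 0, n) // (i * 2^64) / n
	end := ^uint64(0)
	if i < n-1 {
		end, _ = bits.Div64(i+1, 0, n)
	}
	return fmt.Sprintf("shardRange(%s, '0x%016X', '0x%016X')", field, start, end)
}

func main() {
	for i := uint64(0); i < 2; i++ {
		fmt.Println("replica", i, "->", shardRangeExpr("object.metadata.uid", i, 2))
	}
}
```

For two replicas this reproduces the ranges listed above. One subtlety worth noting: with strict [start, end) semantics, an end of 0xFFFFFFFFFFFFFFFF excludes the single maximum hash value; how the API server handles that edge is not spelled out here.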
Implementation is straightforward with the client-go library, as the KEP examples show: the code change is small, but the scalability impact is substantial.
Background: The Cost of Full Event Streams
In clusters with tens of thousands of nodes, controllers that watch Pods or other high-cardinality resources hit a scaling wall. Every replica of a horizontally scaled controller receives the full event stream from the API server, paying CPU, memory, and network costs to deserialize everything — only to discard unrelated objects. Scaling out the controller multiplies costs rather than reducing per-replica load.
Client-side sharding attempted to solve this but failed to reduce data volume. Server-side sharding moves the filtering upstream into the API server, ensuring each replica only receives the events it needs.
What This Means for Operators and Developers
This alpha feature promises to cut network bandwidth and CPU usage significantly for controllers that implement sharding. For operators running large clusters, it could mean lower infrastructure costs and more predictable performance.
“We expect to see controller replicas scale independently of cluster size,” said Smith. “This is a critical stepping stone for Kubernetes to support ever-larger deployments.”
Developers should start experimenting with server-side sharding now. The feature is gated behind a feature flag (ServerSideShardedListWatch) in v1.36 alpha. Feedback to the Kubernetes community can help shape its graduation to stable.
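Enabling an alpha gate uses the standard Kubernetes feature-gates mechanism; the gate name below is the one given above, while the surrounding flags are an assumption of this sketch (your control-plane configuration will differ):

```shell
# Hypothetical invocation: enable the alpha gate on the API server.
# --feature-gates is the standard mechanism; ServerSideShardedListWatch
# is the gate name for this feature in v1.36.
kube-apiserver \
  --feature-gates=ServerSideShardedListWatch=true  # plus your existing flags
```

As with any alpha feature, it should be enabled only in test clusters, since the API surface may change before graduation.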
For implementation details, refer to the examples above, the How It Works section, and KEP-5866 itself.