Why Zero-Downtime Matters

In production environments serving millions of requests, even a few seconds of downtime during a deployment can result in failed transactions, broken user sessions, and SLA violations. Kubernetes provides powerful primitives for zero-downtime deployments, but using them correctly requires understanding rolling updates, readiness probes, and traffic management at the pod level.

This guide walks through three proven deployment strategies and the configuration details that make each one reliable in high-traffic production clusters.

Rolling Update Strategy

Rolling updates are the default strategy for Kubernetes Deployments. The controller gradually replaces old pods with new ones, ensuring that a minimum number of pods remain available throughout the process.

Deployment Configuration

apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-api
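spec:
  replicas: 6
  # apps/v1 requires a selector that matches the pod template labels
  selector:
    matchLabels:
      app: web-api
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 2
      maxUnavailable: 0
  template:
    metadata:
      labels:
        app: web-api
    spec: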
      containers:
      - name: api
        image: registry.example.com/web-api:v2.4.1
        readinessProbe:
          httpGet:
            path: /healthz
            port: 8080
          initialDelaySeconds: 5
          periodSeconds: 5
          failureThreshold: 3
        livenessProbe:
          httpGet:
            path: /livez
            port: 8080
          initialDelaySeconds: 15
          periodSeconds: 10
        lifecycle:
          preStop:
            exec:
              command: ["/bin/sh", "-c", "sleep 10"]

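Setting maxUnavailable: 0 means the rollout never takes an old pod out of service until a new pod has passed its readiness probe, so the number of available pods does not drop below the desired replica count of 6 during the rollout. Setting maxSurge: 2 lets Kubernetes create up to two extra pods at a time (at most 8 in total), which speeds up the rollout without reducing availability.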
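Both fields also accept percentages, which can be convenient when the replica count changes over time. The fragment below is a sketch of the same settings in percentage form, not a recommendation for any particular service; Kubernetes rounds a maxSurge percentage up and a maxUnavailable percentage down.

```yaml
# Deployment fragment (sketch): percentage form of the rolling update settings.
spec:
  replicas: 6
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 25%        # 25% of 6 = 1.5, rounded up to 2 extra pods
      maxUnavailable: 0    # never remove a pod before its replacement is ready
```

With 6 replicas this resolves to the same limits as the absolute values above: at most 8 pods exist at once, and at least 6 stay available.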

The preStop Hook

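The preStop lifecycle hook with a sleep command is critical. When a pod is deleted, Kubernetes starts two things in parallel: it removes the pod from the Service endpoints, and the kubelet begins shutting down the container by running the preStop hook and then sending SIGTERM. Endpoint removal takes time to propagate, so without the sleep, requests can still be routed to a pod that has already begun shutting down. The 10-second delay gives kube-proxy and ingress controllers time to update their routing before the application receives SIGTERM. The time spent in preStop counts against terminationGracePeriodSeconds, which defaults to 30 seconds.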
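Because the preStop sleep and the application's own shutdown share one grace period, it can help to set that period explicitly. The fragment below is a minimal sketch; the value of 45 seconds is an assumption chosen for illustration (10 seconds of sleep plus a guessed 35 seconds for the application to finish in-flight requests), not a figure from this guide.

```yaml
# Pod template fragment (sketch): the 45s budget is an illustrative assumption.
spec:
  template:
    spec:
      # Must cover the preStop sleep plus the app's own shutdown time;
      # the container is force-killed when this period expires.
      terminationGracePeriodSeconds: 45
      containers:
      - name: api
        lifecycle:
          preStop:
            exec:
              command: ["/bin/sh", "-c", "sleep 10"]
```

Note that the exec form of the hook assumes the container image ships /bin/sh and a sleep binary.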

Blue-Green Deployments

Blue-green deployments run two complete environments side by side. Traffic switches from the "blue" (current) environment to the "green" (new) environment in a single operation.

Implementation with Services

apiVersion: v1
kind: Service
metadata:
  name: web-api
spec:
  selector:
    app: web-api
    version: green
  ports:
  - port: 80
    targetPort: 8080

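Deploy the green environment with the version: green label. Once readiness probes confirm all green pods are healthy, update the Service selector from version: blue to version: green (the manifest above shows the selector after the switch). The selector change is a single API update, so new connections shift to green within seconds as the endpoints propagate. Connections that are already open to blue pods, such as HTTP keep-alive connections, continue until they close.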
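For the selector switch to work, the green pods need both labels the Service selects on. A minimal sketch of the green Deployment might look like the following; the name web-api-green is a hypothetical choice, and the probes and lifecycle settings are trimmed for brevity.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-api-green          # hypothetical name for the new environment
spec:
  replicas: 6
  selector:
    matchLabels:
      app: web-api
      version: green
  template:
    metadata:
      labels:
        app: web-api
        version: green         # the label the Service selector switches on
    spec:
      containers:
      - name: api
        image: registry.example.com/web-api:v2.4.1
        readinessProbe:
          httpGet:
            path: /healthz
            port: 8080
```

The blue Deployment would be identical apart from its name, its version: blue label, and the older image tag.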

Rollback Procedure

Rolling back is equally fast. Change the selector back to version: blue and the previous environment takes over new traffic within seconds. Keep the blue deployment running for at least 30 minutes after a successful switch to allow quick rollbacks.
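One way to make the rollback repeatable is to keep the selector change in a small patch file. The sketch below assumes a file named rollback-to-blue.yaml (a hypothetical name) applied with kubectl patch service web-api --patch-file rollback-to-blue.yaml.

```yaml
# rollback-to-blue.yaml (hypothetical file name)
# Patch body: only the version key changes; app: web-api is left untouched.
spec:
  selector:
    version: blue
```

A matching file with version: green can serve as the forward switch, so both directions are a single command.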

Canary Releases

Canary releases route a small percentage of traffic to the new version, letting you validate behavior under real production load before a full rollout.

Using Ingress Annotations

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web-api-canary
  annotations:
    nginx.ingress.kubernetes.io/canary: "true"
    nginx.ingress.kubernetes.io/canary-weight: "10"
spec:
  rules:
  - host: api.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: web-api-canary
            port:
              number: 80

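These annotations are specific to the ingress-nginx controller, and the canary Ingress only takes effect alongside a primary (non-canary) Ingress for the same host. Start with 10 percent of traffic, monitor error rates and latency in your observability stack, and gradually increase the weight to 25, 50, and finally 100 percent.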
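The canary Ingress above points at a Service named web-api-canary, which in turn has to select only the pods running the new version. A minimal sketch of that Service follows; the track: canary label is an assumption introduced for illustration and would need to be set on the canary Deployment's pod template.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: web-api-canary
spec:
  selector:
    app: web-api
    track: canary        # hypothetical label carried only by canary pods
  ports:
  - port: 80
    targetPort: 8080
```

For the split to be clean, the stable Service should select on a label the canary pods do not carry, for example track: stable (also a hypothetical label). Otherwise the stable Service would also send traffic to canary pods.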

Pod Disruption Budgets

Regardless of strategy, always define a PodDisruptionBudget to protect against voluntary disruptions such as node drains during cluster upgrades:

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web-api-pdb
spec:
  minAvailable: 4
  selector:
    matchLabels:
      app: web-api
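With 6 healthy replicas, minAvailable: 4 permits at most two voluntary evictions at a time. If the replica count is expected to change, the budget can instead be expressed as maxUnavailable, which a PodDisruptionBudget accepts in place of minAvailable (the two fields are mutually exclusive). The value of 1 below is an illustrative assumption, not a setting from this guide.

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web-api-pdb
spec:
  maxUnavailable: 1      # allow one voluntary eviction at a time, at any replica count
  selector:
    matchLabels:
      app: web-api
```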

Best Practices Checklist

Always set maxUnavailable: 0 for critical services
Configure readiness probes that check actual application health, not just TCP port availability
Use preStop hooks to drain connections gracefully
Set terminationGracePeriodSeconds long enough for in-flight requests to complete
Monitor deployment progress with kubectl rollout status
Automate rollback triggers based on error rate thresholds in your CI/CD pipeline
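The monitoring and rollback items above depend on the Deployment reporting a stalled rollout. Two Deployment fields help with that; the fragment below is a sketch, and both numbers are assumptions to tune per service.

```yaml
# Deployment fragment (sketch): the numbers are illustrative assumptions.
spec:
  minReadySeconds: 10           # a new pod must stay ready this long before it counts as available
  progressDeadlineSeconds: 300  # mark the rollout as failed if it makes no progress for 5 minutes
```

When the progress deadline is exceeded, kubectl rollout status deployment/web-api exits with a non-zero status, which a pipeline can use as the signal to run kubectl rollout undo deployment/web-api.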

Zero-downtime deployments require more than just Kubernetes defaults. Proper probe configuration, lifecycle hooks, and disruption budgets work together to ensure seamless releases.


Zero-Downtime Kubernetes Deployments: Complete Guide