Introduction
We manage dozens of Next.js applications for our clients across K3s clusters. Over time, we have built a deployment pipeline that is fast, reliable, and achieves true zero-downtime deployments. This article describes our production setup: from code push to live traffic, fully automated through CI/CD.
This is not a theoretical guide — this is what we run in production today.
Architecture Overview
Developer pushes to main branch
|
v
CI/CD pipeline triggers (GitLab CI / GitHub Actions)
|
v
Docker build + push to Container Registry (GitLab Registry / ECR / GHCR)
|
v
kubectl set image → rolling update on K3s
|
v
Cloudflare cache purge (automatic)
|
v
Traffic: Cloudflare CDN → Tunnel → K3s Service → Next.js pods
1. CI/CD Pipeline
We use GitLab CI (works the same with GitHub Actions) to build and push images to a container registry. No manual steps — push to main and the pipeline handles everything.
GitLab CI Example (.gitlab-ci.yml)
stages:
  - build
  - deploy

variables:
  IMAGE: $CI_REGISTRY_IMAGE/app
  TAG: $CI_COMMIT_SHORT_SHA

build:
  stage: build
  image: docker:24
  services:
    - docker:24-dind
  script:
    - docker login -u $CI_REGISTRY_USER -p $CI_REGISTRY_PASSWORD $CI_REGISTRY
    # Pull the previous image first so --cache-from has layers to reuse
    # (the dind service starts with an empty local cache on every job)
    - docker pull $IMAGE:latest || true
    - docker build
        --cache-from $IMAGE:latest
        -t $IMAGE:$TAG
        -t $IMAGE:latest
        -f Dockerfile .
    - docker push $IMAGE:$TAG
    - docker push $IMAGE:latest
  only:
    - main

deploy:
  stage: deploy
  image: bitnami/kubectl:latest
  script:
    # The container name (here "myapp") must match the name declared in the Deployment manifest
    - kubectl set image deployment/myapp myapp=$IMAGE:$TAG -n myapp
    - kubectl rollout status deployment/myapp -n myapp --timeout=300s
    - |
      curl -s -X POST \
        "https://api.cloudflare.com/client/v4/zones/$CF_ZONE_ID/purge_cache" \
        -H "Authorization: Bearer $CF_API_TOKEN" \
        -H "Content-Type: application/json" \
        --data '{"purge_everything":true}'
  only:
    - main
GitHub Actions Alternative
name: Deploy

on:
  push:
    branches: [main]

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - uses: docker/build-push-action@v5
        with:
          context: .  # build from the checked-out workspace rather than the default Git context
          push: true
          tags: ghcr.io/${{ github.repository }}/app:${{ github.sha }}
      - uses: azure/k8s-set-context@v3
        with:
          kubeconfig: ${{ secrets.KUBECONFIG }}
      - run: |
          kubectl set image deployment/myapp \
            myapp=ghcr.io/${{ github.repository }}/app:${{ github.sha }} \
            -n myapp
          kubectl rollout status deployment/myapp -n myapp --timeout=300s
Why a Container Registry?
- Immutable tags — every deployment is traceable to a specific commit SHA
- Rollback is instant — kubectl set image to a previous tag, no rebuild needed
- Multi-node clusters — all nodes pull from the registry, no manual image distribution
- CI/CD native — GitLab and GitHub ship built-in registries, and AWS provides ECR
- Cache layers — subsequent builds are fast thanks to Docker layer caching
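To make the instant-rollback point concrete, here is what a rollback looks like in practice (a sketch using the deployment and namespace names from the pipeline above; abc1234 stands in for a real commit SHA from a previous build):

```shell
# Roll back by pointing the deployment at a previous commit-SHA tag.
# No rebuild needed — nodes pull the already-built image from the registry.
kubectl set image deployment/myapp myapp=registry.example.com/myapp/app:abc1234 -n myapp

# Or revert to the previous ReplicaSet that Kubernetes recorded automatically:
kubectl rollout undo deployment/myapp -n myapp

# Either way, watch the rollout complete before declaring victory:
kubectl rollout status deployment/myapp -n myapp --timeout=300s
```

Both paths trigger the same rolling update as a forward deploy, so the zero-downtime guarantees apply to rollbacks too.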
2. K3s Cluster Configuration
Our K3s setup is deliberately simple:
# Install K3s without the default ingress controller
curl -sfL https://get.k3s.io | sh -s - \
  --disable traefik \
  --disable servicelb \
  --write-kubeconfig-mode 644
We disable Traefik and the default service load balancer because all traffic arrives through Cloudflare Tunnel. There is no need for an ingress controller or external load balancer.
Deployment Manifest
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
  namespace: myapp
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      terminationGracePeriodSeconds: 30
      containers:
        - name: myapp  # referenced by "kubectl set image deployment/myapp myapp=..." in CI
          image: registry.example.com/myapp:latest  # CI overwrites this with the commit-SHA tag
          ports:
            - containerPort: 3000
          env:
            - name: NODE_ENV
              value: "production"
            - name: NODE_OPTIONS
              value: "--max-old-space-size=384"
          resources:
            requests:
              cpu: 200m
              memory: 256Mi
            limits:
              cpu: "1"
              memory: 512Mi
          readinessProbe:
            httpGet:
              path: /api/health
              port: 3000
            initialDelaySeconds: 5
            periodSeconds: 10
          livenessProbe:
            httpGet:
              path: /api/health
              port: 3000
            initialDelaySeconds: 15
            periodSeconds: 30
          lifecycle:
            preStop:
              exec:
                # Give the load balancer time to stop routing to this pod before SIGTERM
                command: ["/bin/sh", "-c", "sleep 5"]
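Both probes target /api/health on the Next.js container. A minimal App Router handler for that endpoint could look like this (a sketch — any 200 response satisfies the probes; the payload shape is up to you):

```typescript
// app/api/health/route.ts — minimal health endpoint for the readiness and liveness probes
export async function GET() {
  // Returning 200 tells the kubelet this pod is alive and ready for traffic.
  return Response.json({
    status: "ok",
    uptime: process.uptime(), // seconds this process has been running, handy when debugging restarts
  });
}
```

Keep this handler free of database or upstream calls: a slow dependency should not make the kubelet kill an otherwise healthy pod.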
3. Cloudflare Integration
Cloudflare Tunnel
We run cloudflared as a Kubernetes deployment with 2 replicas for redundancy:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cloudflared
  namespace: myapp
spec:
  replicas: 2
  selector:
    matchLabels:
      app: cloudflared
  template:
    metadata:
      labels:
        app: cloudflared
    spec:
      containers:
        - name: cloudflared
          image: cloudflare/cloudflared:latest
          args:
            - tunnel
            - --no-autoupdate
            - run
            - --token
            - $(TUNNEL_TOKEN)
          env:
            - name: TUNNEL_TOKEN
              valueFrom:
                secretKeyRef:
                  name: cf-tunnel-token
                  key: token
          resources:
            requests:
              cpu: 50m
              memory: 64Mi
            limits:
              cpu: 200m
              memory: 128Mi
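The tunnel's public hostname routes to a ClusterIP Service in front of the Next.js pods. A minimal sketch (names assumed to match the Deployment earlier in this article):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: myapp
  namespace: myapp
spec:
  selector:
    app: myapp          # matches the pod labels on the Next.js Deployment
  ports:
    - name: http
      port: 80          # cloudflared would target e.g. http://myapp.myapp.svc.cluster.local
      targetPort: 3000  # the Next.js container port
```

With maxUnavailable: 0 on the Deployment, the Service endpoints always contain at least the full replica count of ready pods during a rollout.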
CDN Caching Rules
We configure Cloudflare to cache static assets aggressively but bypass the cache for API routes and dynamic pages:
- /_next/static/*: Cache for 1 year (immutable hashed filenames)
- /images/*, /fonts/*: Cache for 30 days
- /api/*: Bypass cache
- Everything else: Cache for 1 hour with stale-while-revalidate
Automatic Cache Purge After Deploy
The deploy script purges the entire Cloudflare cache after a successful rollout. For more targeted invalidation, we can purge by URL prefix instead (note that purging by prefix requires a Cloudflare Enterprise plan):
curl -s -X POST \
  "https://api.cloudflare.com/client/v4/zones/$CF_ZONE_ID/purge_cache" \
  -H "Authorization: Bearer $CF_API_TOKEN" \
  -H "Content-Type: application/json" \
  --data '{"prefixes":["myapp.example.com/_next/"]}'
4. Monitoring with Prometheus
We run Prometheus and Grafana on the K3s cluster to monitor application and cluster health.
Key Metrics We Track
- Request latency (P50, P95, P99) via Node.js metrics
- Pod CPU and memory usage via kube-state-metrics
- Pod restart count (indicates OOM kills or crash loops)
- HTTP error rates (5xx responses)
- Deployment rollout duration
ServiceMonitor for Next.js
We expose metrics from the Next.js application using a custom endpoint:
// app/api/metrics/route.ts
import { NextResponse } from "next/server";

// Ensure the route runs on every request and is never statically cached
export const dynamic = "force-dynamic";

// Per-process counter: each pod reports its own count; Prometheus sums across pods
let requestCount = 0;

export async function GET() {
  requestCount++;
  const memUsage = process.memoryUsage();
  const metrics = [
    `# HELP nodejs_heap_used_bytes Node.js heap used`,
    `# TYPE nodejs_heap_used_bytes gauge`,
    `nodejs_heap_used_bytes ${memUsage.heapUsed}`,
    `# HELP nodejs_heap_total_bytes Node.js heap total`,
    `# TYPE nodejs_heap_total_bytes gauge`,
    `nodejs_heap_total_bytes ${memUsage.heapTotal}`,
    `# HELP http_requests_total Total HTTP requests`,
    `# TYPE http_requests_total counter`,
    `http_requests_total ${requestCount}`,
  ].join("\n");
  return new NextResponse(metrics, {
    // Prometheus text exposition format
    headers: { "Content-Type": "text/plain; version=0.0.4" },
  });
}
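With the endpoint in place, a ServiceMonitor tells the Prometheus Operator to scrape it. A sketch, assuming the kube-prometheus-stack operator is installed and a Service labeled app: myapp exposes a port named http in front of the pods:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: myapp
  namespace: myapp
  labels:
    release: prometheus   # must match your operator's serviceMonitorSelector (assumed label)
spec:
  selector:
    matchLabels:
      app: myapp          # the Service fronting the Next.js pods
  endpoints:
    - port: http          # named port on that Service
      path: /api/metrics
      interval: 30s
```

Once applied, each pod shows up as a separate scrape target, so per-pod counters aggregate cleanly with sum() in PromQL.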
Alerting Rules
groups:
  - name: nextjs-alerts
    rules:
      - alert: HighMemoryUsage
        expr: container_memory_usage_bytes{container="myapp"} / container_spec_memory_limit_bytes{container="myapp"} > 0.85
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Next.js pod memory above 85%"
      - alert: PodRestartLoop
        expr: increase(kube_pod_container_status_restarts_total{container="myapp"}[1h]) > 3
        for: 0m
        labels:
          severity: critical
        annotations:
          summary: "Next.js pod restarting frequently"
      - alert: HighErrorRate
        # Assumes http_requests_total carries a "status" label; the minimal metrics
        # route above does not add one, so this needs a middleware or ingress exporter
        expr: rate(http_requests_total{status=~"5.."}[5m]) / rate(http_requests_total[5m]) > 0.05
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "Error rate above 5%"
5. Results
With this setup, our deployments typically complete in under 3 minutes from git push to live traffic:
- CI build + push: 60-120 seconds (cached layers)
- Rolling update: 30-60 seconds
- Cache purge: 2 seconds
- Total: Under 3 minutes, fully automated
Zero requests are dropped during the rolling update thanks to the readiness probe, preStop hook, and maxUnavailable: 0 configuration. Rollbacks are instant — just point to a previous image tag.
This pipeline has been serving us and our clients reliably for over a year. The key principles: container registry for immutable, traceable deployments; K3s for lightweight Kubernetes; Cloudflare for CDN and secure tunneling; Prometheus for observability. Every deployment is automated, every image is tagged with its commit SHA, and every rollback is one command away.