Monthly infrastructure cost: -68%

Saved per year: EUR 9,408

Minutes of downtime: 0

The challenge

Infrastructure that grew, but was never measured

The client operates a SaaS platform that tracks links in email newsletters and absorbs high-volume traffic bursts whenever customers send campaigns. Over several years the infrastructure had grown organically into an oversized, expensive, and fragile setup:

13 servers - a mix of cloud VMs and two dedicated bare-metal database machines
6 load balancers - half of them dead or empty, still billed every month
Two dedicated bare-metal PostgreSQL servers with overlapping roles
~EUR 1,147 / month in spend, which kept rising while the business did not grow

The setup was never measured against real usage. Capacity was guessed, then never revisited.

Before - sprawl

Database - 2 bare-metal, overlapping

pg-master (bare metal), pg-slave (bare metal)

Load balancers - 6, three dead

lb-1, lb-2, lb-3, lb-4 (DEAD), lb-5 (DEAD), lb-6 (EMPTY)

Application VMs - 11, near-idle

web-1, web-2, web-3, api-1, api-2, api-3, api-4, api-sec-1, api-sec-2, api-sec-3, api-sec-4

Other

cache server, NFS server, test server (oversized)

Visually cluttered on purpose - much of this bought nothing.

The diagnosis

30 days of real metrics, not guesses

Rather than guessing, the engagement started with 30 days of real production metrics pulled from the client's monitoring platform. The data told a blunt story:

The 11 application servers ran at 0.2%-2.6% average CPU
Memory utilization sat below 15% across the fleet
3 of 6 load balancers had zero or disabled backends - pure waste
The two dedicated bare-metal servers duplicated each other's function

30-day average CPU per application server

Server	30-day average CPU
web-1	1.8%
web-2	2.1%
web-3	0.9%
api-1	2.6%
api-2	2.3%
api-3	1.4%
api-4	0.7%
api-sec-1	0.4%
api-sec-2	0.2%
api-sec-3	0.3%
api-sec-4	0.5%

The gap between actual usage and the target line is the waste.
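The per-server figures above can be summarized in a few lines. This is a minimal sketch, not the tooling used in the engagement: the CPU values are the ones listed above, while the 5% near-idle threshold is a hypothetical cut-off chosen only for illustration.

```python
from statistics import mean

# 30-day average CPU (%) per application server, as listed above.
avg_cpu = {
    "web-1": 1.8, "web-2": 2.1, "web-3": 0.9,
    "api-1": 2.6, "api-2": 2.3, "api-3": 1.4, "api-4": 0.7,
    "api-sec-1": 0.4, "api-sec-2": 0.2, "api-sec-3": 0.3, "api-sec-4": 0.5,
}

# Hypothetical threshold (an assumption): below this, a server counts as near-idle.
IDLE_THRESHOLD_PCT = 5.0

fleet_mean = round(mean(avg_cpu.values()), 2)
busiest = max(avg_cpu, key=avg_cpu.get)
near_idle = [name for name, pct in avg_cpu.items() if pct < IDLE_THRESHOLD_PCT]

print(f"fleet mean: {fleet_mean}%")                               # fleet mean: 1.2%
print(f"busiest: {busiest} at {avg_cpu[busiest]}%")               # busiest: api-1 at 2.6%
print(f"near-idle: {len(near_idle)} of {len(avg_cpu)} servers")   # near-idle: 11 of 11 servers
```

Even the busiest server averages under 3% CPU, which is why consolidation was possible without touching anything load-bearing.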

~69% waste

EUR 1,147 / month, split:

~EUR 787 - Over-provisioned / dead

~EUR 360 - Actually needed

Most of the monthly bill bought nothing.

The solution

A lean, cloud-native, autoscaling architecture

The infrastructure was redesigned as a lean, cloud-native, autoscaling architecture with a proper high-availability database cluster.

PostgreSQL high-availability cluster

The two dedicated bare-metal servers were replaced with a managed PostgreSQL cluster: a primary node and a physical streaming replica in a separate datacenter. Replication is continuous with lag monitoring, failover completes within minutes, and automated backups go to object storage with tiered retention of 7 days / 4 weeks / 12 months.
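To make the tiered retention concrete, here is a minimal sketch of how a 7d / 4w / 12m policy could select which backups to keep. The interpretation is an assumption, not the client's actual configuration: every backup from the last 7 days, the newest backup of each ISO week within 28 days, and the newest backup of each calendar month within 365 days. The function name `backups_to_keep` is hypothetical.

```python
from datetime import date, timedelta

def backups_to_keep(backup_dates, today):
    """Return the backup dates retained under an assumed 7d / 4w / 12m policy."""
    keep = set()
    newest_per_week = {}
    newest_per_month = {}
    for d in sorted(backup_dates):
        age_days = (today - d).days
        if age_days < 0:
            continue  # ignore dates in the future
        if age_days < 7:
            keep.add(d)  # daily tier: keep everything from the last 7 days
        if age_days < 28:
            # weekly tier: later dates overwrite earlier ones in the same ISO week
            newest_per_week[d.isocalendar()[:2]] = d
        if age_days < 365:
            # monthly tier: 365 days approximates 12 months
            newest_per_month[(d.year, d.month)] = d
    keep.update(newest_per_week.values())
    keep.update(newest_per_month.values())
    return keep

today = date(2024, 6, 30)
dailies = [today - timedelta(days=i) for i in range(400)]
kept = backups_to_keep(dailies, today)
print(f"kept {len(kept)} of {len(dailies)} daily backups")
```

Older backups thin out progressively, so storage grows far more slowly than the number of backups taken.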

Hot-table partitioning

The two highest-traffic tables, the click and rotator event logs, were re-architected with 12-way hash partitioning, improving query performance and keeping the dataset maintainable at scale.
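As an illustration of 12-way hash partitioning, the sketch below generates PostgreSQL DDL for the child partitions of a hash-partitioned parent table. The table name `click_events`, its columns, and the partition key `link_id` are hypothetical; the actual schema is not described here.

```python
# Hypothetical parent table (names and columns are assumptions).
# The parent declares the partitioning scheme; rows live in the children.
PARENT_DDL = """CREATE TABLE click_events (
    link_id    bigint      NOT NULL,
    clicked_at timestamptz NOT NULL
) PARTITION BY HASH (link_id);"""

def child_partition_ddl(parent, partitions=12):
    """Build one CREATE TABLE statement per hash partition of `parent`."""
    return [
        f"CREATE TABLE {parent}_p{remainder:02d} PARTITION OF {parent} "
        f"FOR VALUES WITH (MODULUS {partitions}, REMAINDER {remainder});"
        for remainder in range(partitions)
    ]

statements = child_partition_ddl("click_events")
print(PARENT_DDL)
for stmt in statements:
    print(stmt)
```

Each row lands in the partition whose remainder matches the hash of its key modulo 12, so writes and index maintenance spread across twelve smaller tables instead of one large one.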

Zero-downtime migration

The move off bare metal used logical replication: the new cloud cluster synced live from the old database, then a brief cutover during a low-traffic window promoted it to primary, with no service interruption.
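A logical-replication migration of this kind typically rests on two PostgreSQL statements plus a gate before cutover. The sketch below builds those statements and expresses the gate as a simple check. The publication and subscription names, the connection string, and the zero-lag threshold are assumptions for illustration, not details from the engagement.

```python
def replication_setup_sql(publication, subscription, source_conninfo):
    """Return (statement for the old database, statement for the new cluster)."""
    on_source = f"CREATE PUBLICATION {publication} FOR ALL TABLES;"
    on_target = (
        f"CREATE SUBSCRIPTION {subscription} "
        f"CONNECTION '{source_conninfo}' PUBLICATION {publication};"
    )
    return on_source, on_target

def safe_to_cut_over(lag_bytes, writes_paused, max_lag_bytes=0):
    """Cut over only once writes to the old database are paused and the
    new cluster has caught up (hypothetical threshold: zero bytes of lag)."""
    return writes_paused and lag_bytes <= max_lag_bytes

# Hypothetical names and connection string.
on_old, on_new = replication_setup_sql(
    "migration_pub", "migration_sub", "host=old-db dbname=app user=replicator"
)
print(on_old)
print(on_new)
print(safe_to_cut_over(lag_bytes=4096, writes_paused=False))  # False
print(safe_to_cut_over(lag_bytes=0, writes_paused=True))      # True
```

Note that logical replication copies table data but not schema changes, so the target schema has to exist before the subscription is created.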

Elastic application layer with autoscaling

The 11 application servers were consolidated to 3, placed behind cloud autoscaling. Normal traffic runs on a minimal footprint; a newsletter blast provisions extra nodes automatically, then releases them once the spike passes. The client pays for real demand, not idle worst-case capacity.
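The scaling behaviour can be sketched with the common target-tracking rule, where the fleet is resized so that average CPU lands near a target. The floor of 3 nodes comes from the consolidation described above; the 60% CPU target and the ceiling of 12 nodes are hypothetical values, and the cloud provider's own autoscaler may use a different policy.

```python
import math

def desired_nodes(current_nodes, avg_cpu_pct,
                  target_cpu_pct=60.0, min_nodes=3, max_nodes=12):
    """Target tracking: scale the node count in proportion to how far
    average CPU is from the target, then clamp to the allowed range."""
    wanted = math.ceil(current_nodes * avg_cpu_pct / target_cpu_pct)
    return max(min_nodes, min(max_nodes, wanted))

print(desired_nodes(3, 2.0))    # quiet period: stays at the floor -> 3
print(desired_nodes(3, 90.0))   # newsletter blast: scales out -> 5
print(desired_nodes(5, 20.0))   # spike has passed: scales back in -> 3
print(desired_nodes(10, 95.0))  # capped at the ceiling -> 12
```

The floor keeps enough capacity for normal traffic, while the ceiling bounds the worst-case bill during an unusually large campaign.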

Load balancer consolidation

Six load balancers, three of them dead, were consolidated to a single properly sized load balancer.

After - lean and resilient

PostgreSQL PRIMARY -> streaming -> REPLICA (separate DC) -> backups -> object storage

single load balancer

Autoscaling group

app-1, app-2, app-3, +N on demand

Five servers and one load balancer instead of twenty-plus boxes. The bill now follows real traffic.

Migration phases

Phase 1 - Quick wins
Phase 2 - PostgreSQL migration (zero-downtime cutover)
Phase 3 - Cleanup & handover
Phase 4 - Backups

The results

The bill cut by two thirds, the platform more reliable

Monthly infrastructure cost

Before: EUR 1,147

After: EUR 363

-EUR 784 / month · -EUR 9,408 / year · -68%

Metric	Before	After
Monthly cost	EUR 1,147	EUR 363
Annual cost	EUR 13,764	EUR 4,356
Servers	13	5
Load balancers	6	1
Database	2 bare-metal, overlapping	HA cluster (primary + replica)
Scaling	Manual, static	Automatic
Migration downtime	-	0 minutes
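The headline figures follow directly from the two monthly costs in the table. A short worked check, using only numbers stated above:

```python
before_monthly = 1147  # EUR, from the table above
after_monthly = 363    # EUR, from the table above

monthly_saving = before_monthly - after_monthly
annual_saving = monthly_saving * 12
reduction_pct = round(100 * monthly_saving / before_monthly)

print(f"monthly saving: EUR {monthly_saving}")             # monthly saving: EUR 784
print(f"annual saving:  EUR {annual_saving:,}")            # annual saving:  EUR 9,408
print(f"annual cost before: EUR {before_monthly * 12:,}")  # annual cost before: EUR 13,764
print(f"annual cost after:  EUR {after_monthly * 12:,}")   # annual cost after:  EUR 4,356
print(f"reduction: {reduction_pct}%")                      # reduction: 68%
```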

Monthly cost: -68%
Saved: EUR 9,408/yr
Servers: 13 -> 5
Load balancers: 6 -> 1
Migration downtime: 0 min
Database resilience: HA cluster

Why it worked

The savings came from measuring before acting

The savings did not come from cutting corners; they came from measuring before acting. Most infrastructure waste is invisible because nobody looks. By starting with 30 days of real metrics, the engagement targeted exactly what was over-provisioned and left everything load-bearing untouched. The result is an architecture that costs less and is more resilient and more modern than what it replaced.

Services applied

What the engagement covered

Infrastructure audit & metrics-driven capacity analysis

PostgreSQL cluster design, partitioning & zero-downtime migration

Cloud-native re-architecture with autoscaling

Cost optimization

CI/CD verification

A 68% infrastructure cost cut, with zero downtime and a more reliable platform