
Case Study - Cost Optimization

A 68% infrastructure cost cut, with zero downtime and a more reliable platform

A high-traffic link-tracking SaaS was paying for infrastructure it never used. We measured 30 days of real production metrics, then re-architected the platform around actual demand.

Monthly cost

Before: EUR 1,147
After: EUR 363
Saving: EUR 9,408 / year (-68%)

Sector

B2B email link-tracking & analytics SaaS

Footprint

Burst traffic on every customer campaign

Cloud

PostgreSQL, Laravel, Hetzner Cloud

Engagement

Infrastructure migration & cost optimization

-68% monthly infrastructure cost
EUR 9,408 saved per year
0 minutes of downtime

The challenge

Infrastructure that grew, but was never measured

The client operates a SaaS platform that tracks links in email newsletters and absorbs a high-volume burst of traffic every time a customer sends a campaign. Over several years the infrastructure had grown organically into an oversized, expensive, and fragile setup:

  • 13 servers - a mix of cloud VMs and two dedicated bare-metal database machines
  • 6 load balancers - half of them dead or empty, still billed every month
  • Two dedicated bare-metal PostgreSQL servers with overlapping roles
  • ~EUR 1,147 / month spend that kept rising while the business did not

The setup was never measured against real usage. Capacity was guessed, then never revisited.

Before - sprawl

  • Database - 2 bare-metal, overlapping: pg-master (bare metal), pg-slave (bare metal)
  • Load balancers - 6, three dead: lb-1, lb-2, lb-3, lb-4 (dead), lb-5 (dead), lb-6 (empty)
  • Application VMs - 11, near-idle: web-1, web-2, web-3, api-1, api-2, api-3, api-4, api-sec-1, api-sec-2, api-sec-3, api-sec-4
  • Other: cache server, NFS server, test server (oversized)

Visually cluttered on purpose - much of this bought nothing.

The diagnosis

30 days of real metrics, not guesses

Rather than guessing, the engagement started with 30 days of real production metrics pulled from the client's monitoring platform. The data told a blunt story:

  • The 11 application servers ran at 0.2%-2.6% average CPU
  • Memory utilization sat below 15% across the fleet
  • 3 of 6 load balancers had zero or disabled backends - pure waste
  • The two dedicated bare-metal servers duplicated each other's function

30-day average CPU per application server

  • web-1: 1.8%
  • web-2: 2.1%
  • web-3: 0.9%
  • api-1: 2.6%
  • api-2: 2.3%
  • api-3: 1.4%
  • api-4: 0.7%
  • api-sec-1: 0.4%
  • api-sec-2: 0.2%
  • api-sec-3: 0.3%
  • api-sec-4: 0.5%

The gap between actual usage and the target line is the waste.
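
The per-server averages lend themselves to a simple check. The following is a minimal sketch, not the tooling used in the engagement: it takes the 30-day figures listed above and counts how many servers fall under a near-idle threshold. The 10% threshold is a hypothetical value chosen for illustration, since the case study does not state where the target line sits.

```python
# 30-day average CPU per application server, in percent (figures from the case study).
AVG_CPU = {
    "web-1": 1.8, "web-2": 2.1, "web-3": 0.9,
    "api-1": 2.6, "api-2": 2.3, "api-3": 1.4, "api-4": 0.7,
    "api-sec-1": 0.4, "api-sec-2": 0.2, "api-sec-3": 0.3, "api-sec-4": 0.5,
}

# Hypothetical threshold: the case study does not give the actual target line.
NEAR_IDLE_THRESHOLD = 10.0


def near_idle(avg_cpu, threshold):
    """Return the names of servers whose average CPU is below the threshold."""
    return sorted(name for name, cpu in avg_cpu.items() if cpu < threshold)


def fleet_average(avg_cpu):
    """Unweighted mean of the per-server averages."""
    return sum(avg_cpu.values()) / len(avg_cpu)


idle = near_idle(AVG_CPU, NEAR_IDLE_THRESHOLD)
print(f"{len(idle)} of {len(AVG_CPU)} servers below {NEAR_IDLE_THRESHOLD}% average CPU")
print(f"fleet average: {fleet_average(AVG_CPU):.1f}%")
```

Averages alone hide bursts, so a real sizing exercise for campaign-driven traffic would likely look at peak or percentile CPU as well before removing capacity.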

EUR 1,147 / month, split

  • ~EUR 787: over-provisioned or dead
  • ~EUR 360: actually needed

Most of the monthly bill bought nothing.

The solution

A lean, cloud-native, autoscaling architecture

The infrastructure was redesigned as a lean, cloud-native, autoscaling architecture with a proper high-availability database cluster.

1. PostgreSQL high-availability cluster

The two dedicated bare-metal servers were replaced with a managed PostgreSQL cluster: a primary node and a physical streaming replica in a separate datacenter, continuous replication with lag monitoring, failover within minutes, and automated backups to object storage on a tiered 7d / 4w / 12m retention.
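
To make the tiered retention concrete, here is a minimal sketch of how a 7d / 4w / 12m policy could select backups to keep. The semantics are an assumption: the case study only names the tiers, so this sketch reads them as the newest 7 daily backups, plus the newest backup from each of the latest 4 ISO weeks and each of the latest 12 calendar months. The function name and the sample history are hypothetical.

```python
from datetime import date, timedelta


def backups_to_keep(backup_dates, daily=7, weekly=4, monthly=12):
    """Pick which backups to retain under a tiered policy.

    Assumed semantics (the case study only says "7d / 4w / 12m"):
    the newest `daily` backups, plus the newest backup from each of the
    latest `weekly` ISO weeks and each of the latest `monthly` months.
    """
    newest_first = sorted(set(backup_dates), reverse=True)
    keep = set(newest_first[:daily])

    def keep_newest_per_bucket(bucket_of, limit):
        seen = []
        for day in newest_first:
            bucket = bucket_of(day)
            if bucket in seen:
                continue
            if len(seen) == limit:
                break
            seen.append(bucket)
            keep.add(day)  # first date seen in a bucket is its newest

    keep_newest_per_bucket(lambda d: d.isocalendar()[:2], weekly)  # (ISO year, ISO week)
    keep_newest_per_bucket(lambda d: (d.year, d.month), monthly)
    return sorted(keep)


# Hypothetical history: one backup per day for 400 days.
today = date(2024, 6, 30)
history = [today - timedelta(days=n) for n in range(400)]
kept = backups_to_keep(history)
print(f"{len(kept)} of {len(history)} backups retained")
```

Because the tiers overlap (the newest backup is also the newest of its week and month), the retained set is smaller than 7 + 4 + 12, which is what keeps object-storage usage roughly flat over time.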

2. Hot-table partitioning

The two highest-traffic tables, the click and rotator event logs, were re-architected with 12-way hash partitioning, improving query performance and keeping the dataset maintainable at scale.
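
As an illustration of what 12-way hash partitioning looks like in PostgreSQL, the sketch below generates the DDL for a partitioned table. The table name `click_events`, its columns, and the choice of `link_id` as the partition key are hypothetical; the case study does not describe the real schema. The `PARTITION BY HASH` and `FOR VALUES WITH (MODULUS ..., REMAINDER ...)` clauses are standard PostgreSQL declarative partitioning syntax.

```python
# Hypothetical parent table: the real schema is not given in the case study.
PARENT_DDL = (
    "CREATE TABLE click_events ("
    "id bigint NOT NULL, link_id bigint NOT NULL, clicked_at timestamptz NOT NULL"
    ") PARTITION BY HASH (link_id);"
)


def hash_partition_ddl(parent, partitions=12):
    """Build one CREATE TABLE statement per hash partition of `parent`.

    Assumes `parent` was created with PARTITION BY HASH (...).
    """
    return [
        f"CREATE TABLE {parent}_p{remainder:02d} PARTITION OF {parent} "
        f"FOR VALUES WITH (MODULUS {partitions}, REMAINDER {remainder});"
        for remainder in range(partitions)
    ]


print(PARENT_DDL)
for statement in hash_partition_ddl("click_events"):
    print(statement)
```

Hash partitioning spreads rows evenly across the partitions, and queries that filter on the partition key with an equality condition can be pruned to a single partition. Queries that do not filter on the key still scan all twelve, so the choice of key matters.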

3. Zero-downtime migration

The move off bare metal used logical replication: the new cloud cluster synced live from the old database, then a brief low-traffic cutover promoted it to primary, with no service interruption.
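
A logical-replication migration of this kind usually has two parts: wiring up a publication and subscription, then checking that the new cluster has caught up before switching traffic. The sketch below shows both under stated assumptions. The publication and subscription names, the connection string, and the `ready_for_cutover` helper are hypothetical; `CREATE PUBLICATION` and `CREATE SUBSCRIPTION` are standard PostgreSQL statements, and the LSN format is PostgreSQL's usual `high/low` hexadecimal notation.

```python
# Hypothetical names and connection string, for illustration only.
PUBLICATION_SQL = "CREATE PUBLICATION migration_pub FOR ALL TABLES;"  # run on the old primary
SUBSCRIPTION_SQL = (
    "CREATE SUBSCRIPTION migration_sub "
    "CONNECTION 'host=old-db.example dbname=app user=replicator' "
    "PUBLICATION migration_pub;"
)  # run on the new cluster


def lsn_to_int(lsn):
    """Convert a PostgreSQL LSN such as '16/B374D848' to an integer byte position."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def ready_for_cutover(source_lsn, confirmed_lsn, writes_paused):
    """True once writes are paused and the subscriber has confirmed all source WAL.

    In practice source_lsn could come from pg_current_wal_lsn() on the old
    primary and confirmed_lsn from confirmed_flush_lsn in pg_replication_slots.
    """
    return writes_paused and lsn_to_int(confirmed_lsn) >= lsn_to_int(source_lsn)


print(ready_for_cutover("16/B374D848", "16/B374D000", writes_paused=True))
print(ready_for_cutover("16/B374D848", "16/B374D848", writes_paused=True))
```

Logical replication does not carry schema changes or sequence values, so a cutover plan would typically also freeze DDL during the sync and reset sequences on the new cluster before it accepts writes.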

4. Elastic application layer with autoscaling

The 11 application servers were consolidated to 3, placed behind cloud autoscaling. Normal traffic runs on a minimal footprint; a newsletter blast provisions extra nodes automatically, then releases them once the spike passes. The client pays for real demand, not idle worst-case capacity.
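
The scaling behaviour described here can be sketched as a target-tracking rule: size the group so that average CPU lands near a target, never dropping below the baseline of 3 nodes. This is an assumption about how such a policy might work, not the client's actual configuration. The 60% target, the ceiling of 10 nodes, and the function name are all hypothetical.

```python
import math


def desired_nodes(current_nodes, avg_cpu_percent,
                  target_cpu_percent=60.0, min_nodes=3, max_nodes=10):
    """Return the node count that would bring average CPU close to the target.

    Hypothetical policy: scale proportionally to load, then clamp to the
    allowed range so the baseline footprint is never undercut.
    """
    wanted = math.ceil(current_nodes * avg_cpu_percent / target_cpu_percent)
    return max(min_nodes, min(max_nodes, wanted))


print(desired_nodes(3, 2.0))    # quiet period: stays at the baseline
print(desired_nodes(3, 90.0))   # campaign burst: scales out
print(desired_nodes(5, 20.0))   # burst over: scales back in
```

Clamping to a minimum keeps some headroom for the first seconds of a burst, before newly provisioned nodes are ready to take traffic.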

5. Load balancer consolidation

Six load balancers, three of them dead, were consolidated to a single properly sized load balancer.

After - lean and resilient

  • Database: PostgreSQL primary, streaming to a replica in a separate datacenter; backups to object storage
  • Load balancing: single load balancer
  • Autoscaling group: app-1, app-2, app-3, +N on demand

Five boxes instead of twenty-plus. The bill now follows real traffic.

Migration phases

  • Phase 1: Quick wins
  • Phase 2: PostgreSQL migration (0 downtime cutover)
  • Phase 3: Cleanup & handover
  • Phase 4: Backups

The results

The bill cut by two thirds, the platform more reliable

Monthly infrastructure cost
Before: EUR 1,147
After: EUR 363
Saving: EUR 784 / month, EUR 9,408 / year (-68%)

Metric              Before                       After
Monthly cost        EUR 1,147                    EUR 363
Annual cost         EUR 13,764                   EUR 4,356
Servers             13                           5
Load balancers      6                            1
Database            2 bare-metal, overlapping    HA cluster (primary + replica)
Scaling             Manual, static               Automatic
Migration downtime  -                            0 minutes
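
The headline figures follow directly from the two monthly costs. As a quick check, the arithmetic can be reproduced in a few lines; the variable names are illustrative.

```python
# Monthly infrastructure cost in EUR, as reported in the case study.
MONTHLY_BEFORE_EUR = 1147
MONTHLY_AFTER_EUR = 363

monthly_saving = MONTHLY_BEFORE_EUR - MONTHLY_AFTER_EUR
annual_saving = monthly_saving * 12
reduction_pct = 100 * monthly_saving / MONTHLY_BEFORE_EUR

print(f"monthly saving: EUR {monthly_saving}")       # EUR 784
print(f"annual saving:  EUR {annual_saving}")        # EUR 9408
print(f"annual cost:    EUR {MONTHLY_BEFORE_EUR * 12} -> EUR {MONTHLY_AFTER_EUR * 12}")
print(f"reduction:      {round(reduction_pct)}%")    # 68%
```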

  • Monthly cost: -68%
  • Saved: EUR 9,408 / year
  • Servers: 13 -> 5
  • Load balancers: 6 -> 1
  • Migration downtime: 0 min
  • Database resilience: HA cluster

Why it worked

The savings came from measuring before acting

The savings did not come from cutting corners. They came from measuring before acting. Most infrastructure waste is invisible because nobody looks. By starting with 30 days of real metrics, the engagement targeted exactly what was over-provisioned and left everything load-bearing untouched. The result is an architecture that costs less and is more resilient and more modern than what it replaced.

Services applied

What the engagement covered

Infrastructure audit & metrics-driven capacity analysis
PostgreSQL cluster design, partitioning & zero-downtime migration
Cloud-native re-architecture with autoscaling
Cost optimization
CI/CD verification

Facing something similar?

A short conversation is usually enough to tell whether it is worth a deeper look. No commitment to start.