Skip to main content
Back to Articles
Server & DevOpsJanuary 10, 20268 min read

MySQL Master-Slave Replication Setup Guide

Configure MySQL master-slave replication for read scaling, disaster recovery, and high availability with GTID-based replication and monitoring.

Why Replication Matters

As application traffic grows, the database becomes the primary bottleneck. A single MySQL server handles both reads and writes, and read traffic typically outweighs writes by 10:1 or more. Master-slave replication distributes read queries across multiple replicas while the master handles all writes, providing both read scaling and disaster recovery.

This guide covers GTID-based replication setup, monitoring, and failover procedures that we implement when managing production servers.

Architecture

Application
    ├── Writes → Master (MySQL 8.0, 192.168.1.10)
    └── Reads  → HAProxy → Slave 1 (192.168.1.11)
                         → Slave 2 (192.168.1.12)
                         → Slave 3 (192.168.1.13)

HAProxy load-balances read traffic across healthy replicas. If a replica falls behind or becomes unavailable, HAProxy routes traffic to the remaining healthy nodes.

Master Configuration

Configure the master server's MySQL settings:

# /etc/mysql/mysql.conf.d/master.cnf
[mysqld]
server-id                = 1
log_bin                  = /var/log/mysql/mysql-bin
binlog_format            = ROW
binlog_row_image         = FULL
gtid_mode                = ON
enforce_gtid_consistency = ON
sync_binlog              = 1
innodb_flush_log_at_trx_commit = 1

# Retain binary logs for 7 days
binlog_expire_logs_seconds = 604800

# Performance tuning
innodb_buffer_pool_size  = 8G
innodb_log_file_size     = 1G
innodb_io_capacity       = 2000
max_connections          = 500

Create Replication User

CREATE USER 'repl_user'@'192.168.1.%'
  IDENTIFIED WITH caching_sha2_password BY 'strong_password_here';

GRANT REPLICATION SLAVE ON *.* TO 'repl_user'@'192.168.1.%';
FLUSH PRIVILEGES;

Verify Master Status

SHOW MASTER STATUS\G
-- Note the File and Position values
-- With GTID, you can also check:
SELECT @@global.gtid_executed;

Slave Configuration

Configure each replica:

# /etc/mysql/mysql.conf.d/slave.cnf
[mysqld]
server-id                = 2  # Unique per slave
log_bin                  = /var/log/mysql/mysql-bin
binlog_format            = ROW
gtid_mode                = ON
enforce_gtid_consistency = ON
relay_log                = /var/log/mysql/relay-bin
read_only                = ON
super_read_only          = ON
log_slave_updates        = ON

# Parallel replication
slave_parallel_workers   = 4
slave_parallel_type      = LOGICAL_CLOCK
slave_preserve_commit_order = ON

# Performance
innodb_buffer_pool_size  = 8G

Initialize Replication

Take a consistent backup from the master:

mysqldump --all-databases --single-transaction \
  --routines --triggers --events \
  --master-data=2 --set-gtid-purged=ON \
  -u root -p > master_backup.sql

Restore on the slave and start replication:

-- On the slave
SOURCE /tmp/master_backup.sql;

CHANGE MASTER TO
  MASTER_HOST='192.168.1.10',
  MASTER_USER='repl_user',
  MASTER_PASSWORD='strong_password_here',
  MASTER_AUTO_POSITION=1,
  MASTER_SSL=1;

START SLAVE;

Verify Replication Status

SHOW SLAVE STATUS\G

Key fields to check:

  • Slave_IO_Running: Yes — the I/O thread is receiving binary logs
  • Slave_SQL_Running: Yes — the SQL thread is applying events
  • Seconds_Behind_Master: 0 — replication lag
  • Retrieved_Gtid_Set and Executed_Gtid_Set should match master

HAProxy Configuration for Read Distribution

# /etc/haproxy/haproxy.cfg
frontend mysql_read
    bind *:3307
    mode tcp
    default_backend mysql_slaves

backend mysql_slaves
    mode tcp
    balance leastconn
    option mysql-check user haproxy_check

    server slave1 192.168.1.11:3306 check inter 5s fall 3 rise 2
    server slave2 192.168.1.12:3306 check inter 5s fall 3 rise 2
    server slave3 192.168.1.13:3306 check inter 5s fall 3 rise 2

The application connects to port 3306 for writes (master) and port 3307 for reads (HAProxy-balanced slaves).

Monitoring Replication Health

Automated Lag Monitoring

#!/bin/bash
# /usr/local/bin/check_replication.sh
THRESHOLD=30
LAG=$(mysql -u monitor -p'monitor_pass' -h 192.168.1.11 \
  -e "SHOW SLAVE STATUS\G" | grep "Seconds_Behind_Master" | awk '{print $2}')

if [ "$LAG" == "NULL" ]; then
  echo "CRITICAL: Replication is broken"
  exit 2
elif [ "$LAG" -gt "$THRESHOLD" ]; then
  echo "WARNING: Replication lag is ${LAG}s (threshold: ${THRESHOLD}s)"
  exit 1
else
  echo "OK: Replication lag is ${LAG}s"
  exit 0
fi

Prometheus Monitoring

Use the mysqld_exporter to expose replication metrics:

# prometheus alert rule
- alert: MySQLReplicationLag
  expr: mysql_slave_status_seconds_behind_master > 30
  for: 5m
  labels:
    severity: warning
  annotations:
    summary: "MySQL replication lag on {{ $labels.instance }}"
    description: "Slave is {{ $value }}s behind master"

Failover Procedure

When the master fails, promote a slave:

-- On the most up-to-date slave
STOP SLAVE;
RESET SLAVE ALL;
SET GLOBAL read_only = OFF;
SET GLOBAL super_read_only = OFF;

-- Point other slaves to the new master
-- On remaining slaves:
STOP SLAVE;
CHANGE MASTER TO
  MASTER_HOST='192.168.1.11',
  MASTER_AUTO_POSITION=1;
START SLAVE;

Update application configuration and HAProxy to point writes to the new master. For automated failover, consider MySQL Group Replication or Orchestrator.

Best Practices

  • Always use GTID-based replication for simpler failover
  • Enable super_read_only on slaves to prevent accidental writes
  • Monitor Seconds_Behind_Master and alert at 30 seconds
  • Test failover procedures quarterly in staging
  • Keep binary log retention at 7 days minimum for recovery flexibility
  • Use SSL/TLS for replication traffic between data centers
  • Consider semi-synchronous replication for zero data loss requirements

Properly configured replication transforms a single point of failure into a resilient, scalable database tier. For complex multi-region setups, our infrastructure management team can design and implement the right topology for your workload.

Need help with this?

Our team handles this kind of work daily. Let us take care of your infrastructure.