Database Replication Setup
Configure database replication for high availability, disaster recovery, and read scaling across major database platforms.
Replication Overview
Database replication creates copies of your data across multiple servers, providing fault tolerance, load distribution, and geographic distribution capabilities.
Replication Benefits
- High Availability: Automatic failover to replica if primary fails
- Read Scaling: Distribute read queries across replicas
- Disaster Recovery: Geographic distribution of data
- Zero-Downtime Maintenance: Perform updates on replicas first
- Analytics Isolation: Run reports on replicas without impacting production
PostgreSQL Replication
Streaming Replication Setup
Primary Server Configuration:
# postgresql.conf
wal_level = replica
max_wal_senders = 10
wal_keep_size = 1GB  # wal_keep_segments = 64 on PostgreSQL 12 and earlier
archive_mode = on
archive_command = 'test ! -f /archive/%f && cp %p /archive/%f'
# pg_hba.conf
host replication replicator 192.168.1.0/24 md5
Create Replication User:
CREATE ROLE replicator WITH REPLICATION LOGIN PASSWORD 'secure_password';
Replica Server Setup:
# Stop PostgreSQL and clear data directory
systemctl stop postgresql
rm -rf /var/lib/postgresql/13/main/*
# Base backup from primary
pg_basebackup -h primary_host -U replicator -D /var/lib/postgresql/13/main -Fp -Xs -P -R
# Start replica
systemctl start postgresql
Logical Replication
For selective table replication and cross-version compatibility:
-- On publisher
CREATE PUBLICATION my_pub FOR TABLE users, orders;
-- On subscriber
CREATE SUBSCRIPTION my_sub
CONNECTION 'host=primary_host dbname=mydb user=replicator'
PUBLICATION my_pub;
Monitoring Replication
-- On primary: check connected replicas
SELECT * FROM pg_stat_replication;
-- On replica: check lag
SELECT now() - pg_last_xact_replay_timestamp() AS replication_lag;
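Lag can also be measured in bytes by comparing LSN columns from `pg_stat_replication` (for example `sent_lsn` vs `replay_lsn`). A minimal sketch of that arithmetic, with hypothetical helper names:

```python
def lsn_to_bytes(lsn: str) -> int:
    """Convert a PostgreSQL LSN string like '0/3000148' to an absolute byte position.
    The part before the slash counts 4 GiB units; the part after is a hex offset."""
    hi, lo = lsn.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)

def replication_lag_bytes(sent_lsn: str, replay_lsn: str) -> int:
    """Byte distance between what the primary has sent and what the replica has replayed."""
    return lsn_to_bytes(sent_lsn) - lsn_to_bytes(replay_lsn)

# Example: replica has replayed everything except 232 bytes
print(replication_lag_bytes("0/3000148", "0/3000060"))  # 232
```

Byte-based lag is useful alongside the time-based query above because `pg_last_xact_replay_timestamp()` stalls when the primary is idle, while LSN distance stays meaningful.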
MySQL Replication
Master-Slave Replication
Master Configuration:
# my.cnf
[mysqld]
server-id = 1
log_bin = mysql-bin
binlog_format = ROW
expire_logs_days = 7  # deprecated in MySQL 8.0; use binlog_expire_logs_seconds = 604800
max_binlog_size = 100M
# Create replication user
CREATE USER 'replicator'@'%' IDENTIFIED BY 'secure_password';
GRANT REPLICATION SLAVE ON *.* TO 'replicator'@'%';
FLUSH PRIVILEGES;
Slave Configuration:
# my.cnf
[mysqld]
server-id = 2
relay_log = relay-bin
read_only = 1
# Configure the replica (MySQL 8.0.23+ also accepts CHANGE REPLICATION SOURCE TO / START REPLICA)
CHANGE MASTER TO
MASTER_HOST='master_host',
MASTER_USER='replicator',
MASTER_PASSWORD='secure_password',
MASTER_LOG_FILE='mysql-bin.000001',
MASTER_LOG_POS=154;
START SLAVE;
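Once replication is running, `SHOW SLAVE STATUS` reports thread state and lag. A minimal health-check sketch over that output, treating the row as a column-to-value dict (the function name and the 30-second threshold are illustrative, not MySQL defaults):

```python
def replica_healthy(status: dict, max_lag_seconds: int = 30) -> bool:
    """Evaluate a SHOW SLAVE STATUS row. Both replication threads must be
    running and lag must stay under the threshold; Seconds_Behind_Master
    is NULL (None here) when the SQL thread is not replaying events."""
    if status.get("Slave_IO_Running") != "Yes":
        return False
    if status.get("Slave_SQL_Running") != "Yes":
        return False
    lag = status.get("Seconds_Behind_Master")
    return lag is not None and lag <= max_lag_seconds

row = {"Slave_IO_Running": "Yes", "Slave_SQL_Running": "Yes",
       "Seconds_Behind_Master": 2}
print(replica_healthy(row))  # True
```

Checking both threads matters: the IO thread can keep fetching binlogs while the SQL thread has stopped on an error, leaving lag frozen at a misleadingly small value.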
MySQL Group Replication
Multi-master setup with automatic failover:
# Install plugin
INSTALL PLUGIN group_replication SONAME 'group_replication.so';
# Configure
SET GLOBAL group_replication_group_name = '8a84f357-aaa4-11e6-86c2-0800277e33ae';
SET GLOBAL group_replication_start_on_boot = ON;
SET GLOBAL group_replication_local_address = 'server1:33061';
SET GLOBAL group_replication_group_seeds = 'server1:33061,server2:33061,server3:33061';
# Bootstrap group (first member only)
SET GLOBAL group_replication_bootstrap_group = ON;
START GROUP_REPLICATION;
SET GLOBAL group_replication_bootstrap_group = OFF;
MongoDB Replication
Replica Set Configuration
# Start MongoDB with replica set
mongod --replSet rs0 --bind_ip localhost,192.168.1.100
# Initialize replica set
rs.initiate({
_id: "rs0",
members: [
{ _id: 0, host: "mongo1:27017", priority: 2 },
{ _id: 1, host: "mongo2:27017", priority: 1 },
{ _id: 2, host: "mongo3:27017", priority: 1 }
]
})
# Optionally add an arbiter to restore an odd number of votes
# (only needed when the set has an even number of data-bearing members)
rs.addArb("mongo4:27017")
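Elections in a replica set require a strict majority of voting members, which is why odd voter counts matter. A small sketch of the arithmetic (helper names are illustrative):

```python
def majority(voting_members: int) -> int:
    """Votes needed to elect a primary: a strict majority of voting members."""
    return voting_members // 2 + 1

def can_elect_primary(voting_members: int, reachable: int) -> bool:
    """A replica set can elect a primary only while a majority of voters is reachable."""
    return reachable >= majority(voting_members)

# A 3-member set tolerates one failure...
print(can_elect_primary(3, 2))  # True
# ...but a 4-voter set still tolerates only one, since majority(4) == 3
print(can_elect_primary(4, 2))  # False
```

This is the reason a fourth data-bearing member adds no fault tolerance over three, and why an arbiter is used to break even voter counts rather than to add capacity.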
Read Preference Configuration
// Application connection string
mongodb://mongo1:27017,mongo2:27017,mongo3:27017/mydb?replicaSet=rs0&readPreference=secondaryPreferred
// In application
db.collection.find().readPref('secondary')
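The read preference modes can be illustrated with a routing sketch that picks candidate nodes from a member list. This is a simplified model of a subset of MongoDB's modes, not the driver's actual server-selection logic (which also weighs latency and staleness):

```python
def eligible_nodes(nodes: list, mode: str) -> list:
    """Pick read targets from (host, is_primary) pairs for a given mode."""
    primaries = [host for host, is_primary in nodes if is_primary]
    secondaries = [host for host, is_primary in nodes if not is_primary]
    if mode == "primary":
        return primaries
    if mode == "secondary":
        return secondaries
    if mode == "secondaryPreferred":
        return secondaries or primaries  # fall back to primary if no secondary
    if mode == "primaryPreferred":
        return primaries or secondaries  # fall back to secondaries if no primary
    raise ValueError(f"unsupported mode: {mode}")

members = [("mongo1", True), ("mongo2", False), ("mongo3", False)]
print(eligible_nodes(members, "secondaryPreferred"))  # ['mongo2', 'mongo3']
```

`secondaryPreferred` is the common choice for read scaling because reads keep working (against the primary) even when all secondaries are down.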
Monitoring Replica Set
# Check status
rs.status()
# Check replication lag (renamed rs.printSecondaryReplicationInfo() in MongoDB 4.4+)
rs.printSlaveReplicationInfo()
# Check oplog size
rs.printReplicationInfo()
Redis Replication
Master-Replica Setup
Replica Configuration:
# redis.conf on replica
replicaof master_host 6379
masterauth master_password
replica-read-only yes
replica-serve-stale-data yes
Redis Sentinel for HA
# sentinel.conf
port 26379
sentinel monitor mymaster 192.168.1.100 6379 2
sentinel auth-pass mymaster master_password
sentinel down-after-milliseconds mymaster 5000
sentinel parallel-syncs mymaster 1
sentinel failover-timeout mymaster 10000
# Start sentinel
redis-sentinel /etc/redis/sentinel.conf
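Sentinel's failover decision has two stages: the configured quorum (2 in the config above) is enough to mark the master objectively down, but authorizing the actual failover still requires a majority of all known Sentinels. A sketch of that logic (function names are illustrative):

```python
def master_objectively_down(down_votes: int, quorum: int) -> bool:
    """ODOWN: at least `quorum` Sentinels agree the master is subjectively down."""
    return down_votes >= quorum

def failover_authorized(total_sentinels: int, agreeing: int) -> bool:
    """Even after ODOWN, starting a failover needs a majority of all Sentinels."""
    return agreeing >= total_sentinels // 2 + 1

# With 3 Sentinels and quorum 2: two votes reach ODOWN and also form a majority
print(master_objectively_down(2, 2), failover_authorized(3, 2))  # True True
```

This is why running only two Sentinels is fragile: quorum 2 can be met, but losing either Sentinel leaves no majority to authorize a failover.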
Redis Cluster
# Create cluster
redis-cli --cluster create \
192.168.1.101:6379 192.168.1.102:6379 192.168.1.103:6379 \
192.168.1.104:6379 192.168.1.105:6379 192.168.1.106:6379 \
--cluster-replicas 1
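Redis Cluster assigns every key to one of 16384 slots using CRC16, honoring `{hash tags}` so related keys land on the same node. A sketch of the slot calculation (a plain-Python model of what `CLUSTER KEYSLOT` computes):

```python
def crc16_xmodem(data: bytes) -> int:
    """CRC-16/XMODEM (poly 0x1021, init 0x0000), the checksum Redis Cluster uses."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc

def key_slot(key: str) -> int:
    """Map a key to one of 16384 slots; a non-empty {tag} replaces the key for hashing."""
    start = key.find("{")
    if start != -1:
        end = key.find("}", start + 1)
        if end != -1 and end != start + 1:  # tag must be non-empty
            key = key[start + 1:end]
    return crc16_xmodem(key.encode()) % 16384

# Same hash tag -> same slot, so a multi-key operation on both keys is allowed
print(key_slot("{user1}.orders") == key_slot("{user1}.cart"))  # True
```

Hash tags are the standard way to keep keys that must be accessed together (e.g. in a MULTI or Lua script) on one shard despite cluster partitioning.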
Cross-Database Replication
Change Data Capture (CDC)
| Tool | Source DBs | Target DBs | Features |
|---|---|---|---|
| Debezium | MySQL, PostgreSQL, MongoDB | Kafka, Elasticsearch | Real-time CDC, schema evolution |
| AWS DMS | Most major databases | AWS services | Managed service, continuous sync |
| GoldenGate | Oracle, MySQL, PostgreSQL | Multiple targets | Enterprise features, filtering |
Debezium Setup Example
# Kafka Connect configuration
{
"name": "mysql-connector",
"config": {
"connector.class": "io.debezium.connector.mysql.MySqlConnector",
"database.hostname": "mysql_host",
"database.port": "3306",
"database.user": "debezium",
"database.password": "password",
"database.server.id": "184054",
"database.server.name": "dbserver1",
"database.include.list": "inventory",
"database.history.kafka.bootstrap.servers": "kafka:9092",
"database.history.kafka.topic": "schema-changes.inventory"
}
}
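Registering this connector means POSTing a `{"name": ..., "config": ...}` body to the Kafka Connect REST API (`/connectors`). A sketch that assembles and sanity-checks such a payload before sending it; the required-key list here is illustrative, not the connector's authoritative schema:

```python
import json

# Illustrative subset of settings the MySQL connector needs (assumption, not a spec)
REQUIRED_KEYS = {
    "connector.class", "database.hostname", "database.port",
    "database.user", "database.password",
    "database.server.id", "database.server.name",
}

def connector_payload(name: str, config: dict) -> str:
    """Build the JSON registration body, failing fast on obviously missing keys."""
    missing = REQUIRED_KEYS - config.keys()
    if missing:
        raise ValueError(f"missing config keys: {sorted(missing)}")
    return json.dumps({"name": name, "config": config})
```

Validating locally before the POST turns a vague HTTP 400 from Kafka Connect into an immediate, named error.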
Replication Topologies
Common Patterns
- Master-Slave: One primary, multiple read replicas
- Master-Master: Multi-write with conflict resolution
- Cascading: Replicas have their own replicas
- Ring: Circular replication for geographic distribution
- Star: Central hub with multiple spokes
Choosing a Topology
| Topology | Use Case | Pros | Cons |
|---|---|---|---|
| Master-Slave | Read scaling | Simple, no conflicts | Single write point |
| Master-Master | Geographic distribution | Write anywhere | Conflict resolution |
| Cascading | Bandwidth optimization | Reduced master load | Increased lag |
Failover and Recovery
Automatic Failover Tools
- PostgreSQL: Patroni, repmgr, pg_auto_failover
- MySQL: MHA, Orchestrator, ProxySQL
- MongoDB: Built-in replica set elections
- Redis: Redis Sentinel
Manual Failover Process
1. Verify the primary is truly down
2. Ensure the replica is fully synchronized
3. Promote the replica to primary
4. Redirect application connections
5. Reconfigure other replicas
6. Repair and rejoin the old primary as a replica
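The runbook above can be modeled as an ordered pipeline that stops at the first failing step, so a partially completed failover is immediately visible (step and function names are illustrative):

```python
FAILOVER_STEPS = [
    "verify_primary_down",
    "confirm_replica_synced",
    "promote_replica",
    "redirect_applications",
    "repoint_other_replicas",
    "rejoin_old_primary_as_replica",
]

def run_failover(actions: dict) -> list:
    """Run each step's callable in order; each returns True on success.
    Stop at the first failure and report which steps completed."""
    completed = []
    for step in FAILOVER_STEPS:
        if not actions[step]():
            break
        completed.append(step)
    return completed
```

Encoding the order in one place keeps ad-hoc failovers from skipping the synchronization check, which is the step whose omission silently loses committed writes.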
Split-Brain Prevention
- Use odd number of nodes for quorum
- Implement STONITH (Shoot The Other Node In The Head)
- Use witness/arbiter nodes
- Configure proper fencing mechanisms
Performance Optimization
Replication Lag Reduction
- Parallel replication: Apply changes in parallel threads
- Compression: Reduce network bandwidth usage
- Dedicated network: Separate replication traffic
- SSD storage: Faster I/O for replay
Monitoring Metrics
- Replication lag (seconds behind master)
- Replication throughput (MB/s)
- Queue sizes (relay log, oplog)
- Network latency between nodes
- Conflict resolution frequency
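A simple way to act on the lag metric is threshold-based classification per replica; the thresholds below are placeholders to adapt to your workload, not recommendations:

```python
def lag_alerts(lag_by_replica: dict, warn: float = 10.0, crit: float = 60.0) -> dict:
    """Classify per-replica lag (in seconds) against warning/critical thresholds."""
    levels = {}
    for replica, lag in lag_by_replica.items():
        if lag >= crit:
            levels[replica] = "critical"
        elif lag >= warn:
            levels[replica] = "warning"
        else:
            levels[replica] = "ok"
    return levels

print(lag_alerts({"replica-a": 2.0, "replica-b": 15.0, "replica-c": 120.0}))
# {'replica-a': 'ok', 'replica-b': 'warning', 'replica-c': 'critical'}
```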
Best Practices
- Security: Use SSL/TLS for replication traffic
- Monitoring: Alert on replication lag exceeding thresholds
- Testing: Regular failover drills
- Documentation: Maintain runbooks for common scenarios
- Capacity: Ensure replicas can handle full load
- Backup: Don't rely solely on replication for backups
Troubleshooting
Common Issues
- Replication lag: Check network, I/O, long-running queries
- Broken replication: Verify credentials, network connectivity
- Data inconsistency: Use checksums to verify
- Full transaction logs: Increase retention, add replicas
Recovery Procedures
- Skip problematic transactions (use with caution)
- Rebuild replica from backup
- Use point-in-time recovery
- Perform consistency checks