Replication is the process of creating redundant resources that are consistent with each other. The goal of replication is to improve reliability, accessibility, or fault tolerance. There are several ways to do database replication:
- Master-Master (multi-master) replication: all nodes can execute read-write queries. It is more complex in terms of conflict resolution; if conflicts are resolved eagerly, it will be slower. In PostgreSQL, there are some tools for this: BDR (asynchronous) and Bucardo (asynchronous).
- Master-Slave replication: one master node handles writes, and all nodes can execute read queries. It is simple, but with heavy write traffic the master node becomes a bottleneck. In PostgreSQL, there are some tools for this: WAL (Write-Ahead Log, a built-in PostgreSQL feature, full database replication), repmgr (a wrapper for WAL and built-in streaming replication), and Slony-I and pglogical (selective replication).
In this article, I will explain how to set up master-slave replication using repmgr. repmgr can create a full database replication and run the replica as a hot standby (streaming-replicated) server.
Here, I will set up a single master node and a single slave. The slave database can be used as a read replica. I will use Ubuntu 14.04 and PostgreSQL 9.5. Some parts of this article are taken from the repmgr documentation, with some missing steps added.
Step 1: Install PostgreSQL 9.5 and repmgr to master and slave servers
In order to get the latest PostgreSQL packages, we need to add the PostgreSQL Apt Repository and then install the database and repmgr.
POSTGRES_VERSION=9.5
echo "deb http://apt.postgresql.org/pub/repos/apt/ $(lsb_release -sc)-pgdg main" | sudo tee -a /etc/apt/sources.list.d/pgdg.list
wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | sudo apt-key add -
sudo apt-get -y update
sudo apt-get -y install postgresql-${POSTGRES_VERSION} postgresql-${POSTGRES_VERSION}-repmgr
Step 2: Allow nodes to SSH into each other
In the next step, we will clone the master database using repmgr. To do the cloning, the slave node must be able to SSH into its upstream (i.e. master) node. Here, we make the postgres users on both servers able to talk to each other. You can use another method, but make sure the postgres user on every node can SSH into all the others.
In your local machine, create SSH key pairs:
mkdir -p /tmp/postgres_keypairs
cd /tmp/postgres_keypairs
ssh-keygen -f ./id_rsa -t rsa -N ''
cp id_rsa.pub authorized_keys
tar zcvf ../postgres_keypairs.tar.gz ./
scp ../postgres_keypairs.tar.gz ubuntu@repmgr_node1:~/
scp ../postgres_keypairs.tar.gz ubuntu@repmgr_node2:~/
cd ..
rm -r postgres_keypairs postgres_keypairs.tar.gz
Adjust the scp commands so they copy the key pairs to all your database cluster nodes, i.e. the master and the slave.
In the master node, run the following commands:
mkdir -p /tmp/postgres_keypairs
tar zxvf ~/postgres_keypairs.tar.gz -C /tmp/postgres_keypairs
sudo chown -R postgres:postgres /tmp/postgres_keypairs
sudo chmod go-rwx /tmp/postgres_keypairs/*
sudo -H -u postgres sh -c 'rm -rf ~/.ssh && mv /tmp/postgres_keypairs ~/.ssh'
rm ~/postgres_keypairs.tar.gz
In the slave node, do the same thing:
mkdir -p /tmp/postgres_keypairs
tar zxvf ~/postgres_keypairs.tar.gz -C /tmp/postgres_keypairs
sudo chown -R postgres:postgres /tmp/postgres_keypairs
sudo chmod go-rwx /tmp/postgres_keypairs/*
sudo -H -u postgres sh -c 'rm -rf ~/.ssh && mv /tmp/postgres_keypairs ~/.ssh'
rm ~/postgres_keypairs.tar.gz
In the master node, SSH to slave node using postgres user:
sudo -u postgres ssh -o StrictHostKeyChecking=no -A postgres@repmgr_node2 uptime
In the slave node, SSH to master node using postgres user:
sudo -u postgres ssh -o StrictHostKeyChecking=no -A postgres@repmgr_node1 uptime
Step 3: Configure Master – PostgreSQL configuration
As I wrote before, repmgr is basically a wrapper for the built-in WAL and streaming replication features. So, we need to activate both features by editing the PostgreSQL configuration (it is in /etc/postgresql/9.5/main/postgresql.conf by default).
POSTGRES_VERSION=9.5
sudo -u postgres vim /etc/postgresql/${POSTGRES_VERSION}/main/postgresql.conf
Update postgresql.conf so it has the following configuration.
# Make PostgreSQL listen on all IP addresses.
# If you want to make it more secure, you can make your database listen on
# specific IP addresses only. But make sure the other nodes in the database
# cluster can connect to this server.
listen_addresses = '*'

# Ensure WAL files contain enough information to enable read-only queries
# on the standby
wal_level = 'hot_standby'

# Enable up to 10 replication connections
max_wal_senders = 10

# How much WAL to retain on the master to allow a temporarily
# disconnected standby to catch up again. The larger this is, the
# longer the standby can be disconnected. This is needed prior to
# 9.4; from 9.4, replication slots can be used instead (see below).
#
# NOTE: a logfile segment takes 16MB. Ensure you have enough free
# disk capacity by multiplying 16MB by this value.
wal_keep_segments = 100

# Enable read-only queries on a standby
# (Note: this will be ignored on a master but we recommend including
# it anyway)
hot_standby = on

# If archive_mode is enabled, check that 'archive_command' is non-empty
# (however it's not practical to check that it actually represents a valid
# command).
#
# From PostgreSQL 9.5, archive_mode can be one of 'off', 'on' or 'always',
# so for ease of backwards compatibility, rather than explicitly checking
# for an enabled mode, check that it's not 'off'.
archive_mode = on

# Set archive_command to a script or application that will safely store
# your WALs in a secure place. /bin/true is an example of a command that
# ignores archiving. Use something more sensible, like wal-e, to back up
# the archives.
archive_command = '/bin/true'

# Replication slots were introduced with PostgreSQL 9.4 and
# are designed to ensure that any standby connected to the master
# using a replication slot will always be able to retrieve
# the required WAL files.
#
# This removes the need to manually manage WAL file retention by estimating
# the number of WAL files that need to be maintained on the master
# using wal_keep_segments.
#
# NOTE: enable only in version 9.4 and above. Do however be aware that
# if a standby is disconnected, WAL will continue to accumulate on
# the master until either the standby reconnects or the replication slot
# is dropped.
max_replication_slots = 5
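To size disk space for wal_keep_segments, a quick back-of-the-envelope calculation helps (this assumes the default 16 MB segment size; it differs only if PostgreSQL was built with a non-default --with-wal-segsize):

```shell
# Rough disk estimate for WAL retention on the master:
# each WAL segment is 16 MB by default, so retaining 100 segments
# can occupy up to 100 * 16 = 1600 MB.
WAL_KEEP_SEGMENTS=100
SEGMENT_SIZE_MB=16
RESERVED_MB=$((WAL_KEEP_SEGMENTS * SEGMENT_SIZE_MB))
echo "up to ${RESERVED_MB} MB of WAL may be retained"
```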
Step 4: Configure Master – Create a replication user
Create a dedicated PostgreSQL superuser account and a database for the repmgr metadata, e.g.
sudo -u postgres createuser -s repmgr
sudo -u postgres createdb repmgr -O repmgr
Ensure the repmgr user has appropriate permissions in pg_hba.conf and can connect in replication mode.
POSTGRES_VERSION=9.5
sudo -u postgres vim /etc/postgresql/${POSTGRES_VERSION}/main/pg_hba.conf
Put the following lines at the top of pg_hba.conf:
local   replication   repmgr                    trust
host    replication   repmgr   127.0.0.1/32     trust
host    replication   repmgr   192.168.1.0/24   trust

local   repmgr        repmgr                    trust
host    repmgr        repmgr   127.0.0.1/32     trust
host    repmgr        repmgr   192.168.1.0/24   trust
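Note that trust lets anyone on those networks connect as repmgr without a password; it keeps this tutorial simple, but on a shared network you would likely swap it for password authentication, e.g. (the CIDR range is the same placeholder as above, with a password set beforehand via ALTER USER repmgr PASSWORD '...'):

```
host    replication   repmgr   192.168.1.0/24   md5
host    repmgr        repmgr   192.168.1.0/24   md5
```

repmgr itself will then need the password available non-interactively, e.g. via a ~/.pgpass file for the postgres user.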
Adjust these according to your network environment and authentication requirements, then restart the PostgreSQL server.
sudo service postgresql restart
Step 5: Configure Master – repmgr configuration
Create the repmgr configuration file in /etc/repmgr.conf.
sudo vim /etc/repmgr.conf
Add the following configuration into repmgr.conf:
# An arbitrary name for the replication cluster;
# this must be identical on all nodes
cluster = 'test'

# A unique integer identifying the node
node = 1

# A unique string identifying the node;
# avoid names indicating the current replication role like 'master' or
# 'standby' as the server's role could change.
node_name = 'node1'

# A valid connection string for the repmgr database on the current server.
conninfo = 'host=repmgr_node1 user=repmgr dbname=repmgr'

# Use replication slots if you enabled them in the
# PostgreSQL configuration.
use_replication_slots = 1
repmgr will create a schema named after the cluster and prefixed with repmgr_, e.g. repmgr_test; we also recommend that you set the repmgr user's search path to include this schema name, e.g.
ALTER USER repmgr SET search_path TO repmgr_test, "$user", public;
Step 6: Configure Master – Initialise the master server
To enable repmgr to support a replication cluster, the master node must be registered with repmgr, which creates the repmgr database and adds a metadata record for the server:
sudo -u postgres repmgr -f /etc/repmgr.conf master register
Run the following query to check the result of the registration.
SELECT * FROM repmgr_test.repl_nodes;
The metadata record looks like this:
 id |  type  | upstream_node_id | cluster | name  |                  conninfo                   |   slot_name   | priority | active
----+--------+------------------+---------+-------+---------------------------------------------+---------------+----------+--------
  1 | master |                  | test    | node1 | host=repmgr_node1 user=repmgr dbname=repmgr | repmgr_slot_1 |      100 | t
(1 row)
Each server in the replication cluster will have its own record and will be updated when its status or role changes.
Step 7: Configure Slave – repmgr configuration
Create the repmgr configuration file in /etc/repmgr.conf on the slave node.
sudo vim /etc/repmgr.conf
The configuration is almost the same as on the master node. Add the following configuration into repmgr.conf:
cluster = 'test'
node = 2
node_name = 'node2'
conninfo = 'host=repmgr_node2 user=repmgr dbname=repmgr'
use_replication_slots = 1
Step 8: Configure Slave – clone master database
We need to stop the PostgreSQL server instance on the slave node and remove its PostgreSQL data directory. After that, we can clone the master database into the slave.
POSTGRES_VERSION=9.5
sudo service postgresql stop
sudo -u postgres rm -r /var/lib/postgresql/${POSTGRES_VERSION}/main
sudo -u postgres repmgr -h repmgr_node1 -U repmgr -d repmgr -D /var/lib/postgresql/${POSTGRES_VERSION}/main -f /etc/repmgr.conf standby clone
sudo mv /var/lib/postgresql/${POSTGRES_VERSION}/main/pg_hba.conf /etc/postgresql/${POSTGRES_VERSION}/main/pg_hba.conf
sudo mv /var/lib/postgresql/${POSTGRES_VERSION}/main/postgresql.conf /etc/postgresql/${POSTGRES_VERSION}/main/postgresql.conf
sudo mv /var/lib/postgresql/${POSTGRES_VERSION}/main/pg_ident.conf /etc/postgresql/${POSTGRES_VERSION}/main/pg_ident.conf
sudo service postgresql start
Adjust repmgr_node1 to your master's IP address or domain name in the standby clone command. Running it produces log output like:
[2016-04-28 10:28:51] [NOTICE] destination directory '/var/lib/postgresql/9.5/main' provided
[2016-04-28 10:28:51] [NOTICE] starting backup (using pg_basebackup)...
[2016-04-28 10:28:51] [HINT] this may take some time; consider using the -c/--fast-checkpoint option
NOTICE:  pg_stop_backup complete, all required WAL segments have been archived
[2016-04-28 10:28:53] [NOTICE] copying configuration files from master
receiving incremental file list
postgresql.conf
         23,275 100%   22.20MB/s    0:00:00 (xfr#1, to-chk=0/1)
receiving incremental file list
pg_hba.conf
          5,027 100%    4.79MB/s    0:00:00 (xfr#1, to-chk=0/1)
receiving incremental file list
pg_ident.conf
          1,636 100%    1.56MB/s    0:00:00 (xfr#1, to-chk=0/1)
[2016-04-28 10:28:55] [NOTICE] standby clone (using pg_basebackup) complete
[2016-04-28 10:28:55] [NOTICE] you can now start your PostgreSQL server
[2016-04-28 10:28:55] [HINT] for example : pg_ctl -D /var/lib/postgresql/9.5/main start
[2016-04-28 10:28:55] [HINT] After starting the server, you need to register this standby with "repmgr standby register"
This will clone the PostgreSQL data directory files from the master at repmgr_node1 using PostgreSQL's pg_basebackup utility. A recovery.conf file containing the correct parameters to start streaming from this master server will be created automatically, and unless otherwise configured, the postgresql.conf and pg_hba.conf files will be copied from the master.
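One caveat: the configuration files are copied over rsync/SSH, so the clone step can fail if your nodes listen on a non-default SSH port. One way to handle this (the host name and port below are placeholders) is an entry in the postgres user's ~/.ssh/config on the slave:

```
Host repmgr_node1
    Port 2222
```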
Step 9: Verify replication is running on master node
We need to ensure that our setup is correct and that replication is running properly. Run the following SQL query on the master database.
repmgr=# SELECT * FROM pg_stat_replication;
-[ RECORD 1 ]----+------------------------------
pid              | 7455
usesysid         | 16712
usename          | repmgr
application_name | node2
client_addr      | 10.130.27.55
client_hostname  |
client_port      | 58687
backend_start    | 2016-04-28 10:03:55.103915-04
backend_xmin     |
state            | streaming
sent_location    | 0/15000288
write_location   | 0/15000288
flush_location   | 0/15000288
replay_location  | 0/15000288
sync_priority    | 0
sync_state       | async
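Beyond eyeballing the *_location columns, you can ask the master how far behind each standby is in bytes. A small sketch, using the pre-10 function names that apply to PostgreSQL 9.5 (renamed to pg_wal_lsn_diff/pg_current_wal_lsn in PostgreSQL 10):

```sql
-- Run on the master. byte_lag should be 0 (or close to it) when the
-- standby has replayed everything the master has written.
SELECT application_name,
       pg_xlog_location_diff(pg_current_xlog_location(), replay_location) AS byte_lag
  FROM pg_stat_replication;
```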
Step 10: Register Slave node
Register the slave node as a standby server with:
sudo -u postgres repmgr -f /etc/repmgr.conf standby register
[2016-04-28 10:40:27] [NOTICE] standby node correctly registered for cluster test with id 2 (conninfo: host=repmgr_node2 user=repmgr dbname=repmgr)
Connect to the slave node's repmgr database and check the repl_nodes table:
repmgr=# SELECT * FROM repmgr_test.repl_nodes ORDER BY id;
 id |  type   | upstream_node_id | cluster | name  |                  conninfo                   |   slot_name   | priority | active
----+---------+------------------+---------+-------+---------------------------------------------+---------------+----------+--------
  1 | master  |                  | test    | node1 | host=repmgr_node1 user=repmgr dbname=repmgr | repmgr_slot_1 |      100 | t
  2 | standby |                1 | test    | node2 | host=repmgr_node2 user=repmgr dbname=repmgr | repmgr_slot_2 |      100 | t
(2 rows)
The standby server now has a copy of the records for all servers in the replication cluster. Note that the relationship between master and standby is explicitly defined via the upstream_node_id value, which shows here that the standby's upstream server is the replication cluster master.
Now we have master-slave replication up and running. Read more about repmgr in its GitHub repository (https://github.com/2ndQuadrant/repmgr). Things you can do with repmgr:
- Cascading replication: enables a standby server to replicate from another standby server rather than directly from the master.
- Promoting a standby server: turns a standby server into a master server.
- Following a new master server: makes a standby server replicate from a new master server.
- Automatic failover: detects a failed node, automatically promotes a standby to master, and makes the other nodes follow the new master.
- etc.
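As a rough sketch of how the first two items map onto repmgr commands (syntax from repmgr 3.x; check the documentation for your version, since the command set has changed between releases):

```shell
# Promote this standby to master (run on the standby to be promoted,
# after the old master is confirmed down):
sudo -u postgres repmgr -f /etc/repmgr.conf standby promote

# Point a remaining standby at the new master (run on each other standby):
sudo -u postgres repmgr -f /etc/repmgr.conf standby follow
```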
In the next article in this series, I will show you how to set up scheduled binary backups using wal-e.
Hi
Thanks for the writeup. There’s an error in “Step 8: Configure Slave – clone master database” – if you move the “recovery.conf” file out of the data directory as suggested then start PostgreSQL, it won’t start in recovery mode, so will become a standalone server. Even if your distribution places other “.conf” files outside of the data directory, “recovery.conf” must be in the data directory to be effective.
Hi Ian, thanks for noticing this issue. I already removed that part.
Brilliant topic!
I was searching for a few days with no good result and was about to give up, so thank you for this tutorial.
Finally, I want to ask: if the master server goes down, will the slave come up automatically?
And when writing to the slave server during the master's downtime, what should I do to sync the slave and master again, or will it happen automatically when the master comes back up?
I hope I can communicate with you via chat.
Thanks a lot.
Hi Edward,
I always get an error like the one below:
ERROR: unable to connect to local node 2 ("node2") and no master connection parameters provided
I configured everything with nothing missed, but I always get this error. However, I'm using PostgreSQL 9.4.
Any suggestions?
The same was happening to me.
I added the pg_hba.conf entries (Step 4) to the slave.
And also these 2 lines:
listen_addresses = '*'
hot_standby = on
to the slave's postgresql.conf.
And it worked fine.
In Step 8 (cloning), it fails if the SSH port is not the default (22). How can I correct this?