Restoring wal-e PostgreSQL database backup

If you have a backup, you must be able to restore it. In the previous post, I wrote about creating a scheduled backup using wal-e. Now, we will restore that wal-e backup.

NOTE: This post covers restoring a PostgreSQL base backup that has no user-defined tablespaces. If you have defined any, you can read more here.

Step 1: Install PostgreSQL and wal-e

I assume that you have already installed PostgreSQL and wal-e. You need to install the same PostgreSQL major version, plus all the extensions you used when creating the base backup. FYI, PostgreSQL has a different binary file format for each major version. The major version is determined by the first two groups of the version number (e.g.: 8.4, 9.0, 9.5, etc.).
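
For illustration, given a hypothetical full version string, the major version for these pre-10 releases can be derived in the shell by keeping the first two number groups:

```shell
# Hypothetical example: derive the major version from a full version string.
# For pre-10 releases the major version is the first two number groups.
pg_version="9.5.3"
pg_major="${pg_version%.*}"   # strip the last ".N" group: 9.5.3 -> 9.5
echo "$pg_major"
```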

Before we proceed to the next step, you need to stop the PostgreSQL server and remove the data directory.

sudo service postgresql stop
sudo rm -rf /var/lib/postgresql/${POSTGRES_VERSION}/main

If you haven’t installed wal-e, you can follow Step 3 of the setup article below.

Step 2: Download Base Backup

Now, we can download our base backup:

sudo -u postgres envdir /etc/wal-e.d/env /usr/local/bin/wal-e backup-fetch /var/lib/postgresql/${POSTGRES_VERSION}/main LATEST

The above command downloads the latest base backup you have. You can list your base backup versions by running:

sudo -u postgres envdir /etc/wal-e.d/env /usr/local/bin/wal-e backup-list

Run backup-fetch again, replacing the LATEST argument with a base backup name from backup-list.

Step 3: Configure PostgreSQL recovery mode

Edit your PostgreSQL recovery configuration file:

sudo -u postgres vim /var/lib/postgresql/${POSTGRES_VERSION}/main/recovery.conf

Add the following configuration:

# The command that will be executed while recovering
restore_command = 'envdir /etc/wal-e.d/env /usr/local/bin/wal-e wal-fetch "%f" "%p"'

# You can set this value to an exact time. This is useful if
# you had an incident and want to recover to a few moments before it.
# recovery_target_time = '2016-06-02 06:18:00'

Start the PostgreSQL server again.

sudo service postgresql start

Now you will have a fully recovered, running database.
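
If you want to check whether the server is still replaying WAL or has already finished, PostgreSQL's pg_is_in_recovery() function reports this; a quick check could look like:

```shell
# Returns 't' while the server is still in recovery mode, 'f' once it
# has finished replaying WAL and accepts read-write connections.
sudo -u postgres psql -tAc "SELECT pg_is_in_recovery();"
```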

Set up scheduled PostgreSQL backup to AWS S3 using wal-e

In the previous article, I wrote a guide on how to set up PostgreSQL master-slave replication. It is good to have a hot standby database server. You can use your slave servers as read replicas to reduce the load on the master. In case your primary (master) server fails, you still have another server ready to become the new master.

Why should I also have a cold database backup? Imagine someone accidentally deletes a table (let’s say DROP TABLE users;). The drop is replicated to all hot standby databases, and your data is gone. To mitigate this, we can set up a backup (i.e. snapshot) of the database, so that after such an incident you can restore your data. You can back up weekly, daily, or hourly, depending on your RPO (Recovery Point Objective).

There are a few PostgreSQL backup tools. One of them is wal-e. wal-e performs continuous archiving of PostgreSQL WAL files and base backups. The archives are compressed using lzop. The compressed files are slightly bigger than with gzip, but compression takes less CPU.


UPDATE (2016/06/02): As per bluetreble’s comment below, backup retention is now handled by AWS S3 lifecycle management, and the IAM user’s privilege to delete objects in the S3 bucket has been removed. Also, the first base backup is now created after configuring archiving, to make sure it can actually be restored.


Backup on AWS S3

For storing backup files, wal-e supports AWS S3, Azure Blob Storage, and OpenStack Swift. Currently, Google Cloud Storage is not supported, but there is a pull request to add it.

I personally prefer AWS S3. AWS S3 is a secure, durable, and highly scalable cloud storage service managed by Amazon. All objects in AWS S3 are replicated across multiple Availability Zones (AZs) in a region. At the time of writing, AWS promises 99.999999999% (eleven 9s) durability for your files.

Tips using AWS S3 for backup:

  • Choosing a region. Prefer to back up to a different region than the one hosting the original database. In case of a failure in the primary database’s region, you still have cold backups elsewhere.
  • Choosing a storage class. AWS S3 has 4 storage classes: Standard, Standard – Infrequent Access (IA), Reduced Redundancy Storage (RRS), and Glacier. I suggest using either Standard or Standard – IA. Both have 99.999999999% durability. Standard has 99.99% availability, while Standard – IA has only 99.9% at a lower cost. I do not recommend RRS, since it only has 99.99% durability, and Glacier may take hours to restore the data.

Please note that if you put your data in a different region, there will be additional AWS data transfer charges.

Step 1: Create AWS S3 bucket and set up its lifecycle

First, we need to create a new S3 bucket to store your database backup.

  1. Open AWS S3 Management Console.
  2. Click Create Bucket.
  3. Fill in the bucket name and the region where you want to keep the backup. I will assume the S3 bucket name is pgbackup.
  4. Click Create.

Then, we will enable versioning and set up the lifecycle of the bucket:

  1. Open AWS S3 Management Console.
  2. Right click on your bucket name, and click Properties.
  3. In the Versioning menu, click on Enable Versioning button and click OK for the confirmation dialog.
  4. In the Lifecycle menu,
    1. Click Add rule and a new dialog will appear.
    2. In the Step 1, choose Whole Bucket and click Configure Rule.
    3. In Step 2, check Expire and set its value to 8 days. You can adjust the value based on your requirements. In my case, I want to keep my backups for 7 days and I will create a base backup daily, so I use 7 + 1 = 8 days as the value.
    4. Still in Step 2, check Permanently Delete and set its value to 1 day. This keeps your backup for an extra 1 day after it is marked as expired.
    5. Click Review. You can give this rule an intuitive name like `Database retention rule`.
    6. Click Create and Activate Rule.

Step 2: Create AWS IAM user

Create a new IAM user and take a note of the generated access key and secret key. After that, attach an inline policy to your new user like the following:

  "Version": "2012-10-17",
  "Statement": [
      "Effect": "Allow",
      "Action": ["s3:ListBucket"],
      "Resource": ["arn:aws:s3:::pgbackup"]
      "Effect": "Allow",
      "Action": [
      "Resource": ["arn:aws:s3:::pgbackup/*"]

Don’t forget to adjust the bucket name. The above policy allows the principal (user) to list the bucket and read/write objects in it (no delete privilege, since retention is handled by the S3 lifecycle rule).

If you host your database server on AWS EC2 instances and an IAM role is attached to them, you can skip creating an IAM user and just add the inline policy to the role. Using an IAM role is recommended, but it can only be attached when launching a new EC2 instance.

Step 3: Install and configure wal-e

You can install wal-e on any server that you want to back up. Run the following commands to install wal-e:

sudo apt-get -y update
sudo apt-get -y install daemontools python-dev lzop pv
curl -O
sudo python
export LC_ALL=C
sudo pip install -U six
sudo pip install -U requests
sudo pip install wal-e

Configure the environment variables for wal-e:

umask u=rwx,g=rx,o=
sudo mkdir -p /etc/wal-e.d/env
echo '<AWS_REGION>' | sudo tee /etc/wal-e.d/env/AWS_REGION
echo '<AWS_ACCESS_KEY_ID>' | sudo tee /etc/wal-e.d/env/AWS_ACCESS_KEY_ID
echo '<AWS_SECRET_ACCESS_KEY>' | sudo tee /etc/wal-e.d/env/AWS_SECRET_ACCESS_KEY
echo 's3://pgbackup/<NODE_NAME>' | sudo tee /etc/wal-e.d/env/WALE_S3_PREFIX 
sudo chown -R root:postgres /etc/wal-e.d

Change <AWS_REGION> to your S3 bucket’s region (e.g.: us-east-1). Set <AWS_ACCESS_KEY_ID> and <AWS_SECRET_ACCESS_KEY> to the key pair you got after creating the IAM user. The <NODE_NAME> part is optional; use it whenever you put backups from multiple nodes into a single bucket.
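
As an aside, envdir (from daemontools) simply turns each file in the given directory into an environment variable named after the file. A rough shell sketch of that mechanism, using a temporary directory:

```shell
# Emulate what `envdir /etc/wal-e.d/env <cmd>` does: every file becomes an
# environment variable whose name is the file name and whose value is its content.
dir=$(mktemp -d)
printf 'us-east-1' > "$dir/AWS_REGION"
for f in "$dir"/*; do
  export "$(basename "$f")=$(cat "$f")"
done
echo "$AWS_REGION"   # the variable is now visible to child processes
rm -r "$dir"
```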

If you use IAM role, you do not need to put <AWS_SECRET_ACCESS_KEY> and <AWS_ACCESS_KEY_ID>. wal-e can use your instance’s IAM role. Just append --aws-instance-profile to all wal-e commands below.

Step 4: Configure PostgreSQL

Besides creating base backups, wal-e can archive PostgreSQL WAL files too. We need to configure PostgreSQL to trigger wal-e whenever a new WAL segment is ready. Edit your PostgreSQL configuration using your favorite text editor:

sudo vim /etc/postgresql/${POSTGRES_VERSION}/main/postgresql.conf

Add/update the following configuration:

# wal_level determines how much information is written to the WAL.
# For backup, you need to set this value to 'hot_standby' or 'archive'.
# 'archive' adds the logging required for WAL archiving, and
# 'hot_standby' further adds the information required to run read-only
# queries on a standby server.
wal_level = 'hot_standby'

# Enable archiving
archive_mode = on

# Command that will be triggered whenever a WAL segment is ready
archive_command = 'envdir /etc/wal-e.d/env /usr/local/bin/wal-e wal-push %p'

# The archive_timeout setting forces a switch to a new WAL segment
# if the current segment is not completed within this period.
archive_timeout = 60 # seconds

Restart PostgreSQL server:

sudo service postgresql restart


Now, you can test your wal-e configuration by creating a base backup:

sudo -u postgres /usr/bin/envdir /etc/wal-e.d/env /usr/local/bin/wal-e backup-push /var/lib/postgresql/${POSTGRES_VERSION}/main


The output might look like:

wal_e.main INFO MSG: starting WAL-E
 DETAIL: The subcommand is "backup-push".
 STRUCTURED: time=2016-05-24T14:34:51.764264-00 pid=26204
wal_e.operator.backup INFO MSG: start upload postgres version metadata
 DETAIL: Uploading to s3://pgbackup/node1/basebackups_005/base_000000010000000000000004_00000040/extended_version.txt.
 STRUCTURED: time=2016-05-24T14:34:52.490033-00 pid=26204
wal_e.operator.backup INFO MSG: postgres version metadata upload complete
 STRUCTURED: time=2016-05-24T14:34:52.580803-00 pid=26204
wal_e.worker.upload INFO MSG: beginning volume compression
 DETAIL: Building volume 0.
 STRUCTURED: time=2016-05-24T14:34:52.664456-00 pid=26204
wal_e.worker.upload INFO MSG: begin uploading a base backup volume
 DETAIL: Uploading to "s3://pgbackup/node1/basebackups_005/base_000000010000000000000004_00000040/tar_partitions/part_00000000.tar.lzo".
 STRUCTURED: time=2016-05-24T14:34:53.282628-00 pid=26204
wal_e.worker.upload INFO MSG: finish uploading a base backup volume
 DETAIL: Uploading to "s3://pgbackup/node1/basebackups_005/base_000000010000000000000004_00000040/tar_partitions/part_00000000.tar.lzo" complete at 7570.78KiB/s. 
 STRUCTURED: time=2016-05-24T14:34:53.787137-00 pid=26204
NOTICE: WAL archiving is not enabled; you must ensure that all required WAL segments are copied through other means to complete the backup


Step 5: Create scheduled base backup cronjob

Create a new cronjob that makes the base backup for you:

(sudo -u postgres crontab -l 2>/dev/null; echo "0 2 * * * /usr/bin/envdir /etc/wal-e.d/env /usr/local/bin/wal-e backup-push /var/lib/postgresql/${POSTGRES_VERSION}/main") | sudo -u postgres crontab -

The above command appends a new cronjob that runs daily at 2 AM. Note that ${POSTGRES_VERSION} is expanded by your current shell at the time the crontab is written, so make sure it is set.

In the first step, we already set up the S3 lifecycle, so we do not have to worry about deleting old backup files. AWS will delete them automatically.

What’s next?

Do you need encryption for your data? wal-e can encrypt your data on the client side (on the database server) using gpg. You can follow the documentation.

In the next article, I will show you how to restore the wal-e backup.

Set up PostgreSQL 9.5 Master-Slave Replication using repmgr

Replication is a process of creating redundant resources that are consistent with each other. The goal of replication is to improve reliability, accessibility, or fault-tolerance. There are several ways to do database replication:

  • Master-Master (multi-master) replication: all nodes can execute read-write queries.
    More complex in terms of resolving conflicts, so if we do it eagerly, it will be slower.
    In PostgreSQL, there are some tools: BDR (asynchronous) and Bucardo (asynchronous).
  • Master-Slave replication: one master node for writing and all nodes can execute read queries.
    Simple, but if you have heavy write access, it will have bottlenecks in the master node.
    In PostgreSQL, there are some tools: WAL (Write-Ahead-Log, built-in PostgreSQL feature, full database replication), repmgr (wrapper for WAL and built-in streaming replication), Slony-I and pglogical (selective replication).

In this article, I will explain how to set up master-slave replication using repmgr. repmgr can create a full database replication with a hot standby (streaming-replicated) server.

Here, I will set up a single master node with a single slave. The slave database can be used as a read replica. I will use Ubuntu 14.04 and PostgreSQL 9.5. Some parts of this guide are taken from the repmgr documentation, with some missing steps added.

Step 1: Install PostgreSQL 9.5 and repmgr to master and slave servers

In order to get the latest update for PostgreSQL, we need to add PostgreSQL Apt Repository and then install the database.

echo "deb $(lsb_release -sc)-pgdg main" | sudo tee -a /etc/apt/sources.list.d/pgdg.list
wget --quiet -O - | sudo apt-key add -
sudo apt-get -y update
sudo apt-get -y install postgresql-${POSTGRES_VERSION} postgresql-${POSTGRES_VERSION}-repmgr

Step 2: Allow nodes to SSH to each other

In the next step, we will clone the master database using repmgr. To do the cloning, the slave node must be able to SSH into its upstream (i.e. master) node. Here, we let the postgres users on both servers talk to each other. You can use another method, as long as the postgres user on every node can SSH to the others.

In your local machine, create SSH key pairs:

mkdir -p /tmp/postgres_keypairs
cd /tmp/postgres_keypairs
ssh-keygen -f ./id_rsa -t rsa -N ''
cp authorized_keys
tar zcvf ../postgres_keypairs.tar.gz ./
scp ../postgres_keypairs.tar.gz ubuntu@repmgr_node1:~/
scp ../postgres_keypairs.tar.gz ubuntu@repmgr_node2:~/
cd ..
rm -r postgres_keypairs postgres_keypairs.tar.gz

Adjust the scp commands so they copy the key pairs to all of your database cluster nodes, i.e. the master and the slave.

In the master node, run the following commands:

mkdir -p /tmp/postgres_keypairs
tar zxvf ~/postgres_keypairs.tar.gz -C /tmp/postgres_keypairs
sudo chown -R postgres:postgres /tmp/postgres_keypairs
sudo chmod go-rwx /tmp/postgres_keypairs/*
sudo -H -u postgres sh -c 'rm -rf ~/.ssh && mv /tmp/postgres_keypairs ~/.ssh'
rm ~/postgres_keypairs.tar.gz

In the slave node, do the same thing:

mkdir -p /tmp/postgres_keypairs
tar zxvf ~/postgres_keypairs.tar.gz -C /tmp/postgres_keypairs
sudo chown -R postgres:postgres /tmp/postgres_keypairs
sudo chmod go-rwx /tmp/postgres_keypairs/*
sudo -H -u postgres sh -c 'rm -rf ~/.ssh && mv /tmp/postgres_keypairs ~/.ssh'
rm ~/postgres_keypairs.tar.gz

In the master node, SSH to slave node using postgres user:

sudo -u postgres ssh -o StrictHostKeyChecking=no -A postgres@repmgr_node2 uptime

In the slave node, SSH to master node using postgres user:

sudo -u postgres ssh -o StrictHostKeyChecking=no -A postgres@repmgr_node1 uptime

Step 3: Configure Master – PostgreSQL configuration

As I wrote before, repmgr is basically just a wrapper around PostgreSQL’s built-in WAL archiving and streaming replication. So, we need to activate both features by editing the PostgreSQL configuration (by default in /etc/postgresql/9.5/main/postgresql.conf).

sudo -u postgres vim /etc/postgresql/${POSTGRES_VERSION}/main/postgresql.conf

Update postgresql.conf so it has the following configuration.

# Make PostgreSQL listen on all IP addresses.
# If you want to make it more secure, you can make your database listen
# on specific IP addresses only. But make sure the other cluster nodes
# can connect to this server.
listen_addresses = '*'

# Ensure WAL files contain enough information to enable read-only queries
# on the standby
wal_level = 'hot_standby'

# Enable up to 10 replication connections
max_wal_senders = 10

# How much WAL to retain on the master to allow a temporarily
# disconnected standby to catch up again. The larger this is, the
# longer the standby can be disconnected. This is needed prior to
# 9.4; from 9.4, replication slots can be used instead (see below).
# NOTE: a WAL segment file takes 16MB. Ensure you have enough free
# disk capacity by multiplying 16MB by this value.
wal_keep_segments = 100

# Enable read-only queries on a standby
# (Note: this will be ignored on a master but we recommend including
# it anyway)
hot_standby = on

# If archive_mode is enabled, check that 'archive_command' is non empty
# (however it's not practical to check that it actually represents a valid
# command).
# From PostgreSQL 9.5, archive_mode can be one of 'off', 'on' or 'always'
# so for ease of backwards compatibility, rather than explicitly check for an
# enabled mode, check that it's not "off".
archive_mode = on

# Set archive_command to a script or application that will safely store
# your WALs in a secure place. /bin/true is an example of a command that
# skips archiving. Use something more sensible, like wal-e, to back up
# the archives.
archive_command = '/bin/true'

# Replication slots were introduced with PostgreSQL 9.4 and
# are designed to ensure that any standby connected to the master
# using a replication slot will always be able to retrieve
# the required WAL files.
# This removes the need to manually manage WAL file retention by estimating
# the number of WAL files that need to be maintained on the master
# using wal_keep_segments.
# NOTE: Enable only in version 9.4 and above. Do however be aware that
# if a standby is disconnected, WAL will continue to accumulate on
# the master until either the standby reconnects or the replication slot
# is dropped.
max_replication_slots = 5
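
As a quick sanity check on the wal_keep_segments comment above, with 16MB segments and a value of 100 the master may retain roughly this much WAL:

```shell
# Disk space potentially consumed by retained WAL segments, in MB:
# segment size (16MB) multiplied by wal_keep_segments (100).
wal_keep_mb=$((16 * 100))
echo "${wal_keep_mb} MB"   # about 1.6 GB
```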

Step 4: Configure Master – Create a replication user

Create a dedicated PostgreSQL superuser account and a database for the repmgr metadata, e.g.

sudo -u postgres createuser -s repmgr
sudo -u postgres createdb repmgr -O repmgr

Ensure the repmgr user has appropriate permissions in pg_hba.conf and can connect in replication mode.

sudo -u postgres vim /etc/postgresql/${POSTGRES_VERSION}/main/pg_hba.conf

Put the following lines at the top of pg_hba.conf:

local   replication   repmgr                              trust
host    replication   repmgr            trust
host    replication   repmgr          trust

local   repmgr        repmgr                              trust
host    repmgr        repmgr            trust
host    repmgr        repmgr          trust
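
For reference, the host lines normally carry an address column; a hypothetical example assuming the cluster nodes live in the subnet:

```
host    replication   repmgr    trust
host    repmgr        repmgr    trust
```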

Adjust according to your network environment and authentication requirements. Restart PostgreSQL server.

sudo service postgresql restart

Step 5: Configure Master – repmgr configuration

Create repmgr configuration file in /etc/repmgr.conf.

sudo vim /etc/repmgr.conf

Add the following configuration into repmgr.conf:

# An arbitrary name for the replication cluster;
# This must be identical on all nodes
cluster = 'test'

# A unique integer identifying the node
node = 1

# A unique string identifying the node;
# Avoid names indicating the current replication role like 'master' or 'standby'
# as the server's role could change.
node_name = 'node1'

# A valid connection string for the repmgr database on the current server.
conninfo = 'host=repmgr_node1 user=repmgr dbname=repmgr'

# Use replication slot if you enable replication slots in
# PostgreSQL configuration.
use_replication_slots = 1

repmgr will create a schema named after the cluster and prefixed with repmgr_, e.g. repmgr_test; we also recommend that you set the repmgr user’s search path to include this schema name, e.g.

ALTER USER repmgr SET search_path TO repmgr_test, "$user", public;

Step 6: Configure Master – Initialise the master server

To enable repmgr to support a replication cluster, the master node must be registered with repmgr. This creates the repmgr schema and adds a metadata record for the server:

sudo -u postgres repmgr -f /etc/repmgr.conf master register

Run the following query to check the result of the registration:

SELECT * FROM repmgr_test.repl_nodes;

The metadata record looks like this:

 id |  type  | upstream_node_id | cluster | name  |                   conninfo                  |   slot_name   | priority | active 
----+--------+------------------+---------+-------+---------------------------------------------+---------------+----------+--------
  1 | master |                  | test    | node1 | host=repmgr_node1 user=repmgr dbname=repmgr | repmgr_slot_1 |      100 | t
(1 row)

Each server in the replication cluster will have its own record and will be updated when its status or role changes.

Step 7: Configure Slave – repmgr configuration

Create repmgr configuration file in /etc/repmgr.conf for slave.

sudo vim /etc/repmgr.conf

The configuration is almost the same as on the master node. Add the following configuration to repmgr.conf:

cluster = 'test'
node = 2
node_name = 'node2'
conninfo = 'host=repmgr_node2 user=repmgr dbname=repmgr'
use_replication_slots = 1

 Step 8: Configure Slave – clone master database

We need to stop the PostgreSQL server on the slave node and remove its PostgreSQL data directory. After that, we can clone the master database into the slave.

sudo service postgresql stop
sudo -u postgres rm -r /var/lib/postgresql/${POSTGRES_VERSION}/main
sudo -u postgres repmgr -h repmgr_node1 -U repmgr -d repmgr -D /var/lib/postgresql/${POSTGRES_VERSION}/main -f /etc/repmgr.conf standby clone
sudo mv /var/lib/postgresql/${POSTGRES_VERSION}/main/pg_hba.conf /etc/postgresql/${POSTGRES_VERSION}/main/pg_hba.conf
sudo mv /var/lib/postgresql/${POSTGRES_VERSION}/main/postgresql.conf /etc/postgresql/${POSTGRES_VERSION}/main/postgresql.conf 
sudo mv /var/lib/postgresql/${POSTGRES_VERSION}/main/pg_ident.conf /etc/postgresql/${POSTGRES_VERSION}/main/pg_ident.conf
sudo service postgresql start

Adjust repmgr_node1 to the master’s IP address/domain name in the clone command. When we run it, we will see log output like:

[2016-04-28 10:28:51] [NOTICE] destination directory '/var/lib/postgresql/9.5/main' provided
[2016-04-28 10:28:51] [NOTICE] starting backup (using pg_basebackup)...
[2016-04-28 10:28:51] [HINT] this may take some time; consider using the -c/--fast-checkpoint option
NOTICE: pg_stop_backup complete, all required WAL segments have been archived
[2016-04-28 10:28:53] [NOTICE] copying configuration files from master
receiving incremental file list
 23,275 100% 22.20MB/s 0:00:00 (xfr#1, to-chk=0/1)
receiving incremental file list
 5,027 100% 4.79MB/s 0:00:00 (xfr#1, to-chk=0/1)
receiving incremental file list
 1,636 100% 1.56MB/s 0:00:00 (xfr#1, to-chk=0/1)
[2016-04-28 10:28:55] [NOTICE] standby clone (using pg_basebackup) complete
[2016-04-28 10:28:55] [NOTICE] you can now start your PostgreSQL server
[2016-04-28 10:28:55] [HINT] for example : pg_ctl -D /var/lib/postgresql/9.5/main start
[2016-04-28 10:28:55] [HINT] After starting the server, you need to register this standby with "repmgr standby register"

This clones the PostgreSQL data directory files from the master at repmgr_node1 using PostgreSQL’s pg_basebackup utility. A recovery.conf file containing the correct parameters to start streaming from the master server is created automatically, and unless otherwise configured, the postgresql.conf and pg_hba.conf files are copied from the master.
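
For reference, the generated recovery.conf usually looks roughly like this (the values shown are illustrative, matching the node and slot names used in this article):

```
standby_mode = 'on'
primary_conninfo = 'host=repmgr_node1 user=repmgr dbname=repmgr'
primary_slot_name = 'repmgr_slot_2'
```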

Step 9: Verify replication is running on master node

We need to ensure that our setup is correct, i.e. that replication is running properly. Run the following SQL query on the master database:

repmgr=# SELECT * FROM pg_stat_replication;
-[ RECORD 1 ]----+------------------------------
pid              | 7455
usesysid         | 16712
usename          | repmgr
application_name | node2
client_addr      |
client_hostname  | 
client_port      | 58687
backend_start    | 2016-04-28 10:03:55.103915-04
backend_xmin     | 
state            | streaming
sent_location    | 0/15000288
write_location   | 0/15000288
flush_location   | 0/15000288
replay_location  | 0/15000288
sync_priority    | 0
sync_state       | async
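
To quantify how far a standby lags behind, you can (on 9.x, as a sketch) compute the byte difference between the master's current WAL position and the standby's replay position:

```shell
# Byte lag per connected standby; 0 means fully caught up.
sudo -u postgres psql -c \
  "SELECT application_name,
          pg_xlog_location_diff(pg_current_xlog_location(), replay_location) AS lag_bytes
   FROM pg_stat_replication;"
```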

Step 10: Register Slave node

Register the slave node as a standby server with:

sudo -u postgres repmgr -f /etc/repmgr.conf standby register
[2016-04-28 10:40:27] [NOTICE] standby node correctly registered for cluster test with id 2 (conninfo: host=repmgr_node2 user=repmgr dbname=repmgr)

Connect to the slave node’s repmgr database and check the repl_nodes table:

repmgr=# SELECT * FROM repmgr_test.repl_nodes ORDER BY id;
 id |  type   | upstream_node_id | cluster | name  |                   conninfo                  |   slot_name   | priority | active 
----+---------+------------------+---------+-------+---------------------------------------------+---------------+----------+--------
  1 | master  |                  | test    | node1 | host=repmgr_node1 user=repmgr dbname=repmgr | repmgr_slot_1 |      100 | t
  2 | standby |                1 | test    | node2 | host=repmgr_node2 user=repmgr dbname=repmgr | repmgr_slot_2 |      100 | t
(2 rows)

The standby server now has a copy of the records for all servers in the replication cluster. Note that the relationship between master and standby is explicitly defined via the upstream_node_id value, which shows here that the standby’s upstream server is the replication cluster master.

Now we have a working master-slave replication setup. You can read more about what repmgr can do in its GitHub repository.

In the next article series, I will tell you how to set up a scheduled binary backup using wal-e.

Install PostgreSQL, PostGIS and OpenGeo on Ubuntu 14.04

To install PostgreSQL, PostGIS and OpenGeo, we can follow these steps:

wget -qO- | sudo apt-key add -
echo "deb trusty main" | sudo tee -a /etc/apt/sources.list.d/opengeo.list
sudo apt-get update
sudo apt-get install opengeo

Fixing libgdal1 Error

Unfortunately, I got an error:

Reading package lists... Done
Building dependency tree 
Reading state information... Done
The following packages were automatically installed and are no longer required:
 libarmadillo4 libarpack2 libbcmail-java libbcpkix-java libboost-thread1.54.0
 libcgal10 libcommons-jexl-java libcommons-lang3-java libdap11 libdapclient3
 libepsilon1 libexcalibur-logger-java libfreexl1 libgdal1h
 libgeronimo-javamail-1.4-spec-java libgeronimo-jms-1.1-spec-java
 libhdf4-0-alt libhdf5-7 libhtmlparser-java libhttpmime-java libjcharts-java
 libjtidy-java libkml0 liblwgeom-2.1.2 liblwgeom-2.1.6 libnetcdfc7 libodbc1
 libogdi3.2 libossp-uuid16 libspatialite5 liburiparser1 libxerces-c3.1
 libxstream-java linux-headers-3.13.0-32 linux-headers-3.13.0-32-generic
 linux-image-3.13.0-32-generic linux-image-extra-3.13.0-32-generic
Use 'apt-get autoremove' to remove them.
The following extra packages will be installed:
 libgdal opengeo-client opengeo-dashboard opengeo-docs opengeo-server
 opengeo-tomcat7 pgdg-keyring postgis-2.1 postgresql-9.3
 postgresql-9.3-postgis-2.1 postgresql-client-9.3 postgresql-client-common
Suggested packages:
 oidentd ident-server locales-all postgresql-doc-9.3
The following NEW packages will be installed:
 libgdal opengeo opengeo-client opengeo-dashboard opengeo-docs opengeo-server
 opengeo-tomcat7 pgdg-keyring postgis-2.1 postgresql-9.3
 postgresql-9.3-postgis-2.1 postgresql-client-9.3 postgresql-client-common
0 upgraded, 14 newly installed, 0 to remove and 109 not upgraded.
Need to get 82.6 MB of archives.
After this operation, 222 MB of additional disk space will be used.
Preparing to unpack .../libgdal_1.11.1-1+opengeo_amd64.deb ...
Unpacking libgdal (1.11.1-1+opengeo) ...
dpkg: error processing archive /var/cache/apt/archives/libgdal_1.11.1-1+opengeo_amd64.deb (--unpack):
 trying to overwrite '/usr/lib/', which is also in package libgdal1h 1.10.1+dfsg-5ubuntu1
Selecting previously unselected package postgis-2.1.
Errors were encountered while processing:
E: Sub-process /usr/bin/dpkg returned an error code (1)

The error is caused by libgdal trying to overwrite a file that also belongs to the libgdal1h package. To fix it, we can purge libgdal1h and let apt finish the installation:

sudo dpkg -P libgdal1h
sudo apt-get -f install

Voila! Our PostgreSQL, PostGIS and OpenGeo are ready on our machine.

Bonus: Activate PostGIS Shapefile and DBF loader

To access PostGIS Shapefile and DBF loader menu in pgAdmin III, we can edit /usr/share/pgadmin3/plugins.d/plugins.ini file by adding the following lines:

; pgShapeLoader (Linux):
Title=PostGIS Shapefile and DBF loader
Command=$$PGBINDIR/shp2pgsql-gui -U $$USERNAME -d $$DATABASE -p $$PORT -h $$HOSTNAME
Description=Open a PostGIS ESRI Shapefile or Plain dbf loader console to the current database.

Tutorial Creating OpenStack Instance in TryStack

OpenStack is an open-source cloud computing software platform. OpenStack is primarily used to deploy an infrastructure-as-a-service (IaaS) solution like Amazon Web Services (AWS). In other words, you can build your own AWS using OpenStack. If you want to try out OpenStack, TryStack is the easiest and free way to do it.

In order to try OpenStack on TryStack, you must register by joining the TryStack Facebook Group. Acceptance into the group takes a couple of days because it is approved manually. After you have been accepted, you can log in to TryStack.

(Screenshot: TryStack homepage)

I assume that you have already joined the Facebook Group and logged in to the dashboard. After you log in to TryStack, you will see the Compute Dashboard:

(Screenshot: OpenStack Compute Dashboard)

Overview: What will we do?

In this post, I will show you how to run an OpenStack instance. The instance will be accessible over the internet (it will have a public IP address). The final topology will look like:


(Screenshot: Network topology)

As you can see from the image above, the instance is connected to a local network, and the local network is connected to the internet.


Step 1: Create Network

Network? Yes, the network here is our own local network, so your instances will not be mixed up with anyone else’s. You can think of it as your own LAN (Local Area Network) in the cloud.

  1. Go to Network > Networks and then click Create Network.
  2. In Network tab, fill Network Name for example internal and then click Next.
  3. In Subnet tab,
    1. Fill Network Address with an appropriate CIDR; use a private network CIDR block as the best practice.
    2. Select the appropriate IP Version, in this case IPv4.
    3. Click Next.
  4. In Subnet Details tab, fill DNS Name Servers with a resolver (Google DNS, for example) and then click Create.
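
The same network and subnet can also be created from the command line; a sketch using the era-appropriate neutron client (the CIDR and DNS values here are hypothetical assumptions):

```shell
# Create the network, then attach a subnet with a private CIDR and a DNS server.
neutron net-create internal
neutron subnet-create internal \
  --name internal-subnet --dns-nameserver
```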

Step 2: Create Instance

Now, we will create an instance. An instance is a virtual machine in the cloud, like an AWS EC2 instance. We will connect the instance to the network that we just created in the previous step.

  1. Go to Compute > Instances and then click Launch Instance.
  2. In Details tab,
    1. Fill Instance Name, for example Ubuntu 1.
    2. Select Flavor, for example m1.medium.
    3. Fill Instance Count with 1.
    4. Select Instance Boot Source with Boot from Image.
    5. Select Image Name with Ubuntu 14.04 amd64 (243.7 MB) if you want to install Ubuntu 14.04 on your virtual machine.
  3. In Access & Security tab,
    1. Click [+] button of Key Pair to import key pair. This key pair is a public and private key that we will use to connect to the instance from our machine.
    2. In Import Key Pair dialog,
      1. Fill Key Pair Name with your machine name (for example Edward-Key).
      2. Fill Public Key with your SSH public key (usually found in ~/.ssh/). See the description in the Import Key Pair dialog box for more information. If you are using Windows, you can use PuTTYgen to generate a key pair.
      3. Click Import key pair.
    3. In Security Groups, mark/check default.
  4. In Networking tab,
    1. In Selected Networks, select network that have been created in Step 1, for example internal.
  5. Click Launch.
  6. If you want to create multiple instances, you can repeat steps 1-5. I created one more instance named Ubuntu 2.
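The same launch can be sketched with the `openstack` client (assumptions: the key file path is an example, the `--network` option is available in newer clients while older ones use `--nic net-id=...`, and the image and flavor names must match what your cloud offers):

```shell
# Import your SSH public key as a key pair (example name: Edward-Key;
# the path to your public key may differ on your machine)
openstack keypair create --public-key ~/.ssh/id_rsa.pub Edward-Key

# Boot the instance on the internal network, using the example names
# from the steps above
openstack server create \
  --image "Ubuntu 14.04 amd64" \
  --flavor m1.medium \
  --key-name Edward-Key \
  --network internal \
  "Ubuntu 1"
```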

Step 3: Create Router

I guess you already know what a router is. In Step 1, we created our network, but it is isolated and has no connection to the internet. To give our network an internet connection, we need a router running as the gateway to the internet.

  1. Go to Network > Routers and then click Create Router.
  2. Fill Router Name, for example router1, and then click Create router.
  3. Click your router name link, for example router1, to open the Router Details page.
  4. Click Set Gateway button in upper right:
    1. Select external as the External Network.
    2. Then click OK.
  5. Click Add Interface button.
    1. Select the subnet of the network that you created in Step 1.
    2. Click Add interface.
  6. Go to Network > Network Topology. You will see the network topology: in this example there are two networks, external and internal, bridged by a router, and there are instances joined to the internal network.
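The router steps above can be sketched on the command line as well (a non-authoritative sketch; `internal-subnet` is an example subnet name, and the commands assume a cloud where the external network is literally named external):

```shell
# Create the router and use the external network as its gateway
openstack router create router1
openstack router set --external-gateway external router1

# Attach the internal subnet to the router so internal instances
# can reach the internet through it
openstack router add subnet router1 internal-subnet
```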

Step 4: Configure Floating IP Address

A floating IP address is a public IP address. It makes your instance accessible from the internet. When you launch your instance, it gets a private network IP but no public IP. In OpenStack, public IPs are collected in a pool and managed by the admin (in our case, TryStack). You need to request a public (floating) IP address to be assigned to your instance.

  1. Go to Compute > Instances.
  2. In one of your instances, click More > Associate Floating IP.
  3. In IP Address, click Plus [+].
  4. Select external as the Pool and then click Allocate IP.
  5. Click Associate.
  6. Now your instance has a public IP address.
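The CLI equivalent is a two-step allocate-and-associate (a sketch; 203.0.113.10 is a documentation placeholder standing in for whatever address the first command actually returns):

```shell
# Allocate a floating IP from the external pool
openstack floating ip create external

# Associate it with the instance; replace 203.0.113.10 with the
# address printed by the previous command
openstack server add floating ip "Ubuntu 1" 203.0.113.10
```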

Step 5: Configure Access & Security

OpenStack has a firewall-like feature called a Security Group. It can whitelist or blacklist inbound and outbound connections to your instances.

  1. Go to Compute > Access & Security and then open Security Groups tab.
  2. In default row, click Manage Rules.
  3. Click Add Rule, choose the ALL ICMP rule to allow ping to your instance, and then click Add.
  4. Click Add Rule, choose HTTP rule to open HTTP port (port 80), and then click Add.
  5. Click Add Rule, choose SSH rule to open SSH port (port 22), and then click Add.
  6. You can open other ports by creating new rules.
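The three rules above can also be added from the command line (a sketch; it assumes the group is named default and that you have the `openstack` client pointed at your cloud):

```shell
# Allow ping (ICMP), HTTP (port 80), and SSH (port 22) into instances
# that are in the default security group
openstack security group rule create --protocol icmp default
openstack security group rule create --protocol tcp --dst-port 80 default
openstack security group rule create --protocol tcp --dst-port 22 default
```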

Step 6: SSH to Your Instance

Now, you can SSH to your instances using the floating IP address that you got in Step 4. If you are using an Ubuntu image, the SSH user is ubuntu.
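For example (the key path and the address 203.0.113.10 are placeholders; use your own key file and the floating IP assigned to your instance):

```shell
# Log in as the default cloud-image user with your private key
ssh -i ~/.ssh/id_rsa ubuntu@203.0.113.10
```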

[Linux] Creating and Applying a Kernel Patch

A patch is essentially a file that records the differences between two things so that one can be transformed to match the other. The same idea applies to the Linux kernel: when the kernel has a (small) change, you only need to send a patch against the existing kernel, so there is no need to upload the whole kernel. A patch file is generally tiny compared with the full kernel tarball.

Creating a Kernel Patch

  1. Make sure you have kernel source that has not been modified (a vanilla kernel).
    Suppose you have the vanilla kernel in the ~/kernel/linux- directory; go to the ~/kernel directory.

    $ cd ~/kernel
  2. If you have not made your modifications yet, duplicate the vanilla kernel, for example under the name linux-
    $ cp -rf linux- linux-
  3. Make sure the modified kernel source is clean of temporary files that do not belong in the comparison with the vanilla source.
    Some text editors (including gedit) create backup files whose names end with ~. To remove them,

    $ rm `find . -name "*~"`
  4. Compare the two directories and create the patch file at the same time.
    $ diff -ruN linux- linux- > patch-linux-

    The command above compares the two directories, with the first linux- as the original source code.
    The patch-linux- part is the output file of the comparison, which is the patch file itself.
Applying a Kernel Patch

  1. Suppose you modified linux- into linux- You can duplicate linux- first, for example as linux-
    $ cp -rf linux- linux-
  2. Apply the patch to linux-, assuming the patch file is patch-linux-
    $ patch -p1 -d linux- < patch-linux-
  3. The linux- directory will now be identical to linux- You can then compile the kernel with the patch applied.
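The apply step can be verified the same way on a throwaway tree (a sketch; all file and directory names here are invented for the demo):

```shell
set -e
demo=$(mktemp -d)
cd "$demo"

# Original tree, plus a modified reference tree to diff against
mkdir -p orig
printf 'hello\n' > orig/greeting.txt
cp -rf orig modified
printf 'world\n' >> modified/greeting.txt

# Generate the patch (diff exits 1 when the trees differ)
diff -ruN orig modified > demo.patch || true

# Apply it to a fresh copy of the original; -p1 strips the leading
# directory component from paths in the patch, -d changes into the
# target tree before applying
cp -rf orig restored
patch -p1 -d restored < demo.patch

# The patched tree now matches the modified one (no diff output)
diff -r restored modified && echo "trees identical"
```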