Skip to main content

How to restore the database

Introduction

It may be necessary on occasion to restore the database, for example if the database has become corrupted or following an incident. This document outlines the steps needed to perform a database restore.

Accessing the database

Please note Database access will be provided to UKSA and/or suppliers offline as required on a principle of least privilege basis.
  • Access the database using GOV.UK PaaS and your GOV.UK PaaS credentials
  • Run cf env msh-spacetrack-dev​ (change app name as appropriate) to output the environment variables
  • From VCAP_SERVICES read the Postgres​ section for "credentials"​and use to access the database

Restore a database using RDS instance snapshot

It’s also possible to use a different method to restore a copy of the database without need of creating a dump. This is more convienient way to restore after an incident. The detailed description is covered in GOV.UK PaaS Documentation.

It’s been tested on Monitor Space Hazards database/environment and the whole restore process takes about 15 minutes instead of hours, when dumping the database using pg_dump. Restoring the database using snapshots has some limitations (cited from GOV.UK PaaS documentation):

  • Can only restore the most recent snapshot from the latest nightly backup
  • Cannot restore from a service instance that has been deleted
  • Must use the same service plan for the copy as for the original service instance
  • Must create the new service instance in the same organisation and space as the original. This is to prevent unauthorised access to data between spaces. If you need to copy data to a different organisation and/or space, you can connect to your PostgreSQL instance from a local machine using Conduit.

Prerequisites

  • Postgres database installed on target machine
  • Access to pg_restore tool
  • Access to command line tool
  • Python 3.x installed
  • Copy of Monitor Space Hazards Python backend from Github (needed for alembic command and for restoring the database schema)
  • Installed all the Python requirements.txt tools (needed for alembic command)
  • Optional: virtual environment created and set via source venv/bin/activate

The database is archived using the Postgres pg_dump tool. It is archived into a directory in compressed format.

Please note The database is archived with data only (`–data-only` switch in `pg_dump`). It is not archived with the database schema as we are using the Alembic migration tool which builds and modifies the schema prior to the restore.

Restore process

To restore the database into a new location, the following steps need to be completed:

  • run psql in command prompt
  • setup a clean database named “postgres” with command CREATE DATABASE postgres; in psql
  • run alembic upgrade head in command line from the main application folder. This will create the database structure

    Warning main application folder is the main folder of the cloned Github repository where subfolder migrations/ resides

  • run pg_restore -d postgres -p 5444 -F d path_to_folder_with_db_backup. The port number (5444) may be different in your case.

The whole process can take a couple of hours to finish, depending on the amount of data in the archived database.

This page was last reviewed on 6 April 2022. It needs to be reviewed again on 6 July 2022 .
This page was set to be reviewed before 6 July 2022. This might mean the content is out of date.