Restore databases in AWS RDS

Home
Backups

Warning This document has not been updated for a while now. It may be out of date.

Last updated: 10 Apr 2024

This playbook describes how to restore a database instance using Amazon’s RDS Backups feature.

We use RDS Backups to give us full nightly backups and point-in-time recovery (PITR) (also known as continuous data protection or CDP).

We also generate automated backups using a backups lambda. These are stored in an s3 bucket per environment but do not include a snapshot history.

Restore an RDS instance via the AWS CLI

This documentation will illustrate how to restore a database (DB) instance from a DB Snapshot with AWS CLI.

Before you get started you need to know:

The environment in which you are restoring the database - replace throughout the scripts
The name of the database which needs to be restored - if you are restoring multiple databases, you will need to carry out these steps again for it

For more information, read the AWS documentation on Restoring from a DB Snapshot.

1. Retrieve the relevant database information

In this example, we’re using describe-db-instances to identify the instances we want a snapshot for.

# Find the database you want to find snapshots for (easily identified by its name)
aws rds describe-db-instances | jq '.DBInstances | .[] | {DBInstanceIdentifier, DBName}'
DATABASE_ID="<replace_with_previous_output>"

Find and export the relevant VPC and Security Group configuration for your RDS restore

aws rds describe-db-instances \
 --db-instance-identifier terraform-20230623123439228000000001 \
 --query 'DBInstances[].[VpcSecurityGroups[].VpcSecurityGroupId,DBParameterGroups[].DBParameterGroupName,DBSubnetGroup.DBSubnetGroupName]'

Example of the output:

vpc-security-group-id = sg-XXXXXXXX
db-parameter-group-name = local-links-manager-postgres-XXXXXXXXXX
db-subnet-group-name = blue-govuk-rds-subnet

Now export the result:

DB_SUBNET_GROUP_NAME="<replace_with_previous_output>"
VPC_SECURITY_GROUP_ID="<replace_with_previous_output>" # A comma-separated list of sg ids
DB_PARAMETER_GROUP_NAME="<replace_with_previous_output>"

2. Retrieve a list of all snapshot ARNs for your database name

aws rds describe-db-snapshots | jq '.DBSnapshots | .[] | select(.DBInstanceIdentifier = "$DATABASE_ID") | {DBInstanceIdentifier, DBSnapshotIdentifier}'

# Decide which snapshot you want and set its identifier below
SNAPSHOT_IDENTIFIER="<replace_with_previous_output>"

3. Restore the database instance from a snapshot

The restored database must have the same security groups and be in the same VPC (that’s the “subnet group name” parameter) as the original one, otherwise, apps won’t be able to connect to it. Therefore the database needs to be restored in the same VPC and with the same security groups as the original instance the snapshot came from.

Using the stored variables from the previous steps:

aws rds restore-db-instance-from-db-snapshot \
  --db-subnet-group-name $DB_SUBNET_GROUP_NAME \
  --db-instance-identifier restored-$DATABASE_ID \
  --db-snapshot-identifier $SNAPSHOT_ARN \
  --vpc-security-group-ids $VPC_SECURITY_GROUP_ID

To see the newly created database instance, log into AWS Console > RDS > Databases > filter for your database name. You should see the original and newly created one.

4. Test the database has been fully restored

Before moving on to the next step we need to ensure that the database has been fully restored and is ready to be used:

aws rds wait db-instance-available --db-instance-identifier restored-${DATABASE_ID}

This command will wait until the database is ready, and then exit without any output.

5. Get the new database’s hostname

Make a note of the new endpoint address:

aws rds describe-db-instances \
  --db-instance-identifier "restored-${DATABASE_ID}" \
  --query 'DBInstances[].Endpoint.Address'

6. Update the existing secrets manager secret value

This requires updating the existing secrets manager secret for the database you’ve just restored

Log in to AWS in the correct environment: development, staging or production
In AWS Secrets Manager, search for and click on the relevant secret
Under the “Overview” tab, in the “Secret Value” section, select “Retrieve Secret Value”.
Make a note of the existing value, in case you need to revert the changes (for example, if performing a drill).
Click “Edit”, and replace the value of the host and dbInstanceIdentifier fields with the URL and identifier of the new database instance. Click “Save”.

7. Redeploy the affected ECS applications

The execution role in ECS passes secret values to a new revision of the application. This process is triggered by a standard deployment using terraform with a new docker image.

Open an empty PR you want to cut a release for
Seek approval to merge this PR (for staging and production releases)
Manually gated production releases will need to be approved after the staging workflow has completed

You’ll want to keep an eye on the #tariff-alerts channel and validate the application is still running using your usual process.