AWS practical guide
Periodically back up your MongoDB on Amazon S3
Schedule a task on your Windows machine running MongoDB to back up to S3
This article aims to show how simple it is to use AWS Tools for PowerShell and Amazon S3 to back up your data. Note that while I mention a Windows machine in the diagram below, this setup is not restricted to Windows. You can also have MongoDB running on a Linux machine: PowerShell runs on Linux too, so just replace the Windows scheduled task with a cron job.
The way this works is a simple two-step process:
- Set your backup schedule using Windows Task Scheduler to run a PowerShell script
- Write a PowerShell script. This script first imports AWS Tools for PowerShell, then does a MongoDB dump, compresses it, and uploads the compressed backup to Amazon S3
To get this running, you will need to prepare a few things on your database box and on the AWS side.
Let us take care of the AWS side first. You will need two things:
First: An S3 bucket.
If you have an existing bucket that you want to use, use that. Otherwise, create a new one, which is what I am going to assume here: log in to your AWS Management Console, go to S3 and create a bucket.
Second: An IAM user with access key.
If you have an existing user that you want to use, it’s fine to go with that. Otherwise, go to IAM and create a new user; I am going to assume that you will do this via the AWS Management Console. The IAM create user screen has an option to generate an access key as part of user creation, which saves you from doing it as a separate step afterwards. To use it, make sure you tick the programmatic access checkbox on the IAM user creation screen.
On the next screen, we need to give this user permission to upload to the S3 bucket. For the sake of simplicity and demonstration, we will create a new IAM policy that allows writes to the S3 bucket created earlier and attach it to this user; this is just one way of doing it. So, on the second screen of the IAM user creation wizard, choose Attach existing policies directly and click the Create policy button.
A policy creation wizard will open in a new tab.
Here, you can either use the visual editor to craft your policy, or if you would like to save some time, simply switch to JSON and paste in this code.
This IAM policy allows the s3:PutObject action on just the mymongodbbackupdemo S3 bucket. If you named your S3 bucket differently, replace the ARN with the ARN of your bucket. Click the Next button and proceed through to create the policy.
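A minimal policy matching that description might look like the sketch below. It assumes the bucket name mymongodbbackupdemo used throughout this article, and the Sid value is just an illustrative label. Note the trailing /* on the resource: s3:PutObject applies to the objects inside the bucket, not to the bucket itself.

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowBackupUpload",
      "Effect": "Allow",
      "Action": "s3:PutObject",
      "Resource": "arn:aws:s3:::mymongodbbackupdemo/*"
    }
  ]
}
```

Granting only s3:PutObject means the backup user can write new backups but cannot list, read or delete what is already in the bucket.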
When the policy has been created, you should be able to select it in the IAM user creation wizard. Proceed with the wizard. When the user has been created, you will see the access key ID and secret access key displayed. Do not share these with anyone, and remember that the secret access key is only displayed once, so make sure that you save it somewhere safe. You will need both when setting up the script on your database box later.
In case you lose it, simply go back to IAM, find your user, go to the Security Credentials tab and create a new access key. Make sure that you deactivate the existing ones.
Next, on your database machine, you need to:
- Ensure that you have PowerShell. If you are running a Windows machine, you are probably fine, as PowerShell is included by default unless you are operating a dinosaur version of Windows. If you are running Linux, follow these instructions from Microsoft through to the end. On Windows, don’t forget to set the execution policy; otherwise your PowerShell script won’t execute at all.
- Install AWS Tools for PowerShell. If you are on Windows, follow these instructions from AWS. For Linux, go here.
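As a rough sketch, the two preparation steps above could look like this when done from a PowerShell prompt. It assumes you install the tools from the PowerShell Gallery rather than with the Windows installer, and the RemoteSigned policy is only an assumed choice; use whatever your environment allows.

```powershell
# Windows only: allow locally created scripts to run.
# RemoteSigned is an assumed choice, not a requirement of this setup.
Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser

# Install AWS Tools for PowerShell from the PowerShell Gallery.
# On Linux (PowerShell Core), install AWSPowerShell.NetCore instead.
Install-Module -Name AWSPowerShell -Scope CurrentUser

# Confirm that the S3 upload cmdlet used later is available.
Get-Command Write-S3Object
```

If you install the module this way, you can load it by name with Import-Module AWSPowerShell instead of pointing at the installer path under Program Files.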
Piece them together
Okay, now we have done all the prep work. We just need a PowerShell script to piece this together, and a task to run the script automatically on a set schedule.
So, fire up the PowerShell ISE, or simply create a new .ps1 file, put the code below in it, and save it somewhere local on your DB machine. We’ll be using the mongodump utility to export the contents of our database. Pay attention to the paths and parameters, and make them match the paths on your machine. Also take note of the AWS parameters, like bucket name and region.
Import-Module "C:\Program Files (x86)\AWS Tools\PowerShell\AWSPowerShell\AWSPowerShell.psd1"
$database = "MyMongoDBName"
$mongohost = "localhost"
$backupPath = 'C:\Data\MongoRemBackup'
$timestamp = Get-Date -Format yyyyMMddHHmmss
# The file name doubles as the S3 object key
$archiveFileName = 'mymongo-' + $timestamp + '.gz'
$archivePath = Join-Path $backupPath $archiveFileName
$s3Bucket = 'mymongodbbackupdemo'
$region = 'ap-southeast-1'
$accessKey = '[Just for demonstration purpose]'
$secretKey = '[Just for demonstration purpose]'
# Dump the database into a single gzip-compressed archive file
& "C:\Program Files\MongoDB\Server\3.2\bin\mongodump.exe" --host $mongohost --db $database "--archive=$archivePath" --gzip
# Upload the archive to the S3 bucket
Write-S3Object -BucketName $s3Bucket -File $archivePath -Key $archiveFileName -Region $region -AccessKey $accessKey -SecretKey $secretKey
The last line there is the one that uploads the MongoDB backup, which is treated just like any other object, to the S3 bucket.
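The script keeps the access key and secret key in plain text purely for demonstration. One possible alternative, sketched here under assumptions, is to store the keys once in a named profile with Set-AWSCredential and have the script refer to that profile. The profile name MongoBackupUser and the placeholder key values are hypothetical; the other variables are the ones defined in the script.

```powershell
# One-time setup, run as the same Windows account that will run the scheduled task.
# 'MongoBackupUser' is a hypothetical profile name; substitute your real keys.
Set-AWSCredential -AccessKey 'YOUR_ACCESS_KEY_ID' -SecretKey 'YOUR_SECRET_ACCESS_KEY' -StoreAs 'MongoBackupUser'

# In the backup script, pass -ProfileName instead of -AccessKey and -SecretKey.
Write-S3Object -BucketName $s3Bucket -File $archivePath -Key $archiveFileName -Region $region -ProfileName 'MongoBackupUser'
```

With this in place, the $accessKey and $secretKey lines can be dropped from the script, so the .ps1 file no longer contains secrets.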
The last step is to set up a scheduled task to run the PowerShell script at the schedule or interval that you want. There are plenty of articles available that help you do it; here is a good one. If you don’t want to read up on it, tweak the following command, changing the script path and file name to match where you saved your PowerShell script, and run it on the command line:
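SCHTASKS /CREATE /SC DAILY /TN "MongoDB Backup and Upload to S3" /TR "powershell.exe -NoProfile -File C:\Scripts\mongoDbS3Backup.ps1" /ST 01:00
The command above runs the script through powershell.exe every day at 1 AM. Pointing /TR at the .ps1 file alone would typically open it in an editor rather than execute it.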
So now we have an automated, scheduled backup of your data that you can set and forget. However, what I have shown you here is not necessarily an architecture that I would recommend.
If you have an on-premises MongoDB, I would say, really consider migrating it onto a cloud-native NoSQL database like Amazon DynamoDB, which gives you a fully managed NoSQL database on AWS. Migration can be done using the AWS Database Migration Service (DMS). Check this article out.
Once you have your database in the cloud, perform backups in the cloud if needed. This article gives a good example of how to do so using serverless services.