Lesson 17: Backup and Recovery Strategies
Welcome to Lesson 17! Now that you've learned about MongoDB administration, security, and maintenance, it's time to master one of the most critical aspects of database management: backup and recovery. In this lesson, you'll learn various strategies to protect your MongoDB data against loss and how to restore it when needed.
Learning Goals:
- Understand different MongoDB backup methods
- Implement mongodump and mongorestore for logical backups
- Configure filesystem snapshots for physical backups
- Set up point-in-time recovery with oplog
- Develop a comprehensive backup strategy
Understanding Backup Types
MongoDB supports two main types of backups: logical backups and physical backups.
Logical backups (mongodump/mongorestore) export data as BSON files with JSON metadata. They are flexible, since you can dump or restore a single database or collection, but they can be slow for large datasets.
Physical backups (filesystem snapshots) copy the underlying database files. They back up and restore faster, but the snapshot must capture the files in a consistent state.
For production systems, combine both approaches: use filesystem snapshots for regular backups and logical backups for selective data recovery.
Logical Backups with mongodump
The mongodump tool creates binary exports of your database content. Let's explore its practical usage.
Basic mongodump Usage
# Backup entire MongoDB instance
mongodump --uri="mongodb://localhost:27017" --out=/backup/$(date +%Y%m%d)
# Backup specific database
mongodump --uri="mongodb://localhost:27017/mydb" --out=/backup/mydb_backup
# Backup specific collection
mongodump --uri="mongodb://localhost:27017/mydb" --collection=users --out=/backup/selective
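With --out, mongodump writes one subdirectory per database and one .bson file per collection (.bson.gz when --gzip is used). As a minimal sketch, assuming that default layout, the helper below lists the namespaces a dump contains, which is handy before a selective restore. The name list_dump_namespaces is illustrative and not part of the MongoDB tools.

```shell
# Hypothetical helper: list the namespaces (db.collection) found in a dump directory.
# Assumes the default layout written by mongodump --out:
#   <dump_dir>/<db>/<collection>.bson   (or .bson.gz when --gzip was used)
# Collection names that mongodump had to escape in file names are printed as stored.
list_dump_namespaces() {
  dump_dir=$1
  for f in "$dump_dir"/*/*.bson "$dump_dir"/*/*.bson.gz; do
    [ -e "$f" ] || continue             # skip glob patterns that matched nothing
    db=$(basename "$(dirname "$f")")
    coll=$(basename "$f")
    coll=${coll%.gz}                    # drop the optional .gz suffix
    coll=${coll%.bson}                  # drop the .bson suffix
    echo "$db.$coll"
  done
}
```

For example, `list_dump_namespaces /backup/mydb_backup` prints one line per dumped collection.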
Advanced mongodump Options
# Backup with authentication (authSource names the database that holds the user)
mongodump --uri="mongodb://admin:password@localhost:27017/?authSource=admin"
# Backup with query filter (only specific documents)
mongodump --uri="mongodb://localhost:27017/mydb" --query='{"status": "active"}' --collection=users
# Compress backup output
mongodump --uri="mongodb://localhost:27017" --gzip --out=/backup/compressed
# Backup from remote Atlas cluster
mongodump --uri="mongodb+srv://username:password@cluster.mongodb.net/mydb" --out=/backup/atlas
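A password placed in --uri on the command line is visible in the process list and in shell history. One way around this, sketched below, is to put the connection string in a YAML file that only the backup user can read and pass it with --config. This assumes Database Tools 100.3.0 or later, which accept --config; the write_dump_config name and the file path are illustrative.

```shell
# Sketch: store the connection string in a file with owner-only permissions.
# write_dump_config is a hypothetical helper, not part of the MongoDB tools.
write_dump_config() {
  config_file=$1
  uri=$2
  # umask 077 in a subshell so a newly created file gets mode 600
  ( umask 077 && printf 'uri: %s\n' "$uri" > "$config_file" )
  chmod 600 "$config_file"              # also tighten a file that already existed
}

# Example usage (commented out so the sketch runs without a server):
# write_dump_config /etc/mongodb-backup.yaml "mongodb://admin:password@localhost:27017/?authSource=admin"
# mongodump --config=/etc/mongodb-backup.yaml --gzip --out=/backup/compressed
```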
Restoring Data with mongorestore
The mongorestore tool imports data created by mongodump. Here's how to use it effectively.
Basic Restoration
# Restore entire backup
mongorestore --uri="mongodb://localhost:27017" /backup/20231201
# Restore specific database (--nsInclude replaces the deprecated --db when restoring from a directory)
mongorestore --uri="mongodb://localhost:27017" --nsInclude="mydb.*" /backup/20231201
# Restore with drop (remove existing data first)
mongorestore --uri="mongodb://localhost:27017" --drop /backup/20231201
# Restore specific collection
mongorestore --uri="mongodb://localhost:27017/mydb" --collection=users /backup/20231201/mydb/users.bson
Selective and Partial Restoration
# Restore excluding specific collections
mongorestore --uri="mongodb://localhost:27017" --excludeCollection=logs --excludeCollection=temp_data /backup/20231201
# Restore with namespace mapping (change database name)
mongorestore --uri="mongodb://localhost:27017" --nsFrom="olddb.*" --nsTo="newdb.*" /backup/20231201
# Restore compressed backup
mongorestore --uri="mongodb://localhost:27017" --gzip /backup/compressed
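Before restoring a compressed dump, it can be worth confirming that the files were not truncated or damaged in storage. The sketch below is an illustrative pre-restore check that runs gzip's built-in integrity test over every .gz file in a dump directory. The function name verify_gzip_dump is an assumption for this example, and the check only covers the compression layer, not the BSON content.

```shell
# Hypothetical pre-restore check: every .gz file under a dump directory must pass gzip -t.
verify_gzip_dump() {
  dump_dir=$1
  count=$(find "$dump_dir" -type f -name '*.gz' | wc -l | tr -d ' ')
  if [ "$count" -eq 0 ]; then
    echo "no .gz files found in $dump_dir" >&2
    return 1
  fi
  # With the "+" form, find exits non-zero if any gzip -t invocation fails.
  if find "$dump_dir" -type f -name '*.gz' -exec gzip -t {} +; then
    echo "all $count compressed file(s) passed gzip -t"
  else
    echo "at least one file in $dump_dir failed gzip -t" >&2
    return 1
  fi
}
```

A typical use would be `verify_gzip_dump /backup/compressed && mongorestore --gzip /backup/compressed`.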
Filesystem Snapshots
For large databases, filesystem snapshots provide faster backup and restore operations. The approach varies by filesystem and deployment type.
LVM Snapshots (Linux)
#!/bin/bash
# Create LVM snapshot for MongoDB backup
DB_PATH="/var/lib/mongodb"
SNAPSHOT_SIZE="5G"
SNAPSHOT_NAME="mongodb-snapshot-$(date +%Y%m%d)"
BACKUP_DIR="/backup/mongodb"
# Connect to MongoDB and flush writes
mongosh --eval "db.fsyncLock()"
# Create LVM snapshot
lvcreate --size $SNAPSHOT_SIZE --snapshot --name $SNAPSHOT_NAME /dev/vg0/mongodb
# Unlock MongoDB writes
mongosh --eval "db.fsyncUnlock()"
# Mount snapshot and copy files
mkdir -p /mnt/snapshot
mount /dev/vg0/$SNAPSHOT_NAME /mnt/snapshot
rsync -av /mnt/snapshot/ $BACKUP_DIR/$SNAPSHOT_NAME/
# Cleanup
umount /mnt/snapshot
lvremove -f /dev/vg0/$SNAPSHOT_NAME
EBS Snapshots (AWS)
#!/bin/bash
# Script for EC2 instances with EBS volumes
INSTANCE_ID=$(curl -s http://169.254.169.254/latest/meta-data/instance-id)
VOLUME_ID=$(aws ec2 describe-instances --instance-ids $INSTANCE_ID \
--query 'Reservations[0].Instances[0].BlockDeviceMappings[?DeviceName==`/dev/xvdf`].Ebs.VolumeId' \
--output text)
# Flush MongoDB writes
mongosh --eval "db.fsyncLock()"
# Create EBS snapshot
SNAPSHOT_ID=$(aws ec2 create-snapshot \
--volume-id $VOLUME_ID \
--description "MongoDB backup $(date)" \
--query 'SnapshotId' --output text)
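# Unlock MongoDB as soon as the snapshot has been initiated; an EBS snapshot
# captures the volume as it was when create-snapshot returned
mongosh --eval "db.fsyncUnlock()"
# Wait for snapshot completion (writes are no longer blocked)
aws ec2 wait snapshot-completed --snapshot-ids $SNAPSHOT_ID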
echo "Snapshot created: $SNAPSHOT_ID"
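Use db.fsyncLock() and db.fsyncUnlock() around filesystem snapshots whenever the journal is disabled or lives on a different volume from the data files; without the lock, you risk capturing an inconsistent database state. If journaling is enabled and the journal shares a volume with the data files, the snapshot is recoverable without locking, although locking remains a safe default. Keep the locked window short, because writes block until db.fsyncUnlock() runs.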
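One risk with the lock is a script that exits between the lock and the unlock, leaving writes blocked. As a hedged sketch, the wrapper below locks, runs whatever snapshot command you pass it, and unlocks even if that command fails. The with_fsync_lock name and the MONGOSH override variable are assumptions made for this example.

```shell
# Sketch: run a snapshot command while MongoDB writes are locked, and always
# unlock afterwards. MONGOSH is an illustrative override so the wrapper can be
# tried with a stub instead of a real mongosh.
MONGOSH=${MONGOSH:-mongosh}

with_fsync_lock() {
  "$MONGOSH" --quiet --eval "db.fsyncLock()" || return 1
  status=0
  "$@" || status=$?                      # the snapshot command, e.g. lvcreate ...
  "$MONGOSH" --quiet --eval "db.fsyncUnlock()"
  return "$status"                       # report the snapshot command's result
}

# Example (commented out; needs a running mongod and LVM):
# with_fsync_lock lvcreate --size 5G --snapshot --name mongodb-snapshot /dev/vg0/mongodb
```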
Point-in-Time Recovery with Oplog
MongoDB's oplog enables point-in-time recovery: by replaying oplog entries on top of a base backup, you can restore to a specific moment rather than only to the time the backup started.
Understanding Oplog
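The oplog (operations log) is a special capped collection that records all data modifications on a replica set. To use point-in-time recovery:
- Create a base backup with mongodump
- Include the oplog with the --oplog flag
- Use the oplog to replay operations up to a specific timestamp
Note that --oplog works only against a replica set member and only for a full-instance dump, not for a single database or collection. It captures the operations that occur while the dump is running, so replay is limited to that window unless you also keep later oplog entries.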
Oplog Backup and Recovery
# Create base backup with oplog
mongodump --uri="mongodb://localhost:27017" --oplog --out=/backup/base_with_oplog
# Restore up to (not including) a point in time, given in seconds since the Unix epoch (1672531200 is 2023-01-01 00:00:00 UTC)
mongorestore --uri="mongodb://localhost:27017" \
--oplogReplay \
--oplogLimit "1672531200" \
/backup/base_with_oplog
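Epoch seconds are awkward to work out by hand during an incident. A minimal sketch, assuming GNU date (the -d option differs on BSD and macOS), converts a human-readable UTC time into a value you can pass to --oplogLimit. The to_oplog_limit name is illustrative.

```shell
# Sketch: convert a UTC date and time into epoch seconds for --oplogLimit.
# Relies on GNU date's -d option, which is an assumption about the platform.
to_oplog_limit() {
  date -u -d "$1" +%s
}

to_oplog_limit "2023-01-01 00:00:00"    # prints 1672531200
```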
Automated Backup Strategies
Let's create comprehensive backup scripts for different scenarios.
Daily Backup Script
#!/bin/bash
BACKUP_DIR="/backup/mongodb"
DATE=$(date +%Y%m%d)
RETENTION_DAYS=7
# Create daily backup with oplog; stop here if the dump fails so that
# older backups are not deleted after a failed run
mongodump --uri="mongodb://localhost:27017" \
--oplog \
--gzip \
--out="$BACKUP_DIR/daily_$DATE" || exit 1
# Clean up old backups (-maxdepth 1 keeps find from descending into directories it deletes)
find "$BACKUP_DIR" -maxdepth 1 -name "daily_*" -type d -mtime +$RETENTION_DAYS -exec rm -rf {} \;
echo "Daily backup completed: $BACKUP_DIR/daily_$DATE"
Weekly Full Backup
#!/bin/bash
BACKUP_DIR="/backup/mongodb"
DATE=$(date +%Y%m%d)
RETENTION_WEEKS=4
# Stop the balancer for a consistent backup (sharded clusters only: run against a mongos, and skip this step on a replica set)
mongosh --eval "sh.stopBalancer()"
# Full backup of all databases
mongodump --uri="mongodb://localhost:27017" \
--gzip \
--out="$BACKUP_DIR/full_$DATE"
# Restart balancer
mongosh --eval "sh.startBalancer()"
# Clean up old full backups
find "$BACKUP_DIR" -maxdepth 1 -name "full_*" -type d -mtime +$((RETENTION_WEEKS * 7)) -exec rm -rf {} \;
echo "Weekly full backup completed: $BACKUP_DIR/full_$DATE"
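Age-based cleanup with find has one weakness: if backups stop succeeding, it keeps deleting old ones until none are left. An alternative, sketched below, is to keep the newest N backups regardless of age. The prune_backups name is hypothetical, and the sketch assumes directory names such as daily_YYYYMMDD, which sort chronologically when sorted as text.

```shell
# Sketch: keep only the newest N backup directories for a given prefix.
# Assumes names like daily_20240101 and paths without newlines.
prune_backups() {
  dir=$1
  prefix=$2
  keep=$3
  total=$(ls -1d "$dir"/"$prefix"* 2>/dev/null | wc -l)
  remove=$((total - keep))
  [ "$remove" -gt 0 ] || return 0       # nothing to delete
  # Oldest names sort first, so the first $remove entries are the ones to drop.
  ls -1d "$dir"/"$prefix"* | sort | head -n "$remove" | while IFS= read -r old; do
    rm -rf -- "$old"
    echo "removed $old"
  done
}
```

Calling `prune_backups /backup/mongodb daily_ 7` after a successful dump would leave the seven most recent daily backups in place.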
MongoDB Atlas Backups
If you're using MongoDB Atlas, backup management is simplified but understanding the options is crucial.
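# List and create Atlas snapshots with the Atlas CLI
# (assumes the CLI is installed and authenticated; myCluster is a placeholder cluster name)
atlas backups snapshots list myCluster
atlas backups snapshots create myCluster --desc "on-demand snapshot"
# Restores are started from the Atlas UI, the Atlas CLI, or the Atlas Administration API
# Atlas provides scheduled snapshots and, where enabled, point-in-time recovery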
On clusters with backups enabled, MongoDB Atlas takes snapshots automatically and retains them according to the cluster's backup policy. You can also trigger on-demand snapshots and download them for local storage.
Testing Your Recovery Strategy
A backup is only useful if you can successfully restore from it. Regular recovery testing is essential.
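#!/bin/bash
# Test recovery on an isolated environment
TEST_DB="recovery_test"
TEST_URI="mongodb://localhost:27017/$TEST_DB"
BACKUP_DIR="/backup/latest_backup"
# Create test database and populate with sample data
# (the database is selected in the connection string rather than with "use",
# which is a shell command and may not behave as expected inside --eval)
mongosh "$TEST_URI" --quiet --eval "
db.test.insertMany([
{name: 'test1', value: 100},
{name: 'test2', value: 200}
]);
"
# Record the document count, then take a backup of the test data
BEFORE=$(mongosh "$TEST_URI" --quiet --eval "db.test.countDocuments()")
mongodump --db="$TEST_DB" --out="$BACKUP_DIR"
# Simulate data loss
mongosh "$TEST_URI" --quiet --eval "db.test.deleteMany({})"
# Verify data loss (expect 0)
mongosh "$TEST_URI" --quiet --eval "db.test.countDocuments()"
# Restore from backup
mongorestore --nsInclude="$TEST_DB.*" "$BACKUP_DIR"
# Verify recovery: the count should match the value recorded before the backup
AFTER=$(mongosh "$TEST_URI" --quiet --eval "db.test.countDocuments()")
if [ "$BEFORE" = "$AFTER" ]; then
echo "Recovery test passed: $AFTER documents restored"
else
echo "Recovery test FAILED: expected $BEFORE, found $AFTER" >&2
exit 1
fi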
Common Pitfalls
- Insufficient testing: Never assume backups work—regularly test restoration procedures
- No retention policy: Implement automated cleanup to prevent disk space exhaustion
- Missing oplog backups: Without oplog, you cannot perform point-in-time recovery
- Backup location: Store backups separately from production servers (3-2-1 rule: 3 copies, 2 media types, 1 offsite)
- Security neglect: Encrypt backup files containing sensitive data
- Sharded cluster complexity: Remember to back up config servers and stop the balancer during full backups
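Copies stored away from the production server are only useful if they arrive intact. As a rough sketch, assuming GNU coreutils sha256sum is available, the two helpers below write a checksum manifest next to a backup and verify a copy against it later. The function names and the SHA256SUMS file name are illustrative choices.

```shell
# Sketch: record and verify SHA-256 checksums for a backup directory.
# Assumes GNU coreutils sha256sum; names here are hypothetical.
write_manifest() {
  # Paths are stored relative to the backup directory so the manifest
  # still works after the directory is copied elsewhere.
  ( cd "$1" && find . -type f ! -name SHA256SUMS -exec sha256sum {} + > SHA256SUMS )
}

verify_manifest() {
  # Exits non-zero if any file is missing or its checksum differs.
  ( cd "$1" && sha256sum --quiet -c SHA256SUMS )
}
```

Run `write_manifest` right after the dump completes, then `verify_manifest` on the offsite copy before you rely on it.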
Summary
In this lesson, you've learned comprehensive MongoDB backup and recovery strategies:
- Logical backups using mongodump/mongorestore for flexibility
- Physical backups via filesystem snapshots for performance
- Point-in-time recovery leveraging MongoDB's oplog
- Automated strategies for different backup frequencies
- Recovery testing procedures to ensure backup reliability
Remember that a robust backup strategy combines multiple approaches and includes regular testing. Your backup solution should align with your organization's Recovery Time Objective (RTO) and Recovery Point Objective (RPO).
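To make the RPO idea concrete: with periodic backups and no oplog replay, the worst-case data loss is roughly one backup interval. The small worked example below compares that interval with a target RPO. The meets_rpo helper is purely illustrative, and both values are in minutes.

```shell
# Worked example: compare the backup interval (worst-case loss) with a target RPO.
# meets_rpo is a hypothetical helper; both arguments are minutes.
meets_rpo() {
  interval_min=$1
  rpo_min=$2
  if [ "$interval_min" -le "$rpo_min" ]; then
    echo "ok: worst-case loss ${interval_min} min <= RPO ${rpo_min} min"
  else
    echo "gap: worst-case loss ${interval_min} min > RPO ${rpo_min} min"
    return 1
  fi
}

# Daily backups (1440 min) against a 1-hour RPO: prints the "gap" line.
meets_rpo 1440 60 || true
```

A gap like this is the signal to back up more often or to add oplog-based point-in-time recovery.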
Quiz
1. What is the main advantage of using filesystem snapshots over mongodump for large databases?
- A) Better compression
- B) Faster backup and restore times
- C) More selective backup options
- D) Built-in encryption
2. Which command should you use before taking a filesystem snapshot to ensure data consistency?
- A) db.shutdownServer()
- B) db.fsyncLock()
- C) db.backup()
- D) db.flushAll()
3. What does the --oplog flag provide in mongodump?
- A) Better compression
- B) Point-in-time recovery capability
- C) Faster backup speed
- D) Automatic cleanup
4. Why is it important to regularly test backup restoration?
- A) To improve backup speed
- B) To verify backup integrity and procedure
- C) To reduce storage costs
- D) To comply with licensing requirements
5. In a sharded cluster, what additional step is crucial before taking a full backup?
- A) Stop all application connections
- B) Disable authentication
- C) Stop the balancer
- D) Increase oplog size
Answers:
1. B) Faster backup and restore times
2. B) db.fsyncLock()
3. B) Point-in-time recovery capability
4. B) To verify backup integrity and procedure
5. C) Stop the balancer