Backup utility for MongoDB. Compatible with Azure, Amazon Web Services and Google Cloud Platform.

Project description

mongo-s3-archiver

mongo-s3-archiver is an automation-friendly wrapper around mongodump that lets you:

  • Export massive MongoDB collections (millions of docs) with server-side filtering (--query, --db, --collection).
  • Stream results into Gzip archives or folder dumps, then push them to AWS S3, Azure Blob Storage, or Google Cloud Storage through a single CLI.
  • Purge the backed-up documents in controllable batches to keep your live cluster lean.
  • Run unattended via Docker, cron, or CI/CD (GitHub workflow/publish pipeline included).
  • Integrate with notification channels (email, Telegram) to track every backup.

It is a fork of exesse/mongodump-s3 focused on reliability for long-running archival jobs, modern CI workflows, and safer defaults for production data.

About this fork
This repository is a maintained fork of exesse/mongodump-s3 by Vladislav I. Kulbatski.
Additional features and ongoing maintenance are provided by Hadi Koubeissy (123.hadikoubeissy@gmail.com).

Installation

Make sure that the original MongoDB Database Tools are installed. Please follow the instructions on the official page for platform-specific installation, and make sure the mongodump command is in your PATH.

pip install mongo-s3-archiver
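
To confirm that both mongodump and the newly installed archiver are reachable on your PATH, check their versions:

mongodump --version
mongo-s3-archiver --version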

Usage

mongo-s3-archiver can be used as a command-line tool or as a Docker service. There are three ways to pass parameters to the utility:

  • By setting environment variables
  • By passing an env file to the tool
  • By passing individual flags

Please refer to the sample.env example for all available environment options.
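
A rough sketch of such an env file, limited to variables documented on this page (connection-string and notification settings are omitted here; take their exact names from sample.env itself):

# AWS credentials and region for the upload target (placeholder values)
AWS_ACCESS_KEY_ID=AKIAEXAMPLE
AWS_SECRET_ACCESS_KEY=example-secret-key
AWS_REGION=us-west-2
# skip bucket listing/creation when the IAM user lacks s3:ListAllMyBuckets
MONGO_S3_SKIP_BUCKET_DISCOVERY=true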

Command line

$ mongo-s3-archiver --help
usage: mongo-s3-archiver <options>

Export the content of a running server into .bson files and uploads to provided S3 compatible storage. By default loads required settings from environment variables.

general options:
  -h, --help            print usage
  -v, --version         print the tool version and exit

output options:
  -b <S3 Bucket>, --bucket <S3 Bucket>
                        S3 bucket name for upload, defaults to 'mongodump'
  -o <folder>, --out <folder>
                        output directory, defaults to 'dump'

uri options:
  -u <uri>, --uri <uri>
                        mongodb uri connection string. See official description here https://docs.mongodb.com/manual/reference/connection-string

environmental options:
  -e <env-file>, --env <env-file>
                        path to file containing environmental variables

cloud storage options:
  --azure "<azure_storage_connection_string>"
                        connection string for storage account provided by Azure
  --aws "<aws_access_key_id=value> <aws_secret_access_key=value> <aws_region=value>"
                        AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_REGION properties provided by Amazon Web Services IAM. AWS_REGION defaults to 'us-west-2' if not specified
  --gcp "<google_application_credentials=value> <google_region=value>"
                        path to service account file and optional Google Cloud Region. GOOGLE_REGION defaults to 'us-multiregion' if not specified

S3 permissions tip:
- Set `MONGO_S3_SKIP_BUCKET_DISCOVERY=true` when your IAM user lacks `s3:ListAllMyBuckets`/bucket creation rights; uploads will proceed without listing or auto-creating the bucket (make sure the bucket already exists).

notification options:
  --email <user@example.com>
                        email address which to notify upon the result
  --smtp <mail-server.example.com>
                        SMTP relay server to use, defaults to 'localhost'
  --telegram "<telegram_token=value> <telegram_chat_id=value>"
                        Telegram API token and chat id to be used for notification. See more: https://core.telegram.org/bots/api

The legacy CLI entry point mongodump-s3 is still shipped for backward compatibility and maps to the same functionality.
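
A typical flag-driven invocation, using only options from the help output above (the URI, bucket name, credentials, and addresses are placeholders):

$ mongo-s3-archiver \
    --uri 'mongodb://backup_user:secret@mongo.example.net:27017/?authSource=admin' \
    --bucket nightly-mongo-backups \
    --out dump \
    --aws "aws_access_key_id=AKIAEXAMPLE aws_secret_access_key=example-secret-key aws_region=eu-west-1" \
    --email ops@example.com \
    --smtp mail.example.com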

Filtering and archiving dumps

  • Use --db, --collection and --query (JSON) to limit the scope of mongodump.
  • Pass --archive <file.gz> to stream the output into a compressed archive instead of a folder.
  • Use -j/--jobs to control how many collections mongodump processes in parallel. --no-gzip disables compression.

Example exporting a subset of documents into a gzipped archive:

python run_mongo_s3_archiver.py \
  --uri 'mongodb+srv://<username>:<password>@cluster.example.net/app_db?authSource=admin&tls=true' \
  --db app_db \
  --collection events \
  --query '{"timestamp": {"$lt": "2025-09-31"}}' \
  --archive data/pre_2025_09_31.gz \
  -j 10 \
  --delete-after-dump \
  --delete-batch-size 1000
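
The resulting file is a standard mongodump archive, so it can be restored with the stock MongoDB Database Tools rather than with this utility; for example (target URI and namespace are placeholders):

mongorestore --gzip --archive=data/pre_2025_10_01.gz \
    --nsInclude 'app_db.events' \
    --uri 'mongodb://localhost:27017'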

Post-dump cleanup

Set --delete-after-dump (or MONGO_PURGE_AFTER_DUMP=true) to remove the documents that matched the dump query once the upload finishes. Documents are deleted in batches (configurable through --delete-batch-size or MONGO_PURGE_BATCH_SIZE, defaults to 1000) to keep memory usage low even when purging millions of records. This feature requires that --db/--collection and a valid JSON query are provided.
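
The same cleanup can be configured through the environment instead of flags, for example in the env file passed with --env:

# purge dumped documents after a successful upload, 5000 per batch
MONGO_PURGE_AFTER_DUMP=true
MONGO_PURGE_BATCH_SIZE=5000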

Running without installing the package

To execute the CLI straight from the repository (helpful inside CI/CD pipelines), use the provided launcher script:

python run_mongo_s3_archiver.py --help
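
For scheduled runs from a checked-out copy of the repository, the launcher also works well from cron; the paths, schedule, and env-file location below are hypothetical:

# crontab entry: archive every night at 02:00
0 2 * * * cd /opt/mongo-s3-archiver && python run_mongo_s3_archiver.py --env /etc/mongo-backup.env >> /var/log/mongo-s3-archiver.log 2>&1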

Docker

sudo docker run --name mongo-s3-archiver [Optional: --env-file sample.env] hadikoub/mongo-s3-archiver:latest [Optional: startup flags]

If you need to pass a GCP service account key, mount the key file inside the container and point GOOGLE_APPLICATION_CREDENTIALS at the mounted path, e.g. GOOGLE_APPLICATION_CREDENTIALS=/mongodump/key.json as in the example below.

sudo docker run --name mongo-s3-archiver-gcp \
    --env-file sample.env \
    -v ~/dev.json:/mongodump/key.json:ro \
    hadikoub/mongo-s3-archiver:latest 

Download files

Download the file for your platform.

Source Distribution

mongo_s3_archiver-0.1.2.tar.gz (19.2 kB)

Uploaded Source

Built Distribution

mongo_s3_archiver-0.1.2-py3-none-any.whl (19.2 kB)

Uploaded Python 3

File details

Details for the file mongo_s3_archiver-0.1.2.tar.gz.

File metadata

  • Download URL: mongo_s3_archiver-0.1.2.tar.gz
  • Upload date:
  • Size: 19.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for mongo_s3_archiver-0.1.2.tar.gz

Algorithm     Hash digest
SHA256        d926468325f17341cfd51c29ad3601fe5b914122fda6d94ec7a2bec563febd60
MD5           f0145c0ab4c83fce777a622c5b93e157
BLAKE2b-256   c7634bd011edc42b7cfa4e3c0f9671fb75902662ec26c18489c562270b5983f9

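If you want to check a downloaded file against the digests above, a standard checksum tool is enough (the filename assumes the source distribution was downloaded into the current directory):

sha256sum mongo_s3_archiver-0.1.2.tar.gz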

File details

Details for the file mongo_s3_archiver-0.1.2-py3-none-any.whl.

File hashes

Hashes for mongo_s3_archiver-0.1.2-py3-none-any.whl

Algorithm     Hash digest
SHA256        cd42ecdc84e1a65bf18ef9b826481e98b269790989e9a20578c08149c680b3c3
MD5           b1f9795476b2502f392f11a6b679ce80
BLAKE2b-256   da9c68578755b719867124600018997a8b37e37adb82fc35392c7e8abaaf8768
