General toolset to back up and restore random, filtered, or anonymized data (Mongo/Postgres/GCS).

Datacycle

Getting started

cp .env.example .env
vim .env
source .env
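
A plain `source .env` only sets shell variables; unless each line uses `export`, child processes such as `poetry run datacycle` will not see them. A minimal sketch, assuming `.env` holds plain KEY=value lines (the form that `docker run --env-file` also reads); `load_env` is a hypothetical helper, not part of datacycle:

```shell
# Hypothetical helper: source an env file and export everything it assigns.
# Assumption: the file contains plain KEY=value lines without `export`.
load_env() {
    set -a                 # auto-export every variable assigned from here on
    . "${1:-.env}"         # same effect as `source .env`
    set +a                 # stop auto-exporting
}
```

Usage: `load_env .env` in place of `source .env`.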

poetry install --extras all
poetry run datacycle

docker build -f Dockerfile -t datacycle .
docker run -it --rm --env-file .env datacycle

Mac requirements

brew install mongodb/brew/mongodb-database-tools
brew install libpq
brew link --force libpq
npm install elasticdump -g

Linux requirements

apt install -y mongo-tools
apt install -y postgresql-client
npm install elasticdump -g

How to

datacycle --help
datacycle doctor

datacycle mongo "mongodb://user:password@localhost:27017/test1?authSource=admin" "mongodb://user:password@localhost:27017/test2?authSource=admin" --transform "
    transforms {
        test1 {
            before-transform {}
        }
    }
"

datacycle mongo "mongodb://user:password@localhost:27017/test1?authSource=admin" gs://datacycle-test/test1/snapshot --transform ops.hocon

datacycle mongo "mongodb://user:password@localhost:27017/test1?authSource=admin" "mongodb://user:password@localhost:27017/test2?authSource=admin"
datacycle mongo "mongodb://user:password@localhost:27017/test1?authSource=admin" gs://datacycle-test/test1/snapshot
datacycle mongo "mongodb://user:password@localhost:27017/test1?authSource=admin" test1/snapshot

datacycle mongo gs://datacycle-test/test1/snapshot "mongodb://user:password@localhost:27017/test2?authSource=admin"
datacycle mongo gs://datacycle-test/test1/snapshot gs://datacycle-test/test2/snapshot
datacycle mongo gs://datacycle-test/test1/snapshot test2/snapshot

datacycle mongo test1/snapshot "mongodb://user:password@localhost:27017/test2?authSource=admin"
datacycle mongo test1/snapshot gs://datacycle-test/test2/snapshot
datacycle mongo test1/snapshot test2/snapshot
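
The `--transform ops.hocon` example above takes a file instead of an inline string. A minimal sketch of creating such a file with a heredoc, assuming the file form accepts the same HOCON text as the inline form (the source does not confirm this):

```shell
# Assumption: a --transform file holds the same HOCON text as the inline string.
# The quoted 'EOF' keeps the shell from expanding anything inside the heredoc.
cat > ops.hocon <<'EOF'
transforms {
    test1 {
        before-transform {}
    }
}
EOF

# The file can then be passed in place of the inline string, e.g.:
# datacycle mongo "mongodb://user:password@localhost:27017/test1?authSource=admin" gs://datacycle-test/test1/snapshot --transform ops.hocon
```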

Providers

Postgres

https://www.postgresql.org/docs/9.1/backup.html

  • SQL dump
  • file system snapshot
  • continuous archiving

pg_dump --clean "postgres://user:password@localhost:5432/test" | gzip > dump.gz
gunzip -c dump.gz | psql "postgres://user:password@localhost:5432/test"
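
In a plain pipeline the exit status is that of `gzip`, so a failed `pg_dump` can still leave a valid-looking but incomplete dump.gz. A minimal sketch of a stricter variant, assuming bash or zsh (for `pipefail`); `pg_backup` is a hypothetical helper, not part of datacycle:

```shell
# Hypothetical helper: dump to a gzip file and fail if pg_dump itself fails.
# $1 = connection URI, $2 = output file. Requires bash or zsh for pipefail.
pg_backup() {
    (
        set -o pipefail                      # pipeline fails if any stage fails
        pg_dump --clean "$1" | gzip > "$2"
    ) || { rm -f "$2"; return 1; }           # do not keep a partial dump
}
```

Usage: `pg_backup "postgres://user:password@localhost:5432/test" dump.gz`.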

Mongo

https://docs.mongodb.com/manual/core/backups/

  • BSON dump
  • file system snapshot
  • CDC

mongodump --uri="mongodb://user:password@localhost:27017/test?authSource=admin" --out=dump --numParallelCollections=10 -v --gzip
mongorestore --uri="mongodb://user:password@localhost:27017/test?authSource=admin" dump/test --numParallelCollections=10 -v --gzip

Elasticsearch

https://github.com/elasticsearch-dump/elasticsearch-dump

  • dump

elasticdump --input=https://localhost:9200 --output=$ --limit 2000 | gzip > dump.gz
