A general toolset for backing up and restoring databases with random, filtered, or anonymized data (Mongo, Postgres, GCS).

Datacycle

Getting started

cp .env.example .env
vim .env
source .env
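
A note on `source .env`: it sets the variables in the current shell, but child processes such as `poetry run datacycle` only see them if they are exported. The sketch below is one way to export everything a file defines. It assumes the file holds plain KEY=value lines without `export`; the contents of `.env.example` are not shown here, and `load_env` is a hypothetical helper name.

```shell
#!/bin/sh
# Load KEY=value pairs from an env file and export them to child processes.
# Assumption: the file contains plain KEY=value lines (no `export` keyword).
load_env() {
    set -a      # auto-export every variable assigned from here on
    . "$1"      # read the file into the current shell
    set +a      # stop auto-exporting
}

# usage (path must contain a slash so `.` does not search PATH):
#   load_env ./.env
```

If `.env` already uses `export`, plain `source .env` is enough and this helper is unnecessary.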

poetry install --extras all
poetry run datacycle
docker build -f Dockerfile -t datacycle .
docker run -it --rm --env-file .env datacycle

Mac requirements

brew install mongodb/brew/mongodb-database-tools
brew install libpq
brew link --force libpq
npm install elasticdump -g

Linux requirements

apt install -y mongo-tools
apt install -y postgresql-client
npm install elasticdump -g
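
After installing, it can help to confirm that the external tools are actually on PATH. The following is a minimal sketch; the tool list is inferred from the install steps above and from the provider commands in this README, and `missing_tools` is a hypothetical helper, not part of datacycle. `datacycle doctor` may perform a similar check, but its behaviour is not described here.

```shell
#!/bin/sh
# Print each named tool that is NOT found on PATH, one per line.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Tool list is an assumption based on the requirements sections.
missing_tools mongodump mongorestore pg_dump psql elasticdump
```

An empty output means every listed tool was found.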

How to

datacycle --help
datacycle doctor

datacycle mongo "mongodb://user:password@localhost:27017/test1?authSource=admin" "mongodb://user:password@localhost:27017/test2?authSource=admin" --transform "
    transforms {
        test1 {
            before-transform {}
        }
    }
"

datacycle mongo "mongodb://user:password@localhost:27017/test1?authSource=admin" gs://datacycle-test/test1/snapshot --transform ops.hocon
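
The command above reads the transform from a file named `ops.hocon`. As a sketch, such a file could be created with the same body as the inline transform shown earlier. This assumes `--transform` accepts a file with the same HOCON content as the inline string; the key `test1` is copied from the inline example, and what it names is not stated in this README.

```shell
#!/bin/sh
# Write a transform file mirroring the inline example (an empty before-transform).
# Assumption: file and inline forms of --transform share the same syntax.
cat > ops.hocon <<'EOF'
transforms {
    test1 {
        before-transform {}
    }
}
EOF
```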

datacycle mongo "mongodb://user:password@localhost:27017/test1?authSource=admin" "mongodb://user:password@localhost:27017/test2?authSource=admin"
datacycle mongo "mongodb://user:password@localhost:27017/test1?authSource=admin" gs://datacycle-test/test1/snapshot
datacycle mongo "mongodb://user:password@localhost:27017/test1?authSource=admin" test1/snapshot

datacycle mongo gs://datacycle-test/test1/snapshot "mongodb://user:password@localhost:27017/test2?authSource=admin"
datacycle mongo gs://datacycle-test/test1/snapshot gs://datacycle-test/test2/snapshot
datacycle mongo gs://datacycle-test/test1/snapshot test2/snapshot

datacycle mongo test1/snapshot "mongodb://user:password@localhost:27017/test2?authSource=admin"
datacycle mongo test1/snapshot gs://datacycle-test/test2/snapshot
datacycle mongo test1/snapshot test2/snapshot
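
The examples above combine three kinds of endpoint as source and destination: a MongoDB URI, a `gs://` bucket path, and a local path. The sketch below shows how the three can be told apart by prefix. It only illustrates the pattern visible in the examples; `endpoint_kind` is a hypothetical helper and is not claimed to match datacycle's internal logic.

```shell
#!/bin/sh
# Classify an endpoint string by its prefix.
# Hypothetical helper for illustration; not part of datacycle.
endpoint_kind() {
    case "$1" in
        mongodb://*) echo mongo ;;
        gs://*)      echo gcs ;;
        *)           echo local ;;
    esac
}

endpoint_kind "mongodb://user:password@localhost:27017/test1?authSource=admin"  # prints "mongo"
endpoint_kind "gs://datacycle-test/test1/snapshot"                              # prints "gcs"
endpoint_kind "test1/snapshot"                                                  # prints "local"
```

Note that the MongoDB URI is quoted: it contains `?`, which some shells (zsh, for example) treat as a glob character.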

Providers

Postgres

https://www.postgresql.org/docs/9.1/backup.html

  • SQL dump
  • file system snapshot
  • continuous archiving

pg_dump --clean "postgres://user:password@localhost:5432/test" | gzip > dump.gz
gunzip -c dump.gz | psql "postgres://user:password@localhost:5432/test"
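
One caveat with the pipeline form: the shell reports only the exit status of the last command, so a failed `pg_dump` can still leave a valid-looking `dump.gz` behind. The sketch below dumps to a temporary file first and compresses only on success. `pg_backup` and `pg_restore_sql` are hypothetical helper names; `ON_ERROR_STOP` is a standard psql variable that makes the restore stop at the first SQL error.

```shell
#!/bin/sh
# Sketch: avoid masking a pg_dump failure behind gzip's exit status.

# usage: pg_backup <connection-uri> <output.gz>
pg_backup() {
    pgb_tmp=$(mktemp) || return 1
    if pg_dump --clean "$1" > "$pgb_tmp"; then
        gzip -c "$pgb_tmp" > "$2"
        pgb_status=$?
    else
        pgb_status=$?   # exit status of the failed pg_dump
        echo "pg_dump failed with status $pgb_status; $2 not written" >&2
    fi
    rm -f "$pgb_tmp"
    return "$pgb_status"
}

# usage: pg_restore_sql <connection-uri> <input.gz>
pg_restore_sql() {
    gunzip -c "$2" | psql -v ON_ERROR_STOP=1 "$1"
}
```

The trade-off is temporary disk space for the uncompressed dump; for large databases the streaming pipeline may still be preferable.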

Mongo

https://docs.mongodb.com/manual/core/backups/

  • BSON dump
  • file system snapshot
  • CDC

mongodump --uri="mongodb://user:password@localhost:27017/test?authSource=admin" --out=dump --numParallelCollections=10 -v --gzip
mongorestore --uri="mongodb://user:password@localhost:27017/test?authSource=admin" dump/test --numParallelCollections=10 -v --gzip

Elasticsearch

https://github.com/elasticsearch-dump/elasticsearch-dump

  • dump

elasticdump --input=https://localhost:9200 --output=$ --limit 2000 | gzip > dump.gz
