Set up the virtual environment

Create a virtual environment and install the dependencies:

python3 -m venv ./venv
source ./venv/bin/activate

pip install -r ./requirements.txt --upgrade
pip install -r ./requirements-dev.txt --upgrade

Generate schema files

python3 -m dharitrietl generate-schema --input-folder=${HOME}/drt-go-chain-tools/elasticreindexer/cmd/indices-creator/config/noKibana/ --output-folder=./schema

Quickstart

First, set the following environment variables:

export WORKSPACE=${HOME}/dharitri-etl
export INDEXER_URL=https://index.dharitri.org:443
export GCP_PROJECT_ID=dharitri-blockchain-etl
export BQ_DATASET=mainnet
export START_TIMESTAMP=1596117600
export END_TIMESTAMP=1687880000
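
Before planning tasks, it can help to confirm that the variables are set and that the timestamps cover the intended range. The following is a minimal sketch and not part of dharitrietl: the helper names `missing_variables` and `describe_range` are hypothetical, and it assumes both timestamps are Unix epoch seconds, which the values above suggest.

```python
from datetime import datetime, timezone

# Variable names taken from the export lines above.
REQUIRED = ["WORKSPACE", "INDEXER_URL", "GCP_PROJECT_ID", "BQ_DATASET",
            "START_TIMESTAMP", "END_TIMESTAMP"]


def missing_variables(env):
    """Return the names of required variables that are unset or empty."""
    return [name for name in REQUIRED if not env.get(name)]


def describe_range(start, end):
    """Render two epoch-second timestamps as UTC ISO-8601 strings.

    Assumption: timestamps are Unix epoch seconds.
    """
    if end <= start:
        raise ValueError("END_TIMESTAMP must be greater than START_TIMESTAMP")

    def as_utc(ts):
        return datetime.fromtimestamp(ts, tz=timezone.utc).isoformat()

    return as_utc(start), as_utc(end)
```

For example, `missing_variables(os.environ)` (after `import os`) lists anything still unset, and `describe_range(int(os.environ["START_TIMESTAMP"]), int(os.environ["END_TIMESTAMP"]))` shows the range in human-readable form.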

Then, plan the ETL tasks (this adds records to a Firestore database):

python3 -m dharitrietl plan-tasks-with-intervals --indexer-url=${INDEXER_URL} \
    --gcp-project-id=${GCP_PROJECT_ID} --bq-dataset=${BQ_DATASET} \
    --start-timestamp=${START_TIMESTAMP} --end-timestamp=${END_TIMESTAMP}

python3 -m dharitrietl plan-tasks-without-intervals --indexer-url=${INDEXER_URL} \
    --gcp-project-id=${GCP_PROJECT_ID} --bq-dataset=${BQ_DATASET}
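
To illustrate what planning "with intervals" might involve, the sketch below splits a time range into consecutive chunks, each of which could become one task. This is an assumption for illustration only: the interval length dharitrietl actually uses is not stated here, the one-day step (86400 seconds) is arbitrary, and `split_into_intervals` is a hypothetical helper, not a function of the package.

```python
def split_into_intervals(start, end, step):
    """Split [start, end) into consecutive (lo, hi) chunks of at most `step` seconds."""
    if step <= 0:
        raise ValueError("step must be positive")
    intervals = []
    lo = start
    while lo < end:
        # The last chunk is truncated so it never runs past `end`.
        hi = min(lo + step, end)
        intervals.append((lo, hi))
        lo = hi
    return intervals


# Hypothetical one-day step applied to the timestamps from the Quickstart.
ONE_DAY = 86400
tasks = split_into_intervals(1596117600, 1687880000, ONE_DAY)
```

Each chunk starts where the previous one ends, so the chunks cover the range without gaps or overlaps.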

Note: to remove all previously planned tasks, run the following commands:

firebase firestore:delete --project=${GCP_PROJECT_ID} --recursive tasks_with_interval
firebase firestore:delete --project=${GCP_PROJECT_ID} --recursive tasks_without_interval

Inspect the tasks:

python3 -m dharitrietl inspect-tasks --gcp-project-id=${GCP_PROJECT_ID}

Then, extract and load the data on worker machines:

python3 -m dharitrietl extract-with-intervals --workspace=${WORKSPACE} --gcp-project-id=${GCP_PROJECT_ID}
python3 -m dharitrietl extract-without-intervals --workspace=${WORKSPACE} --gcp-project-id=${GCP_PROJECT_ID}
python3 -m dharitrietl load-with-intervals --workspace=${WORKSPACE} --gcp-project-id=${GCP_PROJECT_ID} --schema-folder=./schema
python3 -m dharitrietl load-without-intervals --workspace=${WORKSPACE} --gcp-project-id=${GCP_PROJECT_ID} --schema-folder=./schema
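
If you would rather drive the four steps from a script, the command lines above can be assembled programmatically. This is a minimal sketch, assuming only the flags shown above; `build_command` is a hypothetical helper, and whether the steps should run one after another or side by side on different workers is not specified here.

```python
import sys

# Sub-commands in the order they are listed above.
STEPS = ["extract-with-intervals", "extract-without-intervals",
         "load-with-intervals", "load-without-intervals"]


def build_command(step, workspace, gcp_project_id, schema_folder="./schema"):
    """Build the argument list for one worker step."""
    argv = [sys.executable, "-m", "dharitrietl", step,
            f"--workspace={workspace}", f"--gcp-project-id={gcp_project_id}"]
    # Only the load steps take --schema-folder in the commands shown above.
    if step.startswith("load-"):
        argv.append(f"--schema-folder={schema_folder}")
    return argv
```

A step can then be started with `subprocess.run(build_command(step, workspace, project_id), check=True)`, which raises an error if the command exits with a non-zero status.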

From time to time, you may want to run inspect-tasks again to check the progress.
