A collection of blockchain data pipelines built with cherry

cherry-pipelines

This is a collection of pipelines that are built using cherry and ClickHouse materialized views.

All data is stored in ClickHouse.

Python version

This project is meant to be run with Python 3.12.

If you are using uv for development, it should pick this version up automatically from the .python-version file in the project root.

The docker image is configured to use this version of Python as well.

Running a pipeline

Use the main script to run a pipeline:

uv run scripts/main.py

It takes these parameters as environment variables:

  • CHERRY_PIPELINE_KIND, "evm" or "svm".
  • CHERRY_PIPELINE_NAME, name of the pipeline to run, e.g. "erc20_transfers".
  • CHERRY_FROM_BLOCK, the block that indexing should start from. Defaults to 0.
  • CHERRY_TO_BLOCK, the block that indexing should stop at. Has no default. If left empty, indexing waits for new blocks once it reaches the tip of the chain.
  • CHERRY_EVM_PROVIDER_KIND, which provider to use when indexing evm chains. Can be hypersync or sqd. Has no default and is required when indexing evm.
  • CHERRY_EVM_CHAIN_ID, the chain_id of the evm chain to index. Has no default and is required when indexing evm.
  • CHERRY_PROVIDER_BUFFER_SIZE, the amount of buffering between the ingestion, processing, and writer stages. Increasing this parameter might improve performance but can also cause higher memory usage. Defaults to 2.
  • CLICKHOUSE_HOST, defaults to 127.0.0.1.
  • CLICKHOUSE_PORT, defaults to 8123.
  • CLICKHOUSE_USER, defaults to default.
  • CLICKHOUSE_PASSWORD, defaults to empty string.
  • RUST_LOG, as explained in env-logger docs.
  • PY_LOG, as explained in python logging docs. Defaults to "INFO".

An .env file placed in the project root can be used to define these for development.
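The defaults and requirements listed above can be sketched as a small loader. This is a minimal illustration, not the code in scripts/main.py: the names PipelineConfig and load_config are hypothetical, and treating CHERRY_PIPELINE_KIND and CHERRY_PIPELINE_NAME as required is an assumption, since the list does not state a default for them.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class PipelineConfig:
    """Hypothetical container for the documented environment variables."""
    kind: str
    name: str
    from_block: int
    to_block: Optional[int]
    evm_provider_kind: Optional[str]
    evm_chain_id: Optional[int]
    provider_buffer_size: int
    clickhouse_host: str
    clickhouse_port: int
    clickhouse_user: str
    clickhouse_password: str
    py_log: str


def _opt(env: Mapping[str, str], key: str) -> Optional[str]:
    """Return the stripped value, or None when the variable is unset or empty."""
    value = env.get(key, "").strip()
    return value or None


def load_config(env: Mapping[str, str] = os.environ) -> PipelineConfig:
    kind = _opt(env, "CHERRY_PIPELINE_KIND")
    if kind not in ("evm", "svm"):
        raise ValueError('CHERRY_PIPELINE_KIND must be "evm" or "svm"')

    # Assumption: a pipeline name is always required.
    name = _opt(env, "CHERRY_PIPELINE_NAME")
    if name is None:
        raise ValueError("CHERRY_PIPELINE_NAME is required")

    provider_kind = _opt(env, "CHERRY_EVM_PROVIDER_KIND")
    chain_id = _opt(env, "CHERRY_EVM_CHAIN_ID")
    if kind == "evm":
        # Both variables have no default and are required when indexing evm.
        if provider_kind not in ("hypersync", "sqd"):
            raise ValueError('CHERRY_EVM_PROVIDER_KIND must be "hypersync" or "sqd"')
        if chain_id is None:
            raise ValueError("CHERRY_EVM_CHAIN_ID is required when indexing evm")

    to_block = _opt(env, "CHERRY_TO_BLOCK")
    return PipelineConfig(
        kind=kind,
        name=name,
        from_block=int(_opt(env, "CHERRY_FROM_BLOCK") or "0"),
        # None means: keep waiting for new blocks at the tip of the chain.
        to_block=int(to_block) if to_block is not None else None,
        evm_provider_kind=provider_kind,
        evm_chain_id=int(chain_id) if chain_id is not None else None,
        provider_buffer_size=int(_opt(env, "CHERRY_PROVIDER_BUFFER_SIZE") or "2"),
        clickhouse_host=_opt(env, "CLICKHOUSE_HOST") or "127.0.0.1",
        clickhouse_port=int(_opt(env, "CLICKHOUSE_PORT") or "8123"),
        clickhouse_user=_opt(env, "CLICKHOUSE_USER") or "default",
        clickhouse_password=env.get("CLICKHOUSE_PASSWORD", ""),
        py_log=_opt(env, "PY_LOG") or "INFO",
    )
```

Passing a plain dict instead of os.environ makes it easy to check a set of values before exporting them, for example load_config({"CHERRY_PIPELINE_KIND": "svm", "CHERRY_PIPELINE_NAME": "my_pipeline"}), where "my_pipeline" is a placeholder name.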

Running with docker

We publish a docker image that runs the main script.

Dev Setup

Run the docker-compose file to start a ClickHouse instance for development.

docker-compose up -d

Run this to stop the container and delete the data on disk:

docker-compose down -v

And this to stop the container without deleting the data:

docker-compose down

Development

This repo uses uv for development.

  • Format the code with uv run ruff format
  • Lint the code with uv run ruff check
  • Run type checks with uv run pyright
  • Run the tests with uv run pytest

Data Provider

All svm pipelines use SQD.

All evm pipelines are configurable using the CHERRY_EVM_PROVIDER_KIND env variable.

Materialized Views

Materialized views are defined in SQL files with an accompanying script that deploys them.
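As a rough sketch of what deploying views from SQL files can look like: the function below is hypothetical and is not the script shipped in this repo. It assumes each .sql file in a directory holds a single statement, and it takes the execution step as a callable so that no particular ClickHouse client is implied.

```python
from pathlib import Path
from typing import Callable, List


def deploy_views(sql_dir: Path, execute: Callable[[str], None]) -> List[str]:
    """Run every non-empty .sql file in sql_dir through `execute`.

    Files are processed in sorted name order so the deployment order is
    deterministic. Returns the names of the files that were executed.
    """
    deployed: List[str] = []
    for path in sorted(sql_dir.glob("*.sql")):
        statement = path.read_text(encoding="utf-8").strip()
        if not statement:
            continue  # skip empty files instead of sending an empty query
        execute(statement)
        deployed.append(path.name)
    return deployed
```

The execute argument can be any function that sends one SQL string to ClickHouse. During development, passing print shows what would be deployed without touching the database.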

EVM multi-chain structure

The evm pipelines are multi-chain and index multiple blockchains in parallel.

Each chain is written to its own tables. For example, erc20 transfers would be stored in a table named erc20_chain1 for ethereum and erc20_chain10 for optimism.

Specify the CHERRY_EVM_CHAIN_ID env variable to set the chain you want to index when indexing evm.
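The per-chain naming can be illustrated with a small helper. The function name chain_table_name is hypothetical, and the sketch assumes every per-chain table follows the same `<base>_chain<chain_id>` pattern as the erc20 example.

```python
def chain_table_name(base: str, chain_id: int) -> str:
    """Build a per-chain table name such as erc20_chain1.

    Assumes the `<base>_chain<chain_id>` pattern from the erc20 example.
    """
    if chain_id <= 0:
        raise ValueError("chain_id must be a positive integer")
    return f"{base}_chain{chain_id}"


# ethereum has chain_id 1 and optimism has chain_id 10
print(chain_table_name("erc20", 1))   # erc20_chain1
print(chain_table_name("erc20", 10))  # erc20_chain10
```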

License

Licensed under either of

at your option.

Contribution

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.
