Cherry Event Indexer

A flexible blockchain event indexing and data processing pipeline.

Overview

Cherry Event Indexer is a modular system for:

  • Ingesting blockchain events and logs
  • Processing and transforming blockchain data
  • Writing data to various storage backends

Features

  • Modular Pipeline Architecture

    • Configurable data providers
    • Customizable processing steps
    • Pluggable storage backends
  • Built-in Steps

    • EVM block validation
    • Event decoding
    • Custom processing steps
  • Storage Options

    • Local Parquet files
    • AWS S3
    • More coming soon...
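The modular layout listed above (provider, processing steps, storage backend) can be pictured with a small self-contained sketch. Every name in it (`Writer`, `InMemoryWriter`, `fake_provider`, `keep_transfers`, `run_pipeline`) is hypothetical and not taken from the Cherry codebase, and plain Python dicts stand in for real blockchain logs.

```python
from typing import Callable, Dict, Iterable, List, Protocol

Row = Dict[str, object]
Batch = List[Row]
Step = Callable[[Batch], Batch]


class Writer(Protocol):
    """Anything with a matching write() method can act as a storage backend."""

    def write(self, table: str, batch: Batch) -> None: ...


class InMemoryWriter:
    """Toy backend that keeps rows in a dict instead of Parquet or S3."""

    def __init__(self) -> None:
        self.tables: Dict[str, Batch] = {}

    def write(self, table: str, batch: Batch) -> None:
        self.tables.setdefault(table, []).extend(batch)


def fake_provider() -> Iterable[Batch]:
    """Stand-in data provider yielding two small batches of log rows."""
    yield [{"block_number": 1, "event": "Transfer"},
           {"block_number": 1, "event": "Approval"}]
    yield [{"block_number": 2, "event": "Transfer"}]


def keep_transfers(batch: Batch) -> Batch:
    """Example processing step: keep only Transfer events."""
    return [row for row in batch if row["event"] == "Transfer"]


def run_pipeline(provider: Iterable[Batch], steps: List[Step],
                 writer: Writer, table: str) -> None:
    """Pull batches from the provider, apply each step in order, then write."""
    for batch in provider:
        for step in steps:
            batch = step(batch)
        writer.write(table, batch)


writer = InMemoryWriter()
run_pipeline(fake_provider(), [keep_transfers], writer, "logs")
print(writer.tables["logs"])
```

Because the pipeline only depends on the `write()` method, swapping the in-memory backend for a file or object-store backend would not require changes to the provider or the steps.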

Project Structure

cherry/
├── src/
│   ├── config/     # Configuration parsing
│   ├── utils/      # Pipeline and utilities
│   └── writers/    # Storage backends
├── examples/       # Example implementations
├── tests/          # Test suite
└── config.yaml     # Pipeline configuration

Prerequisites

  • Python 3.10 or higher
  • Docker and Docker Compose
  • MinIO (for local S3-compatible storage)
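The prerequisites can be checked quickly before installing. The helper below is a minimal sketch and not part of Cherry: `check_prerequisites` is a hypothetical name, and it only looks for the `docker` and `docker-compose` executables on `PATH`, so a Compose installation that is available only as a `docker compose` plugin would be reported as missing.

```python
import shutil
import sys
from typing import Dict, Tuple


def check_prerequisites(min_python: Tuple[int, int] = (3, 10)) -> Dict[str, bool]:
    """Report which prerequisites appear to be available on this machine."""
    return {
        # Compare the running interpreter against the minimum version.
        "python": sys.version_info >= min_python,
        # shutil.which returns None when the executable is not on PATH.
        "docker": shutil.which("docker") is not None,
        "docker-compose": shutil.which("docker-compose") is not None,
    }


for name, ok in check_prerequisites().items():
    print(f"{name}: {'ok' if ok else 'missing'}")
```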

Installation Steps

Clone the repository and go to the project root:

git clone https://github.com/steelcake/cherry.git
cd cherry

Create and activate a virtual environment:

# Create virtual environment (all platforms)
python -m venv .venv

# Activate virtual environment

# For Windows (Git Bash):
source .venv/Scripts/activate

# For macOS/Linux:
source .venv/bin/activate

Install dependencies:

pip install -r requirements.txt

Set up environment variables:

Create a .env file in the project root, then add your Hypersync API token to it.
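A .env file is a plain list of KEY=VALUE lines. This section does not show the variable name Cherry expects or how the project loads the file, so the sketch below only illustrates the general mechanism: `parse_env_file` is a hypothetical helper and `EXAMPLE_API_TOKEN` is a placeholder name, not the real setting.

```python
import os
import tempfile
from pathlib import Path
from typing import Dict


def parse_env_file(path: Path) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: Dict[str, str] = {}
    for line in path.read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


# Demo with a throwaway file; EXAMPLE_API_TOKEN is a placeholder name.
with tempfile.TemporaryDirectory() as tmp:
    env_path = Path(tmp) / ".env"
    env_path.write_text("# local secrets\nEXAMPLE_API_TOKEN=abc123\n")
    loaded = parse_env_file(env_path)
    os.environ.update(loaded)

print(os.environ["EXAMPLE_API_TOKEN"])
```

Keeping the token in .env rather than in config.yaml makes it easier to keep the secret out of version control.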

Quick Start

  1. Create a config file (config.yaml):

  2. Run the script:

python main.py

Custom Processing Steps

To add a custom processing step, you need to:

  1. Define the step function
  2. Add the step to the context
  3. Add the step to the config

Example: get_block_number_stats.py

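from typing import Any, Dict
import pyarrow as pa

def get_block_number_stats(data: Dict[str, pa.RecordBatch], step_config: Dict[str, Any]) -> Dict[str, pa.RecordBatch]:
    """Custom processing step that computes block number statistics"""
    return data  # placeholder: pass the input tables through unchanged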

config.yaml

steps:
  - name: my_get_block_number_stats
    kind: get_block_number_stats
    config:
      input_table: logs
      output_table: block_number_stats

Running the Project

Start MinIO server (for local S3 storage):

# Navigate to docker-compose directory
cd docker-compose

# Start MinIO using docker-compose
docker-compose up -d

# Return to project root
cd ..

Default credentials:

Access Key: minioadmin
Secret Key: minioadmin
Console URL: http://localhost:9001

Note: The MinIO service will be automatically configured with the correct ports and volumes as defined in the docker-compose.yml file.
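Before running the indexer against local S3 storage, it can help to confirm that the MinIO container is accepting connections. The check below is a minimal sketch using only the standard library; `is_port_open` is a hypothetical helper, and it tests TCP reachability only, not credentials or bucket setup.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False


# 9001 is the console port listed under the default credentials.
print("MinIO console reachable:", is_port_open("localhost", 9001))
```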

Configure pipelines:

  • Open config.yaml
  • Adjust query, event filters, and batch sizes as needed for your pipeline
  • Configure writer settings (S3, local Parquet, etc.)

Run the indexer:

python main.py

License

Licensed under either of

at your option.

Contribution

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.
