Ethereum blockchain network metrics collection infrastructure with zero-gap guarantee (multi-chain support planned)

Gapless Network Data

Ethereum blockchain network metrics for financial time series forecasting.

Quick Start

import gapless_network_data as gmd

# Fetch latest blocks (recommended for live trading)
df = gmd.fetch_blocks(limit=1000)

# Compute block utilization (alpha feature #2)
df['utilization'] = df['gas_used'] / df['gas_limit']

# Date range query
df = gmd.fetch_blocks(start='2024-01-01', end='2024-01-31')

Installation

pip install gapless-network-data

AI Agent Discoverability

This package is optimized for Claude Code CLI and AI coding agents.

Probe Module

import gapless_network_data as gmd

# Get ranked features for ML pipelines
features = gmd.probe.get_alpha_features()
for f in features[:5]:
    print(f"{f.rank}. {f.name} ({f.importance})")

# Get protocol era boundaries
eras = gmd.probe.get_protocol_eras()

# Get setup workflow
workflow = gmd.probe.get_setup_workflow()

# Get example code
print(gmd.probe.get_quick_start())

Alpha Feature Rankings

AI agents should prioritize these features for ML pipelines:

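| Rank | Feature           | Importance | Formula              |
|------|-------------------|------------|----------------------|
| 1    | base_fee_per_gas  | critical   | raw                  |
| 2    | block_utilization | critical   | gas_used / gas_limit |
| 3    | transaction_count | high       | raw                  |
| 4    | timestamp         | high       | raw                  |
| 5    | number            | high       | raw                  |
| 6    | size              | medium     | raw                  |
| 7    | blob_gas_used     | medium     | raw (post-EIP-4844)  |
| 8    | excess_blob_gas   | low        | raw (post-EIP-4844)  |
| 9    | gas_limit         | low        | raw                  |
| 10   | gas_used          | low        | raw                  |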

Get rankings programmatically: gmd.probe.get_alpha_features()

Protocol Era Boundaries

Filter data appropriately based on protocol changes:

  • EIP-1559 (block 12,965,000, Aug 2021): base_fee_per_gas introduced
  • The Merge (block 15,537,394, Sep 2022): difficulty=0 forever
  • EIP-4844 (block 19,426,587, Mar 2024): blob_gas fields introduced

Get eras programmatically: gmd.probe.get_protocol_eras()
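
The boundaries above can also be applied directly to block numbers when tagging or filtering rows. The following is a minimal sketch, not the package's own implementation: the era labels and the `era_for_block` helper are hypothetical names chosen for illustration, and only the block heights come from the list above.

```python
from bisect import bisect_right

# Activation blocks as listed above; the labels are illustrative, not package names.
ERA_STARTS = [
    (0, "pre_eip1559"),
    (12_965_000, "eip1559"),
    (15_537_394, "post_merge"),
    (19_426_587, "post_eip4844"),
]
_START_BLOCKS = [start for start, _ in ERA_STARTS]


def era_for_block(number: int) -> str:
    """Return the label of the latest era whose starting block is <= number."""
    if number < 0:
        raise ValueError("block number must be non-negative")
    return ERA_STARTS[bisect_right(_START_BLOCKS, number) - 1][1]


print(era_for_block(12_964_999))  # pre_eip1559
print(era_for_block(15_537_394))  # post_merge
print(era_for_block(20_000_000))  # post_eip4844
```

Applied to a DataFrame, something like `df['number'].map(era_for_block)` would give a column that can be used to drop eras in which a feature such as base_fee_per_gas did not yet exist.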

API Reference

fetch_blocks()

gmd.fetch_blocks(
    start: str | None = None,     # ISO 8601 date
    end: str | None = None,       # ISO 8601 date
    limit: int | None = None,     # Max blocks
    include_deprecated: bool = False  # Include difficulty fields
) -> pd.DataFrame

Returns a pandas DataFrame with the following columns:

  • timestamp (datetime64[ns, UTC])
  • number (uint64)
  • gas_limit, gas_used, base_fee_per_gas, transaction_count, size (uint64)
  • blob_gas_used, excess_blob_gas (uint64, nullable)
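
Because the blob columns are nullable, rows from before EIP-4844 carry missing values rather than zeros. The sketch below uses a small synthetic DataFrame as a stand-in for a fetch_blocks() result. It assumes the nullable columns arrive as pandas' `UInt64` extension dtype, which the column list above suggests but does not state.

```python
import pandas as pd

# Synthetic stand-in for a fetch_blocks() result spanning the EIP-4844 boundary.
# Assumption: nullable columns use the pandas "UInt64" extension dtype.
df = pd.DataFrame({
    "number": pd.Series([19_426_586, 19_426_587, 19_426_588], dtype="uint64"),
    "gas_used": pd.Series([15_000_000, 12_000_000, 30_000_000], dtype="uint64"),
    "gas_limit": pd.Series([30_000_000, 30_000_000, 30_000_000], dtype="uint64"),
    "blob_gas_used": pd.Series([pd.NA, 131_072, 393_216], dtype="UInt64"),
})

# Flag rows that have blob data instead of silently filling missing values with 0.
df["has_blob_fields"] = df["blob_gas_used"].notna()

# Integer columns divide to a float ratio between 0 and 1.
df["utilization"] = df["gas_used"] / df["gas_limit"]

# Aggregations skip missing values by default.
mean_blob_gas = df["blob_gas_used"].mean()

print(df[["number", "has_blob_fields", "utilization"]])
print(mean_blob_gas)
```

Keeping the missing values explicit avoids treating "no blob market yet" as "zero blob usage" when training on data that straddles March 2024.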

Deprecated Fields

Excluded by default (use include_deprecated=True for pre-Merge analysis):

  • difficulty: Always 0 post-Merge (Sep 2022)
  • total_difficulty: Frozen post-Merge

Setup

Supply credentials through a .env file (simplest), Doppler (recommended for teams and production), or environment variables.

# Option 1: .env file (simplest for small teams)
# Create .env in your project root:
CLICKHOUSE_HOST_READONLY=<host>
CLICKHOUSE_USER_READONLY=<user>
CLICKHOUSE_PASSWORD_READONLY=<password>

# Option 2: Doppler (recommended for production)
doppler configure set token <token_from_1password>
doppler setup --project gapless-network-data --config prd

# Option 3: Environment variables
export CLICKHOUSE_HOST_READONLY=<host>
export CLICKHOUSE_USER_READONLY=<user>
export CLICKHOUSE_PASSWORD_READONLY=<password>

Get setup instructions: gmd.probe.get_setup_workflow()
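
Before the first query it can help to confirm that the three variables are actually visible to the Python process. This is a minimal sketch using only the standard library. `missing_credentials` is a hypothetical helper, not part of the package, and it inspects the process environment only; whether the package itself loads a .env file is not covered here.

```python
import os

# Variable names as listed in the setup options above.
REQUIRED_VARS = (
    "CLICKHOUSE_HOST_READONLY",
    "CLICKHOUSE_USER_READONLY",
    "CLICKHOUSE_PASSWORD_READONLY",
)


def missing_credentials(env=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


missing = missing_credentials()
if missing:
    print("Missing credentials:", ", ".join(missing))
else:
    print("All ClickHouse read-only credentials are set.")
```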

Data Coverage

  • Blocks: 23.87M Ethereum blocks (2015-2025)
  • Update frequency: Real-time (~12 second intervals)
  • Storage: ClickHouse Cloud (AWS)
  • Deduplication: Automatic via ReplacingMergeTree
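
A consumer can check the zero-gap property on any fetched range by confirming that block numbers are consecutive. The sketch below is an illustrative client-side check, not the project's own validation logic. `find_gaps` is a hypothetical helper that takes any iterable of block numbers.

```python
def find_gaps(block_numbers):
    """Return (first_missing, last_missing) ranges between consecutive block numbers."""
    ordered = sorted(set(block_numbers))  # tolerate duplicates and unsorted input
    gaps = []
    for prev, curr in zip(ordered, ordered[1:]):
        if curr - prev > 1:
            gaps.append((prev + 1, curr - 1))
    return gaps


print(find_gaps([100, 101, 102, 105, 106, 110]))  # [(103, 104), (107, 109)]
print(find_gaps([200, 201, 202]))                 # []
```

With a DataFrame from fetch_blocks(), the input would be `df['number'].tolist()`. An empty result means the fetched range has no missing blocks.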

Exceptions

All exceptions include structured context (timestamp, endpoint, HTTP status):

  • CredentialException: Credential resolution failed
  • DatabaseException: ClickHouse query failed
  • MempoolException: Base exception class
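
One common way to handle a failed query is to retry with backoff and re-raise once attempts run out. The sketch below is generic Python and makes no claim about which of the exceptions above are transient. `call_with_retry` and `TransientQueryError` are hypothetical names; in real use the `retry_on` argument would be whichever package exception the caller decides is worth retrying.

```python
import time


def call_with_retry(fn, retry_on, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call fn(); on an exception in retry_on, back off and retry up to `attempts` calls."""
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the original error
            sleep(base_delay * 2 ** (attempt - 1))  # 0.5s, 1s, 2s, ...


# Stand-in for a transient failure; this is not one of the package's classes.
class TransientQueryError(Exception):
    pass


calls = {"count": 0}


def flaky_fetch():
    """Fail twice, then succeed, to exercise the retry path."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise TransientQueryError("simulated failure")
    return "ok"


result = call_with_retry(flaky_fetch, retry_on=TransientQueryError, sleep=lambda s: None)
print(result, calls["count"])  # ok 3
```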

Feature Engineering Integration

Combine with OHLCV price data:

import gapless_crypto_data as gcd
import gapless_network_data as gmd

# Fetch both data sources
df_ohlcv = gcd.get_data(symbol="ETHUSDT", timeframe="1m", start_date="2024-01-01")
df_blocks = gmd.fetch_blocks(start="2024-01-01", end="2024-01-02")

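# Temporal alignment: forward-fill gives each bar the latest block at or before
# its timestamp, so no future block data leaks into a bar.
# reindex(method='ffill') requires a monotonic index, hence sort_index().
df_blocks_aligned = df_blocks.set_index('timestamp').sort_index().reindex(
    df_ohlcv.index, method='ffill'
)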

# Join and engineer features
df = df_ohlcv.join(df_blocks_aligned)
df['gas_pressure'] = df['base_fee_per_gas'] / df['base_fee_per_gas'].rolling(60).median()
df['block_utilization'] = df['gas_used'] / df['gas_limit']
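
An alternative way to do the same backward-looking alignment is `pandas.merge_asof`, which also accepts a `tolerance` so that a bar is not filled from a block that is too old. The sketch below runs on synthetic data only. The column values, the 5-minute tolerance, and the assumption that the bar timestamp marks the bar's open time are illustrative choices, not properties of either package.

```python
import pandas as pd

UTC_NS = "datetime64[ns, UTC]"  # force identical key dtypes on both sides

# Synthetic 1-minute bars keyed by bar timestamp.
bars = pd.DataFrame({
    "timestamp": pd.date_range("2024-01-01 00:00", periods=3, freq="1min", tz="UTC").astype(UTC_NS),
    "close": [2300.0, 2301.5, 2299.0],
})

# Synthetic blocks at irregular timestamps.
blocks = pd.DataFrame({
    "timestamp": pd.to_datetime(
        ["2023-12-31 23:59:47", "2024-01-01 00:00:11",
         "2024-01-01 00:00:59", "2024-01-01 00:01:35"],
        utc=True,
    ).astype(UTC_NS),
    "base_fee_per_gas": [10, 12, 11, 13],
})

# direction="backward" picks the last block at or before each bar timestamp.
# tolerance leaves a bar unmatched (NaN) if that block is more than 5 minutes old.
aligned = pd.merge_asof(
    bars.sort_values("timestamp"),
    blocks.sort_values("timestamp"),
    on="timestamp",
    direction="backward",
    tolerance=pd.Timedelta(minutes=5),
).set_index("timestamp")

print(aligned["base_fee_per_gas"].tolist())  # [10, 11, 13]
```

Unlike an unbounded forward-fill, the tolerance makes a stalled block feed show up as missing values instead of silently repeating the last known block.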

Infrastructure (Reference)

Dual-pipeline architecture for production reliability:

| Component           | Purpose                          | Technology      |
|---------------------|----------------------------------|-----------------|
| BigQuery Sync       | Hourly batch from public dataset | Cloud Run Job   |
| Real-Time Collector | Block-level streaming            | e2-micro VM     |
| Database            | Storage with deduplication       | ClickHouse Cloud |
| Monitoring          | Dead Man's Switch                | Healthchecks.io |
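
With two pipelines writing to one table, the same block can arrive twice, which is why the storage layer deduplicates. The sketch below imitates that idea in plain Python as a conceptual illustration only. It assumes the deduplication key is the block number and that the most recently written row wins, neither of which is stated in the table above, and `merge_pipelines` is a hypothetical helper.

```python
def merge_pipelines(batch_rows, realtime_rows):
    """Keep one row per block number; rows listed later overwrite earlier ones."""
    latest = {}
    for row in list(batch_rows) + list(realtime_rows):
        latest[row["number"]] = row  # assumed key: block number
    return [latest[number] for number in sorted(latest)]


batch = [
    {"number": 100, "source": "bigquery"},
    {"number": 101, "source": "bigquery"},
]
realtime = [
    {"number": 101, "source": "realtime"},
    {"number": 102, "source": "realtime"},
]

for row in merge_pipelines(batch, realtime):
    print(row["number"], row["source"])
# 100 bigquery
# 101 realtime
# 102 realtime
```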

Related Projects

Documentation

License

MIT License

Download files

Source Distribution

gapless_network_data-4.5.0.tar.gz (946.4 kB)

Built Distribution

gapless_network_data-4.5.0-py3-none-any.whl (20.1 kB)
