Rust-powered blockchain analytics with GPU acceleration

ChainSwarm Analyzers Baseline

Baseline analytics algorithms for blockchain pattern detection and feature engineering.

This package provides the official baseline implementation for the ChainSwarm Analytics Tournament.

Overview

chainswarm-analyzers-baseline extracts core analytical algorithms from analytics-pipeline/packages/analyzers/ and provides:

  1. Feature Computation - 70+ features per address including volume, graph, temporal, and behavioral metrics
  2. Pattern Detection - 7 pattern types: cycles, layering paths, smurfing networks, motifs, proximity risk, temporal bursts, and threshold evasion

Installation

pip install chainswarm-analyzers-baseline

From Source

cd analyzers-baseline
pip install -e .

Quick Start

Production Usage (ClickHouse)

from chainswarm_core.db import ClientFactory, get_connection_params
from chainswarm_analyzers_baseline import BaselineAnalyzersPipeline
from chainswarm_analyzers_baseline.adapters import ClickHouseAdapter

# Get connection params from environment (uses CLICKHOUSE_* env vars)
connection_params = get_connection_params(
    network="torus",
    database_prefix="analytics"
)

# Create client using chainswarm-core
factory = ClientFactory(connection_params)
client = factory.create_client()

# Create adapter with ClickHouse client
adapter = ClickHouseAdapter(client=client, network="torus")

# Run pipeline
pipeline = BaselineAnalyzersPipeline(adapter=adapter)
result = pipeline.run(
    start_timestamp_ms=1700000000000,
    end_timestamp_ms=1702600000000,
    window_days=30,
    processing_date="2025-01-15",
    network="torus"
)

Tournament Testing (Parquet)

from chainswarm_analyzers_baseline import BaselineAnalyzersPipeline
from chainswarm_analyzers_baseline.adapters import ParquetAdapter

# Create adapter with file paths
adapter = ParquetAdapter(
    input_path="./input",
    output_path="./output"
)

# Run pipeline
pipeline = BaselineAnalyzersPipeline(adapter=adapter)
result = pipeline.run(
    start_timestamp_ms=1700000000000,
    end_timestamp_ms=1702600000000,
    window_days=30,
    processing_date="2025-01-15",
    network="torus"
)

CLI Usage

The CLI auto-extracts metadata (network, date, window) from the input path structure:

data/input/{network}/{processing_date}/{window_days}/
# Run full pipeline - metadata auto-extracted from path
run-pipeline --input data/input/torus/2025-01-15/30

# Output auto-constructed as data/output/torus/2025-01-15/30/

# Override extracted values if needed
run-pipeline \
    --input data/input/torus/2025-01-15/30 \
    --output ./custom-output \
    --network bittensor  # Override network

# Run full pipeline (ClickHouse mode - uses CLICKHOUSE_* env vars)
run-pipeline \
    --clickhouse \
    --network torus \
    --window-days 30 \
    --processing-date 2025-01-15

# Run features only
run-features --input data/input/torus/2025-01-15/30

# Run patterns only
run-patterns --input data/input/torus/2025-01-15/30
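The path convention the CLI relies on is simple: the last three path components are network, processing_date, and window_days. This helper is an illustrative re-implementation of that convention, not the package's actual parser.

```python
# Sketch: derive CLI metadata from the input path's last three components.
from pathlib import Path

def extract_metadata(input_path: str) -> dict:
    # .../{network}/{processing_date}/{window_days}
    network, processing_date, window_days = Path(input_path).parts[-3:]
    return {
        "network": network,
        "processing_date": processing_date,
        "window_days": int(window_days),
    }
```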

Package Structure

chainswarm_analyzers_baseline/
├── protocols/      # Abstract interfaces (Python Protocols)
├── features/       # Feature computation implementations
├── patterns/       # Pattern detection implementations
├── adapters/       # I/O adapters (Parquet, ClickHouse)
├── graph/          # Graph building utilities
├── pipeline/       # Production pipeline
├── config/         # Configuration management
└── scripts/        # Script entry points

Data Directory Structure

For Parquet mode, the path structure encodes metadata:

data/
├── input/{network}/{processing_date}/{window_days}/
│   ├── transfers.parquet           # Required for pattern detection
│   ├── money_flows.parquet         # Required (pre-aggregated edge data)
│   ├── assets.parquet              # Optional
│   ├── asset_prices.parquet        # Optional
│   └── address_labels.parquet      # Optional
└── output/{network}/{processing_date}/{window_days}/
    ├── features.parquet
    ├── patterns_cycle.parquet
    ├── patterns_layering.parquet
    └── ...

Note: Both ParquetAdapter and ClickHouseAdapter support full pattern detection including temporal burst analysis (added in v0.2.2).

Data Schemas

All data schemas match data-pipeline core tables for compatibility.

Input Files (Parquet Mode)

transfers.parquet

Balance transfer data matching core_transfers schema:

Column           Type            Description
tx_id            String          Transaction hash (EVM/Substrate/UTXO)
event_index      String          Event index within transaction
edge_index       String          Edge disambiguator (UTXO)
block_height     UInt32          Block number
block_timestamp  UInt64          Milliseconds since epoch
from_address     String          Source address
to_address       String          Destination address
asset_symbol     String          Asset symbol (TAO, USDT, etc.)
asset_contract   String          Contract address or 'native'
amount           Decimal128(18)  Native token amount
amount_usd       Decimal128(18)  USD value at transaction time
fee              Decimal128(18)  Transaction fee

Output Files

File                        Description
features.parquet            70+ computed features per address
patterns_cycle.parquet      Cycle patterns
patterns_layering.parquet   Layering path patterns
patterns_network.parquet    Smurfing network patterns
patterns_proximity.parquet  Proximity risk patterns
patterns_motif.parquet      Fan-in/fan-out motif patterns
patterns_burst.parquet      Temporal burst patterns
patterns_threshold.parquet  Threshold evasion patterns

Pattern Types

Pattern types use lowercase values from chainswarm_core.constants.PatternTypes:

Pattern Type       Value              Description
Cycle              cycle              Circular transaction patterns
Layering Path      layering_path      Long transaction chains
Smurfing Network   smurfing_network   Fragmented value transfers
Motif Fan-In       motif_fanin        Many-to-one patterns
Motif Fan-Out      motif_fanout       One-to-many patterns
Proximity Risk     proximity_risk     Distance to risky addresses
Temporal Burst     temporal_burst     High-frequency activity
Threshold Evasion  threshold_evasion  Structuring below limits

Requirements

  • Python >= 3.13
  • chainswarm-core >= 0.1.13 (provides clickhouse-connect, loguru, pydantic)
  • networkx >= 3.0
  • numpy >= 1.24
  • pandas >= 2.0
  • pyarrow >= 14.0
  • click >= 8.0

License

Apache-2.0

Download files

Source distribution:

  • chainswarm_analyzers_baseline-0.3.1.tar.gz (8.0 MB)

Built distributions (CPython 3.12+, abi3):

  • chainswarm_analyzers_baseline-0.3.1-cp312-abi3-win_amd64.whl (2.5 MB) - Windows x86-64
  • chainswarm_analyzers_baseline-0.3.1-cp312-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (2.4 MB) - Linux glibc 2.17+ x86-64
  • chainswarm_analyzers_baseline-0.3.1-cp312-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (2.2 MB) - Linux glibc 2.17+ ARM64
  • chainswarm_analyzers_baseline-0.3.1-cp312-abi3-macosx_11_0_arm64.whl (2.1 MB) - macOS 11.0+ ARM64
  • chainswarm_analyzers_baseline-0.3.1-cp312-abi3-macosx_10_12_x86_64.whl (2.3 MB) - macOS 10.12+ x86-64

