Rust-powered blockchain analytics with GPU acceleration
ChainSwarm Analyzers Baseline
Baseline analytics algorithms for blockchain pattern detection and feature engineering.
This package provides the official baseline implementation for the ChainSwarm Analytics Tournament.
Overview
chainswarm-analyzers-baseline extracts core analytical algorithms from analytics-pipeline/packages/analyzers/ and provides:
- Feature Computation - 70+ features per address including volume, graph, temporal, and behavioral metrics
- Pattern Detection - 7 pattern types: cycles, layering paths, smurfing networks, motifs, proximity risk, temporal bursts, and threshold evasion
Installation
pip install chainswarm-analyzers-baseline
From Source
cd analyzers-baseline
pip install -e .
Quick Start
Production Usage (ClickHouse)
from chainswarm_core.db import ClientFactory, get_connection_params
from chainswarm_analyzers_baseline import BaselineAnalyzersPipeline
from chainswarm_analyzers_baseline.adapters import ClickHouseAdapter
# Get connection params from environment (uses CLICKHOUSE_* env vars)
connection_params = get_connection_params(
    network="torus",
    database_prefix="analytics"
)
# Create client using chainswarm-core
factory = ClientFactory(connection_params)
client = factory.create_client()
# Create adapter with ClickHouse client
adapter = ClickHouseAdapter(client=client, network="torus")
# Run pipeline
pipeline = BaselineAnalyzersPipeline(adapter=adapter)
result = pipeline.run(
    start_timestamp_ms=1700000000000,
    end_timestamp_ms=1702600000000,
    window_days=30,
    processing_date="2025-01-15",
    network="torus"
)
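The `start_timestamp_ms` and `end_timestamp_ms` arguments are epoch milliseconds. The helper below is a minimal sketch for deriving them from a processing date and window length. `window_bounds_ms` is a hypothetical name, not part of the package, and the convention that the window ends at 00:00 UTC on the processing date is an assumption; check how your pipeline defines the window before relying on it.

```python
from datetime import datetime, timedelta, timezone


def window_bounds_ms(processing_date: str, window_days: int) -> tuple[int, int]:
    """Return (start_ms, end_ms) as epoch milliseconds.

    Assumption: the window ends at 00:00 UTC on `processing_date`
    and spans exactly `window_days` days before that instant.
    """
    end = datetime.strptime(processing_date, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    start = end - timedelta(days=window_days)
    return int(start.timestamp() * 1000), int(end.timestamp() * 1000)


start_ms, end_ms = window_bounds_ms("2025-01-15", 30)
print(start_ms, end_ms)
```

The two values can then be passed as `start_timestamp_ms` and `end_timestamp_ms` to `pipeline.run(...)`.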
Tournament Testing (Parquet)
from chainswarm_analyzers_baseline import BaselineAnalyzersPipeline
from chainswarm_analyzers_baseline.adapters import ParquetAdapter
# Create adapter with file paths
adapter = ParquetAdapter(
    input_path="./input",
    output_path="./output"
)
# Run pipeline
pipeline = BaselineAnalyzersPipeline(adapter=adapter)
result = pipeline.run(
    start_timestamp_ms=1700000000000,
    end_timestamp_ms=1702600000000,
    window_days=30,
    processing_date="2025-01-15",
    network="torus"
)
CLI Usage
The CLI auto-extracts metadata (network, date, window) from the input path structure:
data/input/{network}/{processing_date}/{window_days}/
# Run full pipeline - metadata auto-extracted from path
run-pipeline --input data/input/torus/2025-01-15/30
# Output auto-constructed as data/output/torus/2025-01-15/30/
# Override extracted values if needed
run-pipeline \
    --input data/input/torus/2025-01-15/30 \
    --output ./custom-output \
    --network bittensor # Override network
# Run full pipeline (ClickHouse mode - uses CLICKHOUSE_* env vars)
run-pipeline \
    --clickhouse \
    --network torus \
    --window-days 30 \
    --processing-date 2025-01-15
# Run features only
run-features --input data/input/torus/2025-01-15/30
# Run patterns only
run-patterns --input data/input/torus/2025-01-15/30
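The path convention used by the CLI (`data/input/{network}/{processing_date}/{window_days}/`) can be illustrated with a small parser. This is a sketch of the idea only, not the CLI's actual implementation: `parse_input_path` and `default_output_path` are hypothetical helpers, and the rule of swapping the `input` segment for `output` is inferred from the example output path shown above.

```python
from datetime import date
from pathlib import PurePosixPath


def parse_input_path(path: str) -> dict:
    """Read network, processing date, and window from the last three path segments."""
    parts = PurePosixPath(path).parts
    if len(parts) < 3:
        raise ValueError(f"expected .../{{network}}/{{processing_date}}/{{window_days}}, got {path!r}")
    network, processing_date, window_days = parts[-3:]
    date.fromisoformat(processing_date)  # raises ValueError if not YYYY-MM-DD
    return {
        "network": network,
        "processing_date": processing_date,
        "window_days": int(window_days),
    }


def default_output_path(path: str) -> str:
    """Mirror the input path under 'output' (assumed convention)."""
    parts = PurePosixPath(path).parts
    if len(parts) < 4 or parts[-4] != "input":
        raise ValueError(f"no 'input' segment before the metadata in {path!r}")
    return str(PurePosixPath(*parts[:-4], "output", *parts[-3:]))


meta = parse_input_path("data/input/torus/2025-01-15/30")
print(meta, default_output_path("data/input/torus/2025-01-15/30"))
```

Keeping the metadata in the path means a single `--input` argument is enough for the common case, while explicit flags such as `--network` remain available as overrides.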
Package Structure
chainswarm_analyzers_baseline/
├── protocols/ # Abstract interfaces (Python Protocols)
├── features/ # Feature computation implementations
├── patterns/ # Pattern detection implementations
├── adapters/ # I/O adapters (Parquet, ClickHouse)
├── graph/ # Graph building utilities
├── pipeline/ # Production pipeline
├── config/ # Configuration management
└── scripts/ # Script entry points
Data Directory Structure
For Parquet mode, the path structure encodes metadata:
data/
├── input/{network}/{processing_date}/{window_days}/
│ ├── transfers.parquet # Required for pattern detection
│ ├── money_flows.parquet # Required (pre-aggregated edge data)
│ ├── assets.parquet # Optional
│ ├── asset_prices.parquet # Optional
│ └── address_labels.parquet # Optional
└── output/{network}/{processing_date}/{window_days}/
├── features.parquet
├── patterns_cycle.parquet
├── patterns_layering.parquet
└── ...
Note: Both ParquetAdapter and ClickHouseAdapter support full pattern detection including temporal burst analysis (added in v0.2.2).
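Before running in Parquet mode, it can help to confirm that the input directory contains the files listed above. The following is a minimal sketch using only the standard library; `check_input_dir` is a hypothetical helper and not something the package provides.

```python
from pathlib import Path

# File names taken from the directory layout above.
REQUIRED_FILES = ("transfers.parquet", "money_flows.parquet")
OPTIONAL_FILES = ("assets.parquet", "asset_prices.parquet", "address_labels.parquet")


def check_input_dir(input_dir: str) -> dict[str, list[str]]:
    """Report which required and optional input files are absent."""
    root = Path(input_dir)
    return {
        "missing_required": [n for n in REQUIRED_FILES if not (root / n).is_file()],
        "missing_optional": [n for n in OPTIONAL_FILES if not (root / n).is_file()],
    }


report = check_input_dir("data/input/torus/2025-01-15/30")
if report["missing_required"]:
    print("Cannot run pipeline, missing:", report["missing_required"])
```

Missing optional files are reported separately so that a caller can warn without aborting.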
Data Schemas
All data schemas match data-pipeline core tables for compatibility.
Input Files (Parquet Mode)
transfers.parquet
Balance transfer data matching core_transfers schema:
| Column | Type | Description |
|---|---|---|
| `tx_id` | String | Transaction hash (EVM/Substrate/UTXO) |
| `event_index` | String | Event index within transaction |
| `edge_index` | String | Edge disambiguator (UTXO) |
| `block_height` | UInt32 | Block number |
| `block_timestamp` | UInt64 | Milliseconds since epoch |
| `from_address` | String | Source address |
| `to_address` | String | Destination address |
| `asset_symbol` | String | Asset symbol (TAO, USDT, etc.) |
| `asset_contract` | String | Contract address or 'native' |
| `amount` | Decimal128(18) | Native token amount |
| `amount_usd` | Decimal128(18) | USD value at transaction time |
| `fee` | Decimal128(18) | Transaction fee |
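When preparing your own `transfers.parquet`, a quick column check against the table above can catch schema drift early. The sketch below records the documented column names and types in a plain dictionary; `TRANSFER_COLUMNS` and `missing_transfer_columns` are hypothetical names for illustration, and the check covers column names only, not types.

```python
# Column names and types as documented for transfers.parquet.
TRANSFER_COLUMNS = {
    "tx_id": "String",
    "event_index": "String",
    "edge_index": "String",
    "block_height": "UInt32",
    "block_timestamp": "UInt64",  # milliseconds since epoch
    "from_address": "String",
    "to_address": "String",
    "asset_symbol": "String",
    "asset_contract": "String",  # contract address or 'native'
    "amount": "Decimal128(18)",
    "amount_usd": "Decimal128(18)",
    "fee": "Decimal128(18)",
}


def missing_transfer_columns(columns) -> list[str]:
    """Return documented columns that are absent from `columns`, in table order."""
    present = set(columns)
    return [name for name in TRANSFER_COLUMNS if name not in present]


print(missing_transfer_columns(["tx_id", "from_address", "to_address", "amount"]))
```

With pyarrow installed, the column names of an existing file can be obtained from `pyarrow.parquet.read_schema(path).names` and passed to the function.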
Output Files
| File | Description |
|---|---|
| `features.parquet` | 70+ computed features per address |
| `patterns_cycle.parquet` | Cycle patterns |
| `patterns_layering.parquet` | Layering path patterns |
| `patterns_network.parquet` | Smurfing network patterns |
| `patterns_proximity.parquet` | Proximity risk patterns |
| `patterns_motif.parquet` | Fan-in/fan-out motif patterns |
| `patterns_burst.parquet` | Temporal burst patterns |
| `patterns_threshold.parquet` | Threshold evasion patterns |
Pattern Types
Pattern types use lowercase values from chainswarm_core.constants.PatternTypes:
| Pattern Type | Value | Description |
|---|---|---|
| Cycle | `cycle` | Circular transaction patterns |
| Layering Path | `layering_path` | Long transaction chains |
| Smurfing Network | `smurfing_network` | Fragmented value transfers |
| Motif Fan-In | `motif_fanin` | Many-to-one patterns |
| Motif Fan-Out | `motif_fanout` | One-to-many patterns |
| Proximity Risk | `proximity_risk` | Distance to risky addresses |
| Temporal Burst | `temporal_burst` | High-frequency activity |
| Threshold Evasion | `threshold_evasion` | Structuring below limits |
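For code that consumes the output files without importing `chainswarm_core`, the values in the table can be mirrored in a local enum. This is a stand-in sketch: the real `chainswarm_core.constants.PatternTypes` may be shaped differently, and the file-to-type mapping below is inferred from the file descriptions in this README rather than confirmed by the package.

```python
from enum import Enum


class PatternType(str, Enum):
    """Local mirror of the lowercase pattern type values listed above."""
    CYCLE = "cycle"
    LAYERING_PATH = "layering_path"
    SMURFING_NETWORK = "smurfing_network"
    MOTIF_FANIN = "motif_fanin"
    MOTIF_FANOUT = "motif_fanout"
    PROXIMITY_RISK = "proximity_risk"
    TEMPORAL_BURST = "temporal_burst"
    THRESHOLD_EVASION = "threshold_evasion"


# Assumed mapping from output file to the pattern types it holds.
OUTPUT_FILE_PATTERN_TYPES = {
    "patterns_cycle.parquet": (PatternType.CYCLE,),
    "patterns_layering.parquet": (PatternType.LAYERING_PATH,),
    "patterns_network.parquet": (PatternType.SMURFING_NETWORK,),
    "patterns_proximity.parquet": (PatternType.PROXIMITY_RISK,),
    "patterns_motif.parquet": (PatternType.MOTIF_FANIN, PatternType.MOTIF_FANOUT),
    "patterns_burst.parquet": (PatternType.TEMPORAL_BURST,),
    "patterns_threshold.parquet": (PatternType.THRESHOLD_EVASION,),
}

# Look up a member from the raw string stored in a pattern record.
print(PatternType("motif_fanin"))
```

Because the enum subclasses `str`, its members compare equal to the raw lowercase strings, which keeps comparisons against stored values straightforward.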
Requirements
- Python >= 3.13
- chainswarm-core >= 0.1.13 (provides clickhouse-connect, loguru, pydantic)
- networkx >= 3.0
- numpy >= 1.24
- pandas >= 2.0
- pyarrow >= 14.0
- click >= 8.0
License
Apache-2.0