GraphSense backend library and automation CLI

GraphSense Library

A comprehensive Python library for the GraphSense crypto-analytics platform. It provides database access, data ingestion, maintenance tools, and analysis capabilities for cryptocurrency transactions and networks.

Note: This library uses optional dependencies. Use graphsense-lib[all] to install all features.

Quick Start

Installation

# Install with all features
uv add "graphsense-lib[all]"

# Install from source
git clone https://github.com/graphsense/graphsense-lib.git
cd graphsense-lib
make install

Serving the REST API locally

The web API requires two backend connections: a Cassandra cluster (blockchain data) and a TagStore (PostgreSQL). You can configure them via environment variables or a YAML config file.

Option A: Environment variables only

GS_CASSANDRA_ASYNC_NODES='["<cassandra-host>"]' \
GRAPHSENSE_TAGSTORE_READ_URL='postgresql+asyncpg://<user>:<password>@<host>:<port>/tagstore' \
GS_CASSANDRA_ASYNC_CURRENCIES='{"btc":{"raw": "btc_raw", "transformed": "btc_transformed"},"eth":{}}' \
uv run --extra web uvicorn graphsenselib.web.app:create_app --factory --host localhost --port 9000 --reload
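
The values of GS_CASSANDRA_ASYNC_NODES and GS_CASSANDRA_ASYNC_CURRENCIES above are JSON documents, so a quoting slip in the shell is easy to miss. A minimal Python sketch to sanity-check the strings before starting the server, assuming the app decodes them as JSON (an assumption about the settings loader, not something this README confirms):

```python
import json

# Same literal values as in the shell example above.
env = {
    "GS_CASSANDRA_ASYNC_NODES": '["<cassandra-host>"]',
    "GS_CASSANDRA_ASYNC_CURRENCIES": (
        '{"btc":{"raw": "btc_raw", "transformed": "btc_transformed"},"eth":{}}'
    ),
}

# json.loads raises json.JSONDecodeError if a value is not valid JSON.
nodes = json.loads(env["GS_CASSANDRA_ASYNC_NODES"])
currencies = json.loads(env["GS_CASSANDRA_ASYNC_CURRENCIES"])

print(nodes)               # list of Cassandra hosts
print(sorted(currencies))  # configured currency keys
```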

Option B: YAML config file

Point CONFIG_FILE to a REST-specific config (see instance/config.yaml for a full example):

CONFIG_FILE=./instance/config.yaml make serve-web

Or without Make:

CONFIG_FILE=./instance/config.yaml \
uv run --extra web uvicorn graphsenselib.web.app:create_app --factory --host localhost --port 9000 --reload

Option C: .graphsense.yaml with a web key

If you already have a .graphsense.yaml (or ~/.graphsense.yaml) for the CLI, you can add a web key containing the REST config. The app will pick it up automatically without setting CONFIG_FILE:

# .graphsense.yaml
environments:
  # ... your existing CLI config ...

web:
  database:
    nodes: ["<cassandra-host>"]
    currencies:
      btc:
      eth:
  gs-tagstore:
    url: "postgresql+asyncpg://<user>:<password>@<host>:<port>/tagstore"

Then start the server:

make serve-web

Config resolution order: explicit config_file param > CONFIG_FILE env var > ./instance/config.yaml > .graphsense.yaml web key > env vars only.
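
The resolution order above can be read as a first-match-wins chain. The sketch below only illustrates that precedence; the function name, its parameters, and the returned labels are hypothetical and are not part of graphsense-lib:

```python
from typing import Mapping, Optional


def resolve_config_source(
    config_file: Optional[str],
    env: Mapping[str, str],
    instance_config_exists: bool,
    graphsense_yaml_has_web_key: bool,
) -> str:
    """Return a label for the first config source that applies (hypothetical helper)."""
    if config_file:                      # 1. explicit config_file param
        return f"param:{config_file}"
    if env.get("CONFIG_FILE"):           # 2. CONFIG_FILE env var
        return f"env:{env['CONFIG_FILE']}"
    if instance_config_exists:           # 3. ./instance/config.yaml
        return "file:./instance/config.yaml"
    if graphsense_yaml_has_web_key:      # 4. .graphsense.yaml web key
        return "graphsense.yaml:web"
    return "env-vars-only"               # 5. env vars only


print(resolve_config_source(None, {"CONFIG_FILE": "./my.yaml"}, True, True))
```

Note that an explicit parameter wins even when CONFIG_FILE is set, and CONFIG_FILE wins even when ./instance/config.yaml exists.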

Optional REST settings (env vars)

Variable                                  Default  Description
GSREST_DISABLE_AUTH                       false    Disable API key authentication
GSREST_ENSURE_TAGSTORE_SCHEMA_ON_STARTUP  false    Auto-initialize TagStore tables/views at startup when missing
GSREST_ALLOWED_ORIGINS                    *        CORS allowed origins
GSREST_LOGGING_LEVEL                      -        Logging level (DEBUG, INFO, …)
GS_CASSANDRA_ASYNC_PORT                   9042     Cassandra port
GS_CASSANDRA_ASYNC_USERNAME               -        Cassandra username
GS_CASSANDRA_ASYNC_PASSWORD               -        Cassandra password

When enabling GSREST_ENSURE_TAGSTORE_SCHEMA_ON_STARTUP=true, keep in mind:

  • The DB user must have DDL privileges (create tables/views/indexes/extensions/procedures).
  • Startup may be slower because schema checks and potential initialization run before the app serves traffic.
  • In multi-replica deployments, initialize schema once (migration/init job) to avoid startup races.

If TagStore is not configured (gs-tagstore missing) or the TagStore URL is unreachable, the REST app now falls back to a mock TagStore so endpoints still work. In this mode, tag-specific responses (labels, actors, taxonomies, tag counts) are empty.
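
The fallback described above follows a common pattern: try the real backend, and substitute an empty stand-in when it is missing or unreachable. The sketch below illustrates that pattern only; MockTagStore, open_tagstore, and labels_for are hypothetical names and do not reflect the library's actual classes:

```python
from typing import Callable, List, Optional


class MockTagStore:
    """Hypothetical stand-in that answers every tag query with no data."""

    def labels_for(self, address: str) -> List[str]:
        return []


def open_tagstore(url: Optional[str], connect: Callable[[str], object]) -> object:
    """Return a real TagStore if possible, otherwise the empty mock."""
    if not url:
        # TagStore not configured at all.
        return MockTagStore()
    try:
        return connect(url)
    except OSError:
        # Covers ConnectionRefusedError, timeouts, and similar network errors.
        return MockTagStore()


def unreachable(url: str) -> object:
    raise ConnectionRefusedError(f"cannot reach {url}")


store = open_tagstore("postgresql+asyncpg://example/tagstore", unreachable)
print(type(store).__name__, store.labels_for("1A1zP1eP5QGefi2DMPTfTL5SLmv7DivfNa"))
```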

Basic Usage

Database Access with Configuration File

from graphsenselib.db import DbFactory

# Using GraphSense config file (default: ~/.graphsense.yaml)
with DbFactory().from_config("development", "btc") as db:
    highest_block = db.transformed.get_highest_block()
    print(f"Highest BTC block: {highest_block}")

    # Get block details
    block = db.transformed.get_block(100000)
    print(f"Block 100000: {block.block_hash}")

Direct Database Connection

from graphsenselib.db import DbFactory

# Direct connection without config file
with DbFactory().from_name(
    raw_keyspace_name="eth_raw",
    transformed_keyspace_name="eth_transformed",
    schema_type="account",
    cassandra_nodes=["localhost"],
    currency="eth"
) as db:
    print(f"Highest block: {db.transformed.get_highest_block()}")

Async Database Services

The async services are used internally by the REST API and can also be used standalone. AddressesService depends on several other services:

from graphsenselib.db.asynchronous.services import (
    BlocksService, AddressesService, TagsService,
    EntitiesService, RatesService,
)

# Services are initialized with their dependencies. db, config, logger,
# rates_service, tags_service and entities_service must already exist.
blocks_service = BlocksService(db, rates_service, config, logger)
addresses_service = AddressesService(
    db, tags_service, entities_service, blocks_service, rates_service, logger
)

# These calls must run inside a coroutine (async def) or an async REPL.
address_info = await addresses_service.get_address("btc", "1A1zP1eP5QGefi2DMPTfTL5SLmv7DivfNa")
txs = await addresses_service.list_address_txs("btc", "1A1zP1eP5QGefi2DMPTfTL5SLmv7DivfNa")
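
Because the service methods are coroutines, a standalone script needs an event loop to drive them. The sketch below shows one way to do that with asyncio. StubAddressesService is a hypothetical stand-in returning canned data so the example runs without Cassandra; a real AddressesService would be constructed as shown above:

```python
import asyncio


class StubAddressesService:
    """Hypothetical stand-in for AddressesService; returns canned data."""

    async def get_address(self, currency: str, address: str) -> dict:
        return {"currency": currency, "address": address}

    async def list_address_txs(self, currency: str, address: str) -> list:
        return []


async def main(service):
    address = "1A1zP1eP5QGefi2DMPTfTL5SLmv7DivfNa"
    # Run both lookups concurrently on the same event loop.
    info, txs = await asyncio.gather(
        service.get_address("btc", address),
        service.list_address_txs("btc", address),
    )
    return info, txs


if __name__ == "__main__":
    print(asyncio.run(main(StubAddressesService())))
```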

Command Line Interface

GraphSense-lib exposes a comprehensive CLI tool: graphsense-cli

Basic Commands

# Show help and available commands
graphsense-cli --help

# Check version
graphsense-cli version

# Show current configuration
graphsense-cli config show

# Generate config template
graphsense-cli config template > ~/.graphsense.yaml

# Show config file path
graphsense-cli config path

Modules

Database Management

Query and manage the GraphSense database state.

# Show database management options
graphsense-cli db --help

# Check database state/summary
graphsense-cli db state -e development

# Get block information
graphsense-cli db block info -e development -c btc --height 100000

# Query logs (for Ethereum-based chains)
graphsense-cli db logs -e development -c eth --from-block 1000000 --to-block 1000100

Schema Operations

Create and validate database schemas.

# Show schema options
graphsense-cli schema --help

# Create database schema for a currency
graphsense-cli schema create -e dev -c btc

# Validate existing schema
graphsense-cli schema validate -e dev -c btc

# Show expected schema for currency
graphsense-cli schema show-by-currency btc

# Show schema by type (utxo/account)
graphsense-cli schema show-by-schema-type utxo

Data Ingestion

Ingest raw cryptocurrency data from nodes.

# Show ingestion options
graphsense-cli ingest --help

# Ingest blocks from cryptocurrency node
graphsense-cli ingest from-node \
    -e dev \
    -c btc \
    --start-block 0 \
    --end-block 1000 \
    --create-schema

# Ingest with custom batch size
graphsense-cli ingest from-node \
    -e dev \
    -c eth \
    --start-block 1000000 \
    --end-block 1001000 \
    --batch-size 100

Delta Updates

Update transformed keyspace from raw keyspace.

# Show delta update options
graphsense-cli delta-update --help

# Check update status
graphsense-cli delta-update status -e dev -c btc

# Perform delta update
graphsense-cli delta-update update -e dev -c btc

# Validate delta update consistency
graphsense-cli delta-update validate -e dev -c btc

# Patch exchange rates for specific blocks
graphsense-cli delta-update patch-exchange-rates \
    -e dev \
    -c btc \
    --start-block 100000 \
    --end-block 200000

Exchange Rates

Fetch and ingest exchange rates from various sources.

# Show exchange rate options
graphsense-cli exchange-rates --help

# Fetch from CoinDesk
graphsense-cli exchange-rates coindesk -e dev -c btc

# Fetch from CoinMarketCap (requires API key in config)
graphsense-cli exchange-rates coinmarketcap -e dev -c btc

Monitoring

Monitor GraphSense infrastructure health and state.

# Show monitoring options
graphsense-cli monitoring --help

# Get database summary
graphsense-cli monitoring get-summary -e dev

# Get summary for specific currency
graphsense-cli monitoring get-summary -e dev -c btc

# Send notifications to configured handlers
graphsense-cli monitoring notify \
    --topic "database-update" \
    --message "BTC ingestion completed"

Event Watching (Alpha)

Watch for cryptocurrency events and generate notifications.

# Show watch options
graphsense-cli watch --help

# Watch for money flows on specific addresses
graphsense-cli watch money-flows \
    -e dev \
    -c btc \
    --address 1A1zP1eP5QGefi2DMPTfTL5SLmv7DivfNa \
    --threshold 1000000  # satoshis

File Conversion Tools

Convert between different file formats.

# Show conversion options
graphsense-cli convert --help

Configuration

GraphSense-lib uses a YAML configuration file that defines database connections and environment settings. Default locations: ./.graphsense.yaml, ~/.graphsense.yaml.
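
A minimal sketch of how a lookup over the two default locations could work. It assumes the project-local file is checked before the home directory, following the order listed above; find_config is a hypothetical helper, not a library function:

```python
from pathlib import Path
from typing import Callable, Iterable, Optional


def find_config(
    candidates: Iterable[Path],
    exists: Callable[[Path], bool] = Path.is_file,
) -> Optional[Path]:
    """Return the first candidate that exists, or None (hypothetical helper)."""
    for candidate in candidates:
        if exists(candidate):
            return candidate
    return None


default_candidates = [
    Path.cwd() / ".graphsense.yaml",   # ./.graphsense.yaml
    Path.home() / ".graphsense.yaml",  # ~/.graphsense.yaml
]
print(find_config(default_candidates))
```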

Generate Configuration Template

graphsense-cli config template > ~/.graphsense.yaml

Example Configuration Structure

# Optional: default environment to use
default_environment: dev

environments:
  dev:
    # Cassandra cluster configuration
    cassandra_nodes: ["localhost"]
    port: 9042
    # Optional authentication
    # username: "cassandra"
    # password: "cassandra"

    # Currency/keyspace configurations
    keyspaces:
      btc:
        raw_keyspace_name: "btc_raw"
        transformed_keyspace_name: "btc_transformed"
        schema_type: "utxo"

        # Node connection for ingestion
        ingest_config:
          node_reference: "http://localhost:8332"
          # Optional authentication for node
          # username: "rpcuser"
          # password: "rpcpassword"

        # Keyspace setup for schema creation
        keyspace_setup_config:
          raw:
            replication_config: "{'class': 'SimpleStrategy', 'replication_factor': 1}"
          transformed:
            replication_config: "{'class': 'SimpleStrategy', 'replication_factor': 1}"

      eth:
        raw_keyspace_name: "eth_raw"
        transformed_keyspace_name: "eth_transformed"
        schema_type: "account"

        ingest_config:
          node_reference: "http://localhost:8545"

        keyspace_setup_config:
          raw:
            replication_config: "{'class': 'SimpleStrategy', 'replication_factor': 1}"
          transformed:
            replication_config: "{'class': 'SimpleStrategy', 'replication_factor': 1}"

  prod:
    cassandra_nodes: ["cassandra1.prod", "cassandra2.prod", "cassandra3.prod"]
    username: "gs_user"
    password: "secure_password"

    keyspaces:
      btc:
        raw_keyspace_name: "btc_raw"
        transformed_keyspace_name: "btc_transformed"
        schema_type: "utxo"

        ingest_config:
          node_reference: "http://bitcoin-node.internal:8332"

        keyspace_setup_config:
          raw:
            replication_config: "{'class': 'NetworkTopologyStrategy', 'datacenter1': 3}"
          transformed:
            replication_config: "{'class': 'NetworkTopologyStrategy', 'datacenter1': 3}"

# Optional: Slack notification configuration
slack_topics:
  database-update:
    hooks: ["https://hooks.slack.com/services/YOUR/SLACK/WEBHOOK"]

  payment_flow_notifications:
    hooks: ["https://hooks.slack.com/services/YOUR/SLACK/WEBHOOK"]

# Optional: API keys for external services
coingecko_api_key: ""
coinmarketcap_api_key: "YOUR_CMC_API_KEY"

# Optional: cache directory for temporary files
cache_directory: "~/.graphsense/cache"
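
The replication_config values in the example are strings that hold a map literal with single quotes, so they are not valid JSON. The syntax happens to coincide with a Python dict literal, which allows a quick sanity check with ast.literal_eval before creating a schema. This check is an illustration only and is not something graphsense-lib is known to perform:

```python
import ast

# Copied from the prod environment in the example configuration above.
replication_config = "{'class': 'NetworkTopologyStrategy', 'datacenter1': 3}"

# literal_eval only evaluates literals, so it is safe for untrusted strings.
parsed = ast.literal_eval(replication_config)

if not isinstance(parsed, dict) or "class" not in parsed:
    raise ValueError("replication_config must be a map with a 'class' entry")

print(parsed["class"])
```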

Advanced Features

Tagpack Management

GraphSense-lib includes comprehensive tagpack management tools (formerly standalone tagpack-tool). For detailed documentation, see Tagpack README.

# Validate tagpacks
graphsense-cli tagpack-tool tagpack validate /path/to/tagpack

# Insert tagpack into tagstore
graphsense-cli tagpack-tool insert \
    --url "postgresql://user:pass@localhost/tagstore" \
    /path/to/tagpack

# Show quality measures
graphsense-cli tagpack-tool quality show-measures \
    --url "postgresql://user:pass@localhost/tagstore"

Tagstore Operations

# Initialize tagstore database
graphsense-cli tagstore init

# Initialize with custom database URL
graphsense-cli tagstore init --db-url "postgresql://user:pass@localhost/tagstore"

# Get DDL SQL for manual setup
graphsense-cli tagstore get-create-sql

Cross-chain Analysis

# Using an initialized AddressesService (see above for setup)
related = await addresses_service.get_cross_chain_pubkey_related_addresses(
    "1A1zP1eP5QGefi2DMPTfTL5SLmv7DivfNa"
)

for addr in related:
    print(f"Network: {addr.network}, Address: {addr.address}")

Function Call Parsing

from graphsenselib.utils.function_call_parser import parse_function_call

# Parse Ethereum function calls
function_signatures = {
    "0xa9059cbb": [{
        "name": "transfer",
        "inputs": [
            {"name": "to", "type": "address"},
            {"name": "value", "type": "uint256"}
        ]
    }]
}

parsed = parse_function_call(tx_input_bytes, function_signatures)
if parsed:
    print(f"Function: {parsed['name']}")
    print(f"Parameters: {parsed['parameters']}")
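
For background on the 0xa9059cbb key: in Ethereum call data, the first 4 bytes are the function selector and each static argument occupies one 32-byte word. The sketch below decodes a transfer(address,uint256) call by hand using only the standard library. It is not the library's implementation, and the shape of the returned dict is an assumption modeled on the parsed['name'] and parsed['parameters'] accesses above:

```python
from typing import Optional

TRANSFER_SELECTOR = bytes.fromhex("a9059cbb")


def decode_transfer(calldata: bytes) -> Optional[dict]:
    """Decode transfer(address,uint256) call data, or return None if it does not match."""
    if calldata[:4] != TRANSFER_SELECTOR or len(calldata) < 4 + 64:
        return None
    to_word = calldata[4:36]      # address, left-padded to 32 bytes
    value_word = calldata[36:68]  # uint256, big-endian
    return {
        "name": "transfer",
        "parameters": {
            "to": "0x" + to_word[-20:].hex(),
            "value": int.from_bytes(value_word, "big"),
        },
    }


# Synthetic call data: selector, padded 20-byte address, amount 1000.
example = (
    TRANSFER_SELECTOR
    + bytes(12) + bytes.fromhex("11" * 20)
    + (1000).to_bytes(32, "big")
)
print(decode_transfer(example))
```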

Development

Important: Requires Python >=3.10, <3.13.
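
A quick way to confirm that the running interpreter falls inside this range. The helper name is hypothetical:

```python
import sys
from typing import Tuple


def is_supported(version: Tuple[int, int] = sys.version_info[:2]) -> bool:
    """True if the (major, minor) version satisfies >=3.10, <3.13."""
    return (3, 10) <= tuple(version) < (3, 13)


print(sys.version_info[:2], "supported:", is_supported())
```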

Setup Development Environment

# Initialize development environment (installs deps + pre-commit hooks)
make dev

# Or install dev dependencies only
make install-dev

Code Quality and Testing

Before committing, please format, lint, and test your code:

# Format code
make format

# Lint code
make lint

# Run fast tests
make test

# Or run all steps at once
make pre-commit

For comprehensive testing:

# Run complete test suite (including slow tests)
make test-all

Podman Notes

If you run the test suite with Podman, make sure your shell points at the Podman socket:

export DOCKER_HOST="unix://${XDG_RUNTIME_DIR}/podman/podman.sock"

The test fixtures automatically disable Ryuk when DOCKER_HOST contains podman.sock and rely on explicit fixture cleanup instead.
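
The detection rule described above amounts to a substring check on DOCKER_HOST. A minimal sketch with hypothetical helper names; the actual fixture code may differ:

```python
import os
from typing import Mapping


def podman_socket_url(runtime_dir: str) -> str:
    """Build the socket URL used in the export line above."""
    return f"unix://{runtime_dir}/podman/podman.sock"


def uses_podman(env: Mapping[str, str]) -> bool:
    """True when DOCKER_HOST points at a Podman socket."""
    return "podman.sock" in env.get("DOCKER_HOST", "")


print("Podman detected:", uses_podman(os.environ))
```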

Release Process

This repository uses two source-of-truth versions in the root Makefile:

  • Library version: RELEASESEM (released with vX.Y.Z, vX.Y.Z-rc.N, or vX.Y.Z-dev.N tags)
  • OpenAPI/API version: WEBAPISEM (written to src/graphsenselib/web/version.py)

The Python client package version is derived from the API version and should match it.

Library package versioning is dynamic via setuptools_scm (pyproject.toml):

  • Git tag v2.9.8 -> package version 2.9.8
  • Git tag v2.9.8-rc.1 -> package version 2.9.8rc1
  • Git tag v2.9.8-dev.1 -> package version 2.9.8.dev1
  • Commits after a tag append local metadata, for example 2.9.8.dev1+g<sha>.d<date>
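
The tag-to-version mapping listed above can be reproduced with a small regular expression. This sketch only mirrors the three tag shapes from the list; it is not how setuptools_scm is implemented and it does not cover the local metadata added for commits after a tag:

```python
import re
from typing import Optional

# vX.Y.Z with an optional -rc.N or -dev.N suffix.
TAG_RE = re.compile(r"v(\d+\.\d+\.\d+)(?:-(rc|dev)\.(\d+))?")


def tag_to_version(tag: str) -> Optional[str]:
    """Map a library release tag to its package version, or None if it is not one."""
    match = TAG_RE.fullmatch(tag)
    if match is None:
        return None
    base, kind, number = match.groups()
    if kind is None:
        return base                    # v2.9.8 -> 2.9.8
    if kind == "rc":
        return f"{base}rc{number}"     # v2.9.8-rc.1 -> 2.9.8rc1
    return f"{base}.dev{number}"       # v2.9.8-dev.1 -> 2.9.8.dev1


for tag in ("v2.9.8", "v2.9.8-rc.1", "v2.9.8-dev.1"):
    print(tag, "->", tag_to_version(tag))
```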

Use the root Makefile helpers:

# Show all current versions
make show-versions

# Update and validate OpenAPI contract version
make update-api-version WEBAPISEM=v2.10.0
make check-api-version WEBAPISEM=v2.10.0

# Sync client version from API version and validate
make sync-client-version WEBAPISEM=v2.10.0
make check-client-version WEBAPISEM=v2.10.0

# Generate Python client (package version = OpenAPI info.version)
make generate-python-client

# Create both release tags from Makefile versions
make tag-version

Tagging behavior:

  • Library release tag: vX.Y.Z, vX.Y.Z-rc.N, or vX.Y.Z-dev.N (from RELEASESEM)
  • Client release tag: webapi-vA.B.C (from WEBAPISEM)

Recommended library versioning routine:

  1. For development prereleases, set RELEASESEM to vX.Y.Z-dev.N (for example v2.10.0-dev.1)
  2. For release candidates, set RELEASESEM to vX.Y.Z-rc.N
  3. For stable releases, set RELEASESEM to vX.Y.Z
  4. Create tags with make tag-version
  5. Push tags with git push origin --tags

CI trigger background:

  • Stable library tags (vX.Y.Z) trigger:
    • GitHub Release creation
    • Python library package build/publish (graphsense-lib)
    • Docker image build/publish
  • Client tags (webapi-vA.B.C) trigger Python client package build/publish (clients/python)
  • Other library tags (vX.Y.Z-rc.N, vX.Y.Z-dev.N) do not trigger GitHub Release or Python package publish; they only trigger Docker image build/publish
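
The trigger rules above can be summarized as a lookup from tag shape to CI jobs. The job labels in this sketch are hypothetical shorthand, not the names of the actual workflows:

```python
import re
from typing import Set

STABLE = re.compile(r"v\d+\.\d+\.\d+")
PRERELEASE = re.compile(r"v\d+\.\d+\.\d+-(rc|dev)\.\d+")
CLIENT = re.compile(r"webapi-v\d+\.\d+\.\d+")


def ci_jobs_for_tag(tag: str) -> Set[str]:
    """Return the (hypothetically named) jobs a pushed tag would trigger."""
    if STABLE.fullmatch(tag):
        return {"github-release", "publish-library", "publish-docker"}
    if PRERELEASE.fullmatch(tag):
        return {"publish-docker"}
    if CLIENT.fullmatch(tag):
        return {"publish-client"}
    return set()


for tag in ("v2.10.0", "v2.10.0-rc.1", "webapi-v2.10.0"):
    print(tag, "->", sorted(ci_jobs_for_tag(tag)))
```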

Release checklist:

  1. Update CHANGELOG.md with new features and fixes
  2. Update relevant versions (library/API/client) based on what changed
  3. Sync API/client versions if needed (make update-api-version + make sync-client-version)
  4. Create and push tags:

make tag-version
git push origin --tags

Troubleshooting

OpenSSL Errors

Some components use OpenSSL hash functions that aren't available by default in OpenSSL 3.0+ (e.g., ripemd160). This can cause test suite failures. To fix this, enable legacy providers in your OpenSSL configuration. See the "fix openssl legacy mode" step in .github/workflows/run_tests.yaml for an example.
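
To check whether the local Python build is affected, you can probe for ripemd160 through hashlib, which raises ValueError when the underlying OpenSSL does not provide the algorithm. The helper name is hypothetical:

```python
import hashlib


def ripemd160_available() -> bool:
    """True if hashlib can construct a ripemd160 hash on this system."""
    try:
        hashlib.new("ripemd160", b"")
    except ValueError:
        # Raised for unsupported digests, e.g. when the OpenSSL legacy provider is off.
        return False
    return True


print("ripemd160 available:", ripemd160_available())
```

If this prints False, enabling the legacy provider as described above is the likely fix.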

Common Issues

  1. Connection Refused: Verify Cassandra is running and accessible
  2. Schema Validation Errors: Ensure database schema matches expected version
  3. Import Errors: Install with [all] option for complete feature set
  4. Python Version: Requires Python >=3.10, <3.13

Getting Help

License

See LICENSE file for licensing details.

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Run make pre-commit to ensure code quality
  5. Submit a pull request

GraphSense - Open Source Crypto Analytics Platform Website: https://graphsense.github.io/
