Python SDK for accessing Solana blockchain data from solarchive.org - free Parquet datasets of transactions, accounts, and tokens

solarchive

Python SDK for accessing Solana blockchain data from solarchive.org.

solarchive is a project to archive Solana's public transaction data and make it freely accessible in ergonomic formats (Apache Parquet) for developers, researchers, and the entire Solana community.

Features

  • Download Solana transaction, account, and token data in Parquet format
  • Access historical data from 2020 to present
  • No API keys or rate limits - direct HTTP access to data files
  • Query data with DuckDB, pandas, Spark, or any Parquet-compatible tool
  • Browse available datasets via index files
  • Data licensed under CC-BY-4.0
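
As a rough illustration of the DuckDB route, the sketch below counts rows across Parquet files that have already been downloaded. It assumes the `duckdb` package is installed and that the files sit somewhere under a local directory such as `./data/txs`. The helper names `parquet_glob` and `count_rows` are hypothetical and are not part of the solarchive SDK.

```python
from pathlib import Path


def parquet_glob(data_dir: str) -> str:
    """Build a recursive glob matching every Parquet file under data_dir."""
    return (Path(data_dir) / "**" / "*.parquet").as_posix()


def count_rows(data_dir: str) -> int:
    """Count rows across all Parquet files under data_dir using DuckDB."""
    import duckdb  # imported lazily so parquet_glob works without DuckDB

    # Escape single quotes so the path is a valid SQL string literal.
    pattern = parquet_glob(data_dir).replace("'", "''")
    query = f"SELECT COUNT(*) FROM read_parquet('{pattern}')"
    return duckdb.sql(query).fetchone()[0]
```

Calling `count_rows("./data/txs")` after a download should return the total number of rows. No column names are referenced because the exact schema is defined in the published schema files.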

Installation

pip install solarchive

Usage

from solarchive import SolArchive

# Initialize the client
client = SolArchive()

# List available transaction dates
dates = client.list_transaction_dates()
print(f"Available dates: {dates[:5]}...")

# Get index for a specific date
index = client.get_transaction_index("2025-11-01")
print(f"Files available: {len(index['files'])}")

# Download transaction data for a date
# Downloads all Parquet files for that date
client.download_transactions("2025-11-01", output_dir="./data/txs")

# Get account snapshots for a month
client.download_accounts("2025-12", output_dir="./data/accounts")

# Get token snapshots for a month
client.download_tokens("2024-09", output_dir="./data/tokens")
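
Building on the calls above, downloading several days at once could look like the following sketch. It uses only `list_transaction_dates` and `download_transactions` as shown in the usage example, and it assumes that `list_transaction_dates` returns `YYYY-MM-DD` strings. The helpers `daily_partitions` and `download_transaction_range`, and the choice of one subdirectory per day, are hypothetical and not part of the SDK.

```python
from datetime import date, timedelta


def daily_partitions(start: date, end: date) -> list[str]:
    """Return YYYY-MM-DD strings for every day from start to end, inclusive."""
    days = (end - start).days
    return [(start + timedelta(days=i)).isoformat() for i in range(days + 1)]


def download_transaction_range(client, start: date, end: date, output_dir: str) -> list[str]:
    """Call client.download_transactions once per available day in the range."""
    # Assumption: the client reports dates as YYYY-MM-DD strings.
    available = set(client.list_transaction_dates())
    downloaded = []
    for day in daily_partitions(start, end):
        if day not in available:
            continue  # skip days the archive does not list
        # Hypothetical layout: keep each day in its own subdirectory.
        client.download_transactions(day, output_dir=f"{output_dir}/{day}")
        downloaded.append(day)
    return downloaded
```

With a real client this would be called as `download_transaction_range(SolArchive(), date(2025, 11, 1), date(2025, 11, 3), "./data/txs")`.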

Data Schema

The archive contains three main datasets:

  1. Transactions - All non-vote transactions with signatures, fees, account changes, and token balances

    • Schema: https://data.solarchive.org/schemas/solana/transactions.json
    • Partitioned by day: txs/YYYY-MM-DD/*.parquet
  2. Accounts - Periodic snapshots of account states including balances and program data

    • Schema: https://data.solarchive.org/schemas/solana/accounts.json
    • Partitioned by month: accounts/YYYY-MM/*.parquet
  3. Tokens - Metadata snapshots for fungible and non-fungible tokens

    • Schema: https://data.solarchive.org/schemas/solana/tokens.json
    • Partitioned by month: tokens/YYYY-MM/*.parquet
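
The partition layout above can be expressed as a few small helpers that turn a date into the matching prefix. This is only a sketch of the naming scheme listed here. The function names are hypothetical, and the prefixes are relative because the base URL of the data files is not stated in this description.

```python
from datetime import date


def transactions_prefix(day: date) -> str:
    """Daily partition prefix for transactions: txs/YYYY-MM-DD/."""
    return f"txs/{day:%Y-%m-%d}/"


def accounts_prefix(day: date) -> str:
    """Monthly partition prefix for account snapshots: accounts/YYYY-MM/."""
    return f"accounts/{day:%Y-%m}/"


def tokens_prefix(day: date) -> str:
    """Monthly partition prefix for token snapshots: tokens/YYYY-MM/."""
    return f"tokens/{day:%Y-%m}/"
```

For example, `transactions_prefix(date(2025, 11, 1))` yields `txs/2025-11-01/`, while any day in December 2025 maps to `accounts/2025-12/`.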

Development

This package requires Python 3.11 or higher.

# Install the package and its dependencies in editable mode
pip install -e .

# Run the example
python main.py

About solarchive.org

Visit solarchive.org to explore the data directly in your browser using DuckDB-WASM, or to support the project. Hosting hundreds of terabytes costs nearly $10,000/year!

License

MIT

Download files

Source Distribution

solarchive-0.0.1.tar.gz (3.6 kB)

Uploaded Source

File details

Details for the file solarchive-0.0.1.tar.gz.

File metadata

  • Download URL: solarchive-0.0.1.tar.gz
  • Upload date:
  • Size: 3.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.9.14

File hashes

Hashes for solarchive-0.0.1.tar.gz
Algorithm Hash digest
SHA256 547e0639cefabdb59321f0698d99e7dd2b9fe3a1c09870f805ceaec5947de92d
MD5 f6dd23115f4797f8ede3d77c893beb32
BLAKE2b-256 e126f89558e0a69bcfa1b4701228310b2bf149f89e7e0d4800458499bf193be1
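
To check a downloaded copy of the source distribution against the SHA256 digest listed above, something like the following sketch could be used. It relies only on the standard library. The helper names and the local file path passed to them are assumptions made for illustration.

```python
import hashlib

# SHA256 digest published above for solarchive-0.0.1.tar.gz.
EXPECTED_SHA256 = "547e0639cefabdb59321f0698d99e7dd2b9fe3a1c09870f805ceaec5947de92d"


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: str, expected: str = EXPECTED_SHA256) -> bool:
    """Return True when the file's SHA256 digest equals the expected value."""
    return sha256_of_file(path) == expected.lower()
```

Calling `matches_published_hash("solarchive-0.0.1.tar.gz")` on the downloaded archive should return `True` if the file is intact.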