Shared datastore utilities for GroAI.fi — Binance market data downloading, parquet storage, and trading infrastructure

groai-fi-datastore-shared

Requires Python 3.9+. License: Apache 2.0.

Shared datastore utilities for GroAI.fi — Binance market data downloading, partitioned Parquet storage, order execution, and backtesting infrastructure.


Features

  • Market Data Downloading: incremental Binance OHLCV download that automatically catches up from where the last run left off
  • Parquet Storage: Hive-partitioned Parquet files (exchange=X/symbol=Y/part.N.parquet), powered by Dask (local) and DuckDB (S3)
  • File Locking: safe concurrent writes via a .write.lock file with stale-lock detection (local filesystem only)
  • Order Execution: BinanceOrder and BinanceClient, with spot and margin support
  • Backtesting: BacktestOrder and BacktestOrderData for strategy simulation
  • CLI Suite: 10 CLI commands covering both local-filesystem and S3 workflows
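
To make the file-locking feature concrete, here is a minimal sketch of a lock file with stale-lock detection. It is not the package's implementation: the function names, the 600-second threshold, and the use of the file's modification time to judge staleness are all assumptions made for illustration.

```python
import os
import time
from pathlib import Path

# Hypothetical threshold; the package's real value is not documented here.
STALE_AFTER_SECONDS = 600.0


def try_acquire(lock_path: Path, stale_after: float = STALE_AFTER_SECONDS) -> bool:
    """Try to create the lock file, reclaiming it first if it looks stale."""
    try:
        age = time.time() - lock_path.stat().st_mtime
        if age > stale_after:
            lock_path.unlink()  # the previous holder presumably crashed
    except FileNotFoundError:
        pass  # no lock present, or it vanished between stat() and unlink()
    try:
        # O_EXCL makes creation atomic: only one process can succeed.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as handle:
        handle.write(str(os.getpid()))  # record the owner for debugging
    return True


def release(lock_path: Path) -> None:
    """Remove the lock file; a missing file is not treated as an error."""
    lock_path.unlink(missing_ok=True)
```

Atomic exclusive creation of this kind is reliable on a local filesystem but not on object storage, which is consistent with the "local filesystem only" note above.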

Installation

# Using pip
pip install groai-fi-datastore-shared

# Using uv (recommended)
uv add groai-fi-datastore-shared

Environment Variables

The following environment variables are used across modules:

| Variable | Required | Description |
| --- | --- | --- |
| BINANCE_API_KEY | Yes (trading/download) | Your Binance API key |
| BINANCE_API_SECRET | Yes (trading/download) | Your Binance API secret |
| S3_ENDPOINT_URL | Yes (if S3) | Your object storage endpoint (e.g. https://t3.storageapi.dev) |
| S3_BUCKET_NAME | Yes (if S3) | The target bucket name |
| S3_ACCESS_KEY_ID | Yes (if S3) | S3 access key ID |
| S3_SECRET_ACCESS_KEY | Yes (if S3) | S3 secret access key |
| BINANCE_API_KEY_TEST | No | Testnet API key |
| BINANCE_API_SECRET_TEST | No | Testnet API secret |
| SEND_MAIL_RECEIVER | No | Email address for trade alerts |

Create a .env file in your project root and load it with python-dotenv or export variables in your shell.
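
Before running a command it can help to confirm that the required variables are actually set. The sketch below uses only the standard library; the variable names come from the table above, while the grouping keys ("binance", "s3") and the function name are illustrative and not part of the package.

```python
import os
from typing import Dict, List, Mapping

# Names are taken from the table above; the group keys are hypothetical.
REQUIRED: Dict[str, List[str]] = {
    "binance": ["BINANCE_API_KEY", "BINANCE_API_SECRET"],
    "s3": [
        "S3_ENDPOINT_URL",
        "S3_BUCKET_NAME",
        "S3_ACCESS_KEY_ID",
        "S3_SECRET_ACCESS_KEY",
    ],
}


def missing_variables(groups: List[str], env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the required variables that are unset or empty for the given groups."""
    return [name for group in groups for name in REQUIRED[group] if not env.get(name)]


if __name__ == "__main__":
    missing = missing_variables(["binance", "s3"])
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
```

If you load a .env file with python-dotenv, run the check after loading so that the values from the file are visible in os.environ.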


CLI Usage

The package exposes the same pipeline for local filesystems and for S3 storage.

Local Storage Commands

Run data engineering workflows directly against a mounted drive, using Dask and PyArrow:

# 1. Download
binance-download-price --symbol BTCUSDT --tframe 1m --path /path/to/prices_v3.parquet
# 2. Merge shards
binance-merge-parquet --exchange Binance --symbol BTCUSDT --path /path/to/prices_v3.parquet --interval_base 1m
# 3. Auto-catchup all tracked symbols
binance-auto-update --exchange Binance --path /path/to/prices_v3.parquet --tframe 1m
# 4. View tracked symbols
binance-list-symbols --path /path/to/prices_v3.parquet
# 5. Clean / remove
binance-remove-symbol --symbol BTCUSDT --path /path/to/prices_v3.parquet --yes
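
Because the store is Hive-partitioned (exchange=X/symbol=Y/part.N.parquet), the tracked symbols can in principle be read straight from the directory names. The following is a rough sketch of that idea, not how binance-list-symbols is implemented; the function name is hypothetical and it assumes the partition directories sit directly under the path passed to --path.

```python
from pathlib import Path
from typing import Dict, List


def list_symbols(root: Path) -> Dict[str, List[str]]:
    """Map each exchange to its sorted symbols, using only directory names."""
    found: Dict[str, List[str]] = {}
    for symbol_dir in sorted(root.glob("exchange=*/symbol=*")):
        if not symbol_dir.is_dir():
            continue
        # Directory names look like "exchange=Binance" and "symbol=BTCUSDT".
        exchange = symbol_dir.parent.name.split("=", 1)[1]
        symbol = symbol_dir.name.split("=", 1)[1]
        found.setdefault(exchange, []).append(symbol)
    return found


if __name__ == "__main__":
    print(list_symbols(Path("/path/to/prices_v3.parquet")))
```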

S3 Storage Commands (Native)

The same commands, suffixed with -s3, operate directly on object storage through the DuckDB httpfs extension. No local storage volume is required.

# S3 1. Download
binance-download-price-s3 --symbol BTCUSDT --tframe 1m --bucket my-bucket
# S3 2. Merge shards (Memory optimized via DuckDB)
binance-merge-parquet-s3 --exchange Binance --symbol BTCUSDT --bucket my-bucket
# S3 3. Auto-catchup all tracked symbols
binance-auto-update-s3 --exchange Binance --bucket my-bucket --tframe 1m
# S3 4. View S3 inventory
binance-list-symbols-s3 --bucket my-bucket
# S3 5. Clean / remove
binance-remove-symbol-s3 --symbol BTCUSDT --bucket my-bucket --yes
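
To inspect the stored data yourself, DuckDB can scan Hive-partitioned Parquet on S3 with read_parquet. The sketch below only builds the query string. The helper name and the prefix parameter are hypothetical, and the assumption that the exchange=/symbol= directories sit at the bucket root (or under a prefix you supply) has not been checked against the package.

```python
def parquet_scan_sql(bucket: str, exchange: str, symbol: str, prefix: str = "") -> str:
    """Build a DuckDB query that scans every Parquet part of one symbol."""
    # Assumed layout: s3://<bucket>/<prefix>/exchange=<X>/symbol=<Y>/*.parquet
    parts = [bucket, prefix.strip("/"), f"exchange={exchange}", f"symbol={symbol}", "*.parquet"]
    url = "s3://" + "/".join(part for part in parts if part)
    # Values are interpolated as-is, so pass trusted input only.
    return f"SELECT * FROM read_parquet('{url}', hive_partitioning = true)"


if __name__ == "__main__":
    print(parquet_scan_sql("my-bucket", "Binance", "BTCUSDT"))
```

To run the query you would need the duckdb package, its httpfs extension loaded (INSTALL httpfs; LOAD httpfs;), and S3 credentials and endpoint configured in DuckDB. None of that is shown here.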

Python API

# Market data downloading
import logging
from datetime import datetime

from groai_fi_datastore_shared.Binance import BinanceMarketDataDownloader

# Your logger; a standard logging.Logger is assumed here
logger = logging.getLogger(__name__)

BinanceMarketDataDownloader.catchup_price_binance(
    symbol="BTCUSDT",
    kline_tframe="1m",
    default_download_start_date=datetime(2024, 1, 1),
    price_root_dir="/data/prices_v3.parquet",
    logger=logger,
)

# Reading config from environment
from groai_fi_datastore_shared.Binance.config import BinanceConfig
config = BinanceConfig.from_env()

# Trading client
from groai_fi_datastore_shared.Binance import BinanceClient
client = BinanceClient(config)
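
The "catch-up" idea behind catchup_price_binance can be pictured as choosing the next start time from what is already stored. This is an illustrative sketch, not the package's code: the function name and the interval table are hypothetical, and it assumes that last_stored is the open time of the newest candle on disk.

```python
from datetime import datetime, timedelta
from typing import Optional

# A few Binance kline intervals; extend as needed.
INTERVALS = {
    "1m": timedelta(minutes=1),
    "1h": timedelta(hours=1),
    "1d": timedelta(days=1),
}


def next_start(last_stored: Optional[datetime], default_start: datetime, tframe: str) -> datetime:
    """Return the open time of the first candle that still needs downloading."""
    if last_stored is None:
        # Nothing stored yet: fall back to the configured default start date.
        return default_start
    # Resume one interval after the newest stored candle.
    return last_stored + INTERVALS[tframe]


if __name__ == "__main__":
    print(next_start(None, datetime(2024, 1, 1), "1m"))
    print(next_start(datetime(2024, 1, 1, 0, 5), datetime(2024, 1, 1), "1m"))
```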

Development

# Clone
git clone https://github.com/groai-fi/datastore.shared.git
cd datastore.shared

# Install with dev extras
make install-dev

# Run tests
make test

# Lint
make lint

Publishing

# Build wheel + sdist
make build

# Publish to PyPI (requires UV_PUBLISH_TOKEN in env)
make publish

# Publish to TestPyPI first
make publish-test

To release a new version:

  1. Bump the version field in pyproject.toml
  2. Commit and push
  3. Tag: git tag v0.2.0 && git push origin v0.2.0
  4. GitHub Actions will automatically build and publish to PyPI
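
A common slip in this flow is pushing a tag that does not match the version in pyproject.toml. A small pre-tag check along the following lines can catch it. The helper names are hypothetical, and the regular expression assumes a plain version = "x.y.z" line rather than a dynamically computed version.

```python
import re


def version_from_pyproject(text: str) -> str:
    """Extract the first top-of-line version = "..." value from pyproject text."""
    match = re.search(r'^version\s*=\s*"([^"]+)"', text, flags=re.MULTILINE)
    if match is None:
        raise ValueError("no version field found")
    return match.group(1)


def tag_matches(tag: str, pyproject_text: str) -> bool:
    """True when the tag is the pyproject version prefixed with 'v'."""
    return tag == "v" + version_from_pyproject(pyproject_text)


if __name__ == "__main__":
    sample = '[project]\nname = "example"\nversion = "0.2.0"\n'
    print(tag_matches("v0.2.0", sample))
```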

License

Apache License 2.0


Download files


Source Distribution

groai_fi_datastore_shared-0.2.4.tar.gz (85.3 kB)

Built Distribution


groai_fi_datastore_shared-0.2.4-py3-none-any.whl (105.5 kB)

File details

Details for the file groai_fi_datastore_shared-0.2.4.tar.gz.

File metadata

  • Download URL: groai_fi_datastore_shared-0.2.4.tar.gz
  • Upload date:
  • Size: 85.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: uv/0.11.6 (uv publish, Ubuntu 24.04, CI)

File hashes

Hashes for groai_fi_datastore_shared-0.2.4.tar.gz
Algorithm Hash digest
SHA256 d8fb36e93411e0b2618d26d02827924bb39dfcfbe363ee7687391dcbfc65b10b
MD5 15911ac7a81224408c8807a41b2df8d5
BLAKE2b-256 a0a0744c1c9d27dc44b446d898fa032a45583cf9f712794c91ff3c99d78a7486


File details

Details for the file groai_fi_datastore_shared-0.2.4-py3-none-any.whl.

File metadata

  • Download URL: groai_fi_datastore_shared-0.2.4-py3-none-any.whl
  • Upload date:
  • Size: 105.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: uv/0.11.6 (uv publish, Ubuntu 24.04, CI)

File hashes

Hashes for groai_fi_datastore_shared-0.2.4-py3-none-any.whl
Algorithm Hash digest
SHA256 9b10cb7e2e1ee88382df6e9982fa9c763827f56a3c7b8428620efb6d0b0faa80
MD5 c50c96da866619f678469eaf5fcb4d6f
BLAKE2b-256 a5ed32bac6591ef53243342c7e95cf46817b2c1d8fd8b973bee8720df82022ab
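
A downloaded file can be checked against the SHA256 digests listed above with the standard library. This is a generic sketch; the function name is illustrative and the path in the usage example is a placeholder for wherever you saved the file.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # iter() with a sentinel stops once read() returns empty bytes.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    expected = "9b10cb7e2e1ee88382df6e9982fa9c763827f56a3c7b8428620efb6d0b0faa80"
    wheel = Path("groai_fi_datastore_shared-0.2.4-py3-none-any.whl")
    if wheel.exists():
        print("match" if sha256_of(wheel) == expected else "MISMATCH")
```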

