
Shared datastore utilities for GroAI.fi — Binance market data downloading, parquet storage, and trading infrastructure

groai-fi-datastore-shared

PyPI version Python 3.9+ License: Apache 2.0 CI

Shared datastore utilities for GroAI.fi — Binance market data downloading, partitioned Parquet storage, order execution, and backtesting infrastructure.


Features

  • Market Data Downloading — Incremental Binance OHLCV download with automatic catch-up from where you left off
  • Parquet Storage — Hive-partitioned Parquet files (exchange=X/symbol=Y/part.N.parquet) powered by Dask (local) and DuckDB (S3)
  • File Locking — Safe concurrent writes via .write.lock file with stale-lock detection (Local filesystem only)
  • Order Execution — BinanceOrder and BinanceClient, with spot and margin support
  • Backtesting — BacktestOrder and BacktestOrderData for strategy simulation
  • CLI Suite — 10 distinct CLI commands enabling both isolated local-filesystem and cloud-native S3 workflows
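
To make the storage layout concrete, here is a minimal sketch of how a Hive-style partition path of the form exchange=X/symbol=Y/part.N.parquet can be built and parsed. The helper names partition_path and parse_partition are hypothetical and not part of the package's API; only the path pattern comes from the feature list above.

```python
from pathlib import PurePosixPath


def partition_path(root: str, exchange: str, symbol: str, part: int) -> str:
    """Build root/exchange=X/symbol=Y/part.N.parquet (hypothetical helper)."""
    return str(
        PurePosixPath(root)
        / f"exchange={exchange}"
        / f"symbol={symbol}"
        / f"part.{part}.parquet"
    )


def parse_partition(path: str) -> dict:
    """Extract the key=value partition columns from a Hive-style path."""
    keys = {}
    for segment in PurePosixPath(path).parts:
        if "=" in segment:
            key, _, value = segment.partition("=")
            keys[key] = value
    return keys
```

Because the partition values live in the directory names, a reader that understands Hive partitioning can filter by exchange or symbol without opening files that do not match.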

Installation

# Using pip
pip install groai-fi-datastore-shared

# Using uv (recommended)
uv add groai-fi-datastore-shared

Environment Variables

The following environment variables are used across modules:

Variable                  Required                  Description
BINANCE_API_KEY           Yes (trading/download)    Your Binance API key
BINANCE_API_SECRET        Yes (trading/download)    Your Binance API secret
S3_ENDPOINT_URL           Yes (if S3)               Your object storage endpoint (e.g. https://t3.storageapi.dev)
S3_BUCKET_NAME            Yes (if S3)               The target bucket name
S3_ACCESS_KEY_ID          Yes (if S3)               S3 access key ID
S3_SECRET_ACCESS_KEY      Yes (if S3)               S3 secret access key
BINANCE_API_KEY_TEST      No                        Testnet API key
BINANCE_API_SECRET_TEST   No                        Testnet API secret
SEND_MAIL_RECEIVER        No                        Email address for trade alerts

Create a .env file in your project root and load it with python-dotenv, or export the variables in your shell.
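
A missing variable is easier to diagnose before a download or trade starts than halfway through it. The sketch below checks the variables listed in the table above; the function name missing_variables and the use_s3 switch are assumptions made for illustration, not something the package provides.

```python
from typing import List, Mapping

BINANCE_VARS = ("BINANCE_API_KEY", "BINANCE_API_SECRET")
S3_VARS = (
    "S3_ENDPOINT_URL",
    "S3_BUCKET_NAME",
    "S3_ACCESS_KEY_ID",
    "S3_SECRET_ACCESS_KEY",
)


def missing_variables(env: Mapping[str, str], use_s3: bool = False) -> List[str]:
    """Return the names of required variables that are unset or empty."""
    required = BINANCE_VARS + (S3_VARS if use_s3 else ())
    return [name for name in required if not env.get(name)]
```

A typical call would pass os.environ, after python-dotenv's load_dotenv() has run if a .env file is used, and abort with a clear message when the returned list is not empty.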


CLI Usage

The package exposes parallel pipelines for local filesystems and S3 storage: each local command has an S3 counterpart with the same name plus an -s3 suffix.

Local Storage Commands

Run data engineering workflows directly against a mounted drive, using Dask and PyArrow:

# 1. Download
binance-download-price --symbol BTCUSDT --tframe 1m --path /path/to/prices_v3.parquet
# 2. Merge shards
binance-merge-parquet --exchange Binance --symbol BTCUSDT --path /path/to/prices_v3.parquet --interval_base 1m
# 3. Auto-catchup all tracked symbols
binance-auto-update --exchange Binance --path /path/to/prices_v3.parquet --tframe 1m
# 4. View tracked symbols
binance-list-symbols --path /path/to/prices_v3.parquet
# 5. Clean / remove
binance-remove-symbol --symbol BTCUSDT --path /path/to/prices_v3.parquet --yes
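
The feature list mentions that local writes are protected by a .write.lock file with stale-lock detection. The package's actual locking code is not shown here, so the following is only a sketch of the general technique: atomic lock-file creation plus an age check. The write_lock name, the 600-second staleness threshold, and the timeout values are all assumptions.

```python
import os
import time
from contextlib import contextmanager

STALE_AFTER_SECONDS = 600  # assumed threshold, not the package's real value


@contextmanager
def write_lock(directory, stale_after=STALE_AFTER_SECONDS, timeout=30.0, poll=0.1):
    """Hold directory/.write.lock for the duration of the with-block."""
    lock_path = os.path.join(directory, ".write.lock")
    deadline = time.monotonic() + timeout
    while True:
        try:
            # O_EXCL makes creation atomic: only one process can succeed.
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            break
        except FileExistsError:
            try:
                age = time.time() - os.path.getmtime(lock_path)
            except FileNotFoundError:
                continue  # lock vanished in the meantime; retry at once
            if age > stale_after:
                # The owner presumably crashed: drop the stale lock and retry.
                try:
                    os.remove(lock_path)
                except FileNotFoundError:
                    pass
                continue
            if time.monotonic() > deadline:
                raise TimeoutError(f"could not acquire {lock_path}")
            time.sleep(poll)
    try:
        os.write(fd, str(os.getpid()).encode())  # record the owner for debugging
        yield lock_path
    finally:
        os.close(fd)
        try:
            os.remove(lock_path)
        except FileNotFoundError:
            pass
```

This approach relies on atomic exclusive file creation, which local filesystems provide but object stores generally do not, and that is consistent with the feature being described as local filesystem only. Note that the stale-lock removal in this sketch is itself not race-free if two waiters act at the same instant.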

S3 Storage Commands (Native)

The S3 commands mirror the local ones but operate directly on object storage through the DuckDB httpfs extension. No local storage volume is required.

# S3 1. Download
binance-download-price-s3 --symbol BTCUSDT --tframe 1m --bucket my-bucket
# S3 2. Merge shards (Memory optimized via DuckDB)
binance-merge-parquet-s3 --exchange Binance --symbol BTCUSDT --bucket my-bucket
# S3 3. Auto-catchup all tracked symbols
binance-auto-update-s3 --exchange Binance --bucket my-bucket --tframe 1m
# S3 4. View S3 inventory
binance-list-symbols-s3 --bucket my-bucket
# S3 5. Clean / remove
binance-remove-symbol-s3 --symbol BTCUSDT --bucket my-bucket --yes
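
For readers who want to inspect the S3 data outside the CLI, the sketch below shows how the S3_* environment variables could be translated into DuckDB httpfs settings and a read_parquet query. The function names are hypothetical, and both the path-style URL setting and the key layout inside the bucket (exchange=X/symbol=Y directly under the bucket root) are assumptions rather than documented behavior of the package.

```python
import os
from urllib.parse import urlparse


def duckdb_s3_statements(env=os.environ):
    """Translate the S3_* variables into DuckDB httpfs settings."""
    endpoint = urlparse(env["S3_ENDPOINT_URL"])
    return [
        "INSTALL httpfs",
        "LOAD httpfs",
        # DuckDB expects the endpoint host without the URL scheme.
        f"SET s3_endpoint='{endpoint.netloc}'",
        f"SET s3_use_ssl={'true' if endpoint.scheme == 'https' else 'false'}",
        "SET s3_url_style='path'",  # assumption: common for non-AWS endpoints
        f"SET s3_access_key_id='{env['S3_ACCESS_KEY_ID']}'",
        f"SET s3_secret_access_key='{env['S3_SECRET_ACCESS_KEY']}'",
    ]


def symbol_query(bucket, exchange, symbol):
    """Build a query over one symbol's shards (bucket layout is assumed)."""
    glob = f"s3://{bucket}/exchange={exchange}/symbol={symbol}/*.parquet"
    return f"SELECT * FROM read_parquet('{glob}', hive_partitioning = true)"
```

With the duckdb package installed, each statement can be passed to execute() on a connection from duckdb.connect(), followed by the query itself. Since the credentials are interpolated into SQL text, this is suitable for a quick inspection script rather than production use.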

Python API

# Market data downloading
import logging
from datetime import datetime

from groai_fi_datastore_shared.Binance import BinanceMarketDataDownloader

logger = logging.getLogger(__name__)  # assumes a standard logging.Logger is accepted

BinanceMarketDataDownloader.catchup_price_binance(
    symbol="BTCUSDT",
    kline_tframe="1m",
    default_download_start_date=datetime(2024, 1, 1),
    price_root_dir="/data/prices_v3.parquet",
    logger=logger,
)

# Reading config from environment
from groai_fi_datastore_shared.Binance.config import BinanceConfig
config = BinanceConfig.from_env()

# Trading client
from groai_fi_datastore_shared.Binance import BinanceClient
client = BinanceClient(config)
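
The default_download_start_date argument, together with the "catch-up from where you left off" behavior described under Features, suggests that a download resumes after the last stored candle and falls back to the default date when nothing is stored yet. The sketch below illustrates that resume rule only; next_download_start and the INTERVALS table are hypothetical and do not reflect the library's internals.

```python
from datetime import datetime, timedelta
from typing import Optional

# Candle widths for a few Binance kline intervals.
INTERVALS = {
    "1m": timedelta(minutes=1),
    "5m": timedelta(minutes=5),
    "1h": timedelta(hours=1),
    "1d": timedelta(days=1),
}


def next_download_start(
    last_stored_open: Optional[datetime],
    default_start: datetime,
    tframe: str,
) -> datetime:
    """Return the open time of the first candle that still has to be fetched."""
    if last_stored_open is None:
        return default_start  # nothing on disk yet: start from the default
    return last_stored_open + INTERVALS[tframe]
```

Starting one interval after the last stored open time avoids downloading the final stored candle a second time.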

Development

# Clone
git clone https://github.com/groai-fi/datastore.shared.git
cd datastore.shared

# Install with dev extras
make install-dev

# Run tests
make test

# Lint
make lint

Publishing

# Build wheel + sdist
make build

# Publish to PyPI (requires UV_PUBLISH_TOKEN in env)
make publish

# Publish to TestPyPI first
make publish-test

To release a new version:

  1. Bump the version field in pyproject.toml
  2. Commit and push
  3. Tag: git tag v0.2.0 && git push origin v0.2.0
  4. GitHub Actions will automatically build and publish to PyPI
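
Steps 1 and 3 have to agree: a tag such as v0.2.0 should correspond to version 0.2.0 in pyproject.toml. A small pre-tag check along the following lines can catch a forgotten bump. It is an illustrative sketch, not part of the repository's tooling; the function names are hypothetical and the regular expression assumes the version is written as a plain quoted string at the start of a line.

```python
import re


def project_version(pyproject_text: str) -> str:
    """Read the first line of the form version = "..." from pyproject.toml text."""
    match = re.search(r'^version\s*=\s*"([^"]+)"', pyproject_text, re.MULTILINE)
    if match is None:
        raise ValueError("no version field found")
    return match.group(1)


def tag_matches(tag: str, pyproject_text: str) -> bool:
    """True when a tag 'vX.Y.Z' corresponds to version 'X.Y.Z'."""
    return tag == f"v{project_version(pyproject_text)}"
```

Reading pyproject.toml with pathlib.Path.read_text() and passing the result to tag_matches before running git tag is enough to use it locally.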

License

Apache License 2.0

Download files

Source Distribution

groai_fi_datastore_shared-0.2.2.tar.gz (81.3 kB)

Uploaded Source

Built Distribution

groai_fi_datastore_shared-0.2.2-py3-none-any.whl (100.3 kB)

Uploaded Python 3

File details

Details for the file groai_fi_datastore_shared-0.2.2.tar.gz.

File metadata

  • Download URL: groai_fi_datastore_shared-0.2.2.tar.gz
  • Upload date:
  • Size: 81.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: uv/0.11.6

File hashes

Hashes for groai_fi_datastore_shared-0.2.2.tar.gz
Algorithm Hash digest
SHA256 ff1c800ba08cbcb8ad973e1e9131067208def8ad6095d2419084972de109d2ac
MD5 c0636273b2501dfd34cfa8c76fe31beb
BLAKE2b-256 f73881c827fd02baa5cf8a3ce5e42e50a57a5894b45da1e9bd26ce2593340280

File details

Details for the file groai_fi_datastore_shared-0.2.2-py3-none-any.whl.

File metadata

  • Download URL: groai_fi_datastore_shared-0.2.2-py3-none-any.whl
  • Upload date:
  • Size: 100.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: uv/0.11.6

File hashes

Hashes for groai_fi_datastore_shared-0.2.2-py3-none-any.whl
Algorithm Hash digest
SHA256 fc1ae182151c78a889a8ef9faedfc48649833d871d6423d8369c7cf4ca7dc291
MD5 cb6f628951b93819a0bf36f0e13d3a54
BLAKE2b-256 b1fced958423c24394fcae165d59b185d50ec335a7d73df3f2974b15bf1fb3a0
