groai-fi-datastore-shared

Python 3.9+ | License: Apache 2.0

Shared datastore utilities for GroAI.fi — Binance market data downloading, partitioned Parquet storage, order execution, and backtesting infrastructure.


Features

  • Market Data Downloading — Incremental Binance OHLCV download with automatic catch-up from where you left off
  • Parquet Storage — Hive-partitioned Parquet files (exchange=X/symbol=Y/part.N.parquet) powered by Dask (local) and DuckDB (S3)
  • File Locking — Safe concurrent writes via a .write.lock file with stale-lock detection (local filesystem only)
  • Order Execution — BinanceOrder, BinanceClient with spot and margin support
  • Backtesting — BacktestOrder, BacktestOrderData for strategy simulation
  • CLI Suite — 10 distinct CLI commands enabling both isolated local-filesystem and cloud-native S3 workflows
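
To illustrate the file-locking idea from the list above, here is a minimal sketch of a lock file with stale-lock detection. It is not the package's implementation: the function names, the 600-second threshold, and the use of the file's modification time to judge staleness are all assumptions made for this example. Only the .write.lock file name comes from the feature description.

```python
import os
import time
from pathlib import Path

STALE_AFTER_SECONDS = 600  # hypothetical threshold, not taken from the package


def try_acquire_lock(directory: Path, stale_after: float = STALE_AFTER_SECONDS) -> bool:
    """Try to create <directory>/.write.lock; return True if we now hold it."""
    lock_path = directory / ".write.lock"
    # Stale-lock detection: a lock whose mtime is too old is assumed abandoned.
    try:
        age = time.time() - lock_path.stat().st_mtime
        if age > stale_after:
            lock_path.unlink()
    except FileNotFoundError:
        pass  # no lock present, or another process removed it first
    try:
        # O_CREAT | O_EXCL makes creation atomic on a local filesystem.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as handle:
        handle.write(str(os.getpid()))  # record the owner for debugging
    return True


def release_lock(directory: Path) -> None:
    """Remove the lock file if it exists."""
    (directory / ".write.lock").unlink(missing_ok=True)
```

Atomic exclusive creation is a local-filesystem guarantee and has no direct equivalent on most object stores, which is consistent with the feature being listed as local filesystem only.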

Installation

# Using pip
pip install groai-fi-datastore-shared

# Using uv (recommended)
uv add groai-fi-datastore-shared

Environment Variables

The following environment variables are used across modules:

| Variable | Required | Description |
| --- | --- | --- |
| BINANCE_API_KEY | Yes (trading/download) | Your Binance API key |
| BINANCE_API_SECRET | Yes (trading/download) | Your Binance API secret |
| S3_ENDPOINT_URL | Yes (if S3) | Your object storage endpoint (e.g. https://t3.storageapi.dev) |
| S3_BUCKET_NAME | Yes (if S3) | The target bucket name |
| S3_ACCESS_KEY_ID | Yes (if S3) | S3 access key ID |
| S3_SECRET_ACCESS_KEY | Yes (if S3) | S3 secret access key |
| BINANCE_API_KEY_TEST | No | Testnet API key |
| BINANCE_API_SECRET_TEST | No | Testnet API secret |
| SEND_MAIL_RECEIVER | No | Email address for trade alerts |

Create a .env file in your project root and load it with python-dotenv, or export the variables in your shell.
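
Before running a download or trading job, it can help to fail fast when a required variable is missing. The sketch below uses only the standard library and the variable names documented above; the helper name missing_vars and the two tuples are hypothetical and not part of the package.

```python
import os
from typing import Iterable, List, Mapping, Optional

# Variable names taken from the environment variable list above.
BINANCE_VARS = ("BINANCE_API_KEY", "BINANCE_API_SECRET")
S3_VARS = (
    "S3_ENDPOINT_URL",
    "S3_BUCKET_NAME",
    "S3_ACCESS_KEY_ID",
    "S3_SECRET_ACCESS_KEY",
)


def missing_vars(
    required: Iterable[str], env: Optional[Mapping[str, str]] = None
) -> List[str]:
    """Return the names in `required` that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]
```

A script that uses S3 storage could call missing_vars(BINANCE_VARS + S3_VARS) at startup and exit with a clear message if the returned list is not empty.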


CLI Usage

The package exposes parallel pipelines for local filesystems and S3 storage.

Local Storage Commands

Run data engineering workflows directly against a mounted drive, using Dask and PyArrow:

# 1. Download
binance-download-price --symbol BTCUSDT --tframe 1m --path /path/to/prices_v3.parquet
# 2. Merge shards
binance-merge-parquet --exchange Binance --symbol BTCUSDT --path /path/to/prices_v3.parquet --interval_base 1m
# 3. Auto-catchup all tracked symbols
binance-auto-update --exchange Binance --path /path/to/prices_v3.parquet --tframe 1m
# 4. View tracked symbols
binance-list-symbols --path /path/to/prices_v3.parquet
# 5. Clean / remove
binance-remove-symbol --symbol BTCUSDT --path /path/to/prices_v3.parquet --yes
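
The --path root used in these commands holds the Hive-partitioned layout described under Features (exchange=X/symbol=Y/part.N.parquet). As a rough illustration of that layout, the sketch below builds a partition directory and parses the key=value segments back out of a path. The helper names are hypothetical, and the package may organise its internals differently.

```python
from pathlib import PurePosixPath
from typing import Dict


def partition_dir(root: str, exchange: str, symbol: str) -> PurePosixPath:
    """Build <root>/exchange=<exchange>/symbol=<symbol>."""
    return PurePosixPath(root) / f"exchange={exchange}" / f"symbol={symbol}"


def parse_partitions(path: str) -> Dict[str, str]:
    """Extract key=value segments from a Hive-style path."""
    parts: Dict[str, str] = {}
    for segment in PurePosixPath(path).parts:
        key, sep, value = segment.partition("=")
        if sep and key and value:
            parts[key] = value
    return parts
```

For example, partition_dir("/data/prices_v3.parquet", "Binance", "BTCUSDT") points at the directory where the part.N.parquet shards for that symbol would live under this layout.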

S3 Storage Commands (Native)

Run the same workflow directly against object storage through DuckDB's httpfs extension. No local storage volume is required.

# S3 1. Download
binance-download-price-s3 --symbol BTCUSDT --tframe 1m --bucket my-bucket
# S3 2. Merge shards (Memory optimized via DuckDB)
binance-merge-parquet-s3 --exchange Binance --symbol BTCUSDT --bucket my-bucket
# S3 3. Auto-catchup all tracked symbols
binance-auto-update-s3 --exchange Binance --bucket my-bucket --tframe 1m
# S3 4. View S3 inventory
binance-list-symbols-s3 --bucket my-bucket
# S3 5. Clean / remove
binance-remove-symbol-s3 --symbol BTCUSDT --bucket my-bucket --yes
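
If you want to inspect the stored data yourself, a DuckDB query over a glob of the partition files is one option. The sketch below only builds the URI and the SQL string. It assumes the bucket mirrors the local exchange=X/symbol=Y layout, optionally under a key prefix, which this README does not confirm, and both helper names are hypothetical. read_parquet with hive_partitioning is standard DuckDB SQL.

```python
def s3_partition_glob(bucket: str, exchange: str, symbol: str, prefix: str = "") -> str:
    """Build an s3:// glob for one symbol, assuming a Hive-style key layout."""
    segments = (
        prefix.strip("/"),
        f"exchange={exchange}",
        f"symbol={symbol}",
        "*.parquet",
    )
    key = "/".join(segment for segment in segments if segment)
    return f"s3://{bucket}/{key}"


def read_query(glob_uri: str) -> str:
    """Return a DuckDB query that reads every shard matched by the glob."""
    return f"SELECT * FROM read_parquet('{glob_uri}', hive_partitioning = true)"
```

Running the query requires a DuckDB connection with the httpfs extension loaded and S3 credentials configured; check the actual key layout in your bucket before relying on the glob.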

Python API

# Market data downloading
from groai_fi_datastore_shared.Binance import BinanceMarketDataDownloader
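import logging
from datetime import datetime

# Assumes a standard logging.Logger is accepted; swap in your own logger.
logger = logging.getLogger("datastore")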

BinanceMarketDataDownloader.catchup_price_binance(
    symbol="BTCUSDT",
    kline_tframe="1m",
    default_download_start_date=datetime(2024, 1, 1),
    price_root_dir="/data/prices_v3.parquet",
    logger=logger,
)

# Reading config from environment
from groai_fi_datastore_shared.Binance.config import BinanceConfig
config = BinanceConfig.from_env()

# Trading client
from groai_fi_datastore_shared.Binance import BinanceClient
client = BinanceClient(config)
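
The default_download_start_date argument above suggests how incremental catch-up decides where to begin: resume after the last stored candle if there is one, otherwise fall back to the default. The sketch below shows that decision in isolation. It is an assumption about the general technique, not the package's code, and next_download_start and TFRAME_STEPS are hypothetical names.

```python
from datetime import datetime, timedelta
from typing import Optional

# Candle widths for a few common timeframes (illustrative subset).
TFRAME_STEPS = {
    "1m": timedelta(minutes=1),
    "1h": timedelta(hours=1),
    "1d": timedelta(days=1),
}


def next_download_start(
    last_stored: Optional[datetime],
    default_start: datetime,
    tframe: str = "1m",
) -> datetime:
    """Return the open time of the first candle that still needs downloading."""
    if last_stored is None:
        # Nothing stored yet: start from the configured default date.
        return default_start
    # Resume one candle after the most recent stored one.
    return last_stored + TFRAME_STEPS[tframe]
```

With a last stored 1m candle at 2024-03-01 12:00, the next download would start at 12:01; with no stored data it would start at the default date.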

Development

# Clone
git clone https://github.com/groai-fi/datastore.shared.git
cd datastore.shared

# Install with dev extras
make install-dev

# Run tests
make test

# Lint
make lint

Publishing

# Build wheel + sdist
make build

# Publish to PyPI (requires UV_PUBLISH_TOKEN in env)
make publish

# Publish to TestPyPI first
make publish-test

To release a new version:

  1. Bump the version field in pyproject.toml
  2. Commit and push
  3. Tag: git tag v0.2.0 && git push origin v0.2.0
  4. GitHub Actions will automatically build and publish to PyPI
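
A common mistake in this flow is pushing a tag that does not match the version in pyproject.toml. The sketch below is a small pre-tag check under the assumption that the version is declared as a plain version = "..." line and that tags use a v prefix, as in the example above. The helper names are hypothetical and the repository's CI may already enforce this differently.

```python
import re


def pyproject_version(pyproject_text: str) -> str:
    """Return the first `version = "..."` value found at the start of a line."""
    match = re.search(r'^version\s*=\s*"([^"]+)"', pyproject_text, flags=re.MULTILINE)
    if match is None:
        raise ValueError("no version field found")
    return match.group(1)


def tag_matches(tag: str, pyproject_text: str) -> bool:
    """Check that a git tag such as v0.2.0 matches the declared version."""
    return tag == "v" + pyproject_version(pyproject_text)
```

The regular expression keeps the example dependency-free on Python 3.9; a real TOML parser is more robust if the file contains several version keys.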

License

Apache License 2.0
