# groai-fi-datastore-shared

Shared datastore utilities for GroAI.fi — Binance market data downloading, partitioned Parquet storage, order execution, and backtesting infrastructure.
## Features

- **Market Data Downloading** — incremental Binance OHLCV download with automatic catch-up from where you left off
- **Parquet Storage** — Hive-partitioned Parquet files (`exchange=X/symbol=Y/part.N.parquet`) powered by Dask (local) and DuckDB (S3)
- **File Locking** — safe concurrent writes via a `.write.lockfile` with stale-lock detection (local filesystem only)
- **Order Execution** — `BinanceOrder` and `BinanceClient` with spot and margin support
- **Backtesting** — `BacktestOrder` and `BacktestOrderData` for strategy simulation
- **CLI Suite** — 10 distinct CLI commands covering both local-filesystem and cloud-native S3 workflows
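The Hive-partitioned layout above can be sketched with a small helper. This is illustrative only (the function is not part of the package API); it just shows how the `exchange=X/symbol=Y` directory scheme composes:

```python
from pathlib import PurePosixPath


def partition_dir(root: str, exchange: str, symbol: str) -> PurePosixPath:
    # Hive-style partition directory matching the layout above
    # (helper name is illustrative; it is not part of this package)
    return PurePosixPath(root) / f"exchange={exchange}" / f"symbol={symbol}"


print(partition_dir("/data/prices_v3.parquet", "Binance", "BTCUSDT"))
# → /data/prices_v3.parquet/exchange=Binance/symbol=BTCUSDT
```

Tools such as Dask, DuckDB, and PyArrow recognize this `key=value` directory convention and expose `exchange` and `symbol` as queryable columns.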
## Installation

```bash
# Using pip
pip install groai-fi-datastore-shared

# Using uv (recommended)
uv add groai-fi-datastore-shared
```
## Environment Variables

The following environment variables are used across modules:

| Variable | Required | Description |
|---|---|---|
| `BINANCE_API_KEY` | Yes (trading/download) | Your Binance API key |
| `BINANCE_API_SECRET` | Yes (trading/download) | Your Binance API secret |
| `S3_ENDPOINT_URL` | Yes (if S3) | Your object storage endpoint (e.g. https://t3.storageapi.dev) |
| `S3_BUCKET_NAME` | Yes (if S3) | The target bucket name |
| `S3_ACCESS_KEY_ID` | Yes (if S3) | S3 access key ID |
| `S3_SECRET_ACCESS_KEY` | Yes (if S3) | S3 secret access key |
| `BINANCE_API_KEY_TEST` | No | Testnet API key |
| `BINANCE_API_SECRET_TEST` | No | Testnet API secret |
| `SEND_MAIL_RECEIVER` | No | Email address for trade alerts |

Create a `.env` file in your project root and load it with `python-dotenv`, or export the variables in your shell.
## CLI Usage

The package exposes identical pipelines for local filesystems and S3 storage.
### Local Storage Commands

Run data-engineering workflows directly against a mounted drive via Dask and PyArrow:

```bash
# 1. Download
binance-download-price --symbol BTCUSDT --tframe 1m --path /path/to/prices_v3.parquet

# 2. Merge shards
binance-merge-parquet --exchange Binance --symbol BTCUSDT --path /path/to/prices_v3.parquet --interval_base 1m

# 3. Auto-catchup all tracked symbols
binance-auto-update --exchange Binance --path /path/to/prices_v3.parquet --tframe 1m

# 4. View tracked symbols
binance-list-symbols --path /path/to/prices_v3.parquet

# 5. Clean / remove
binance-remove-symbol --symbol BTCUSDT --path /path/to/prices_v3.parquet --yes
```
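The auto-catchup command is a natural fit for a scheduler. As one possibility, a cron entry could run it hourly (the path and log location here are placeholders, not defaults of the package):

```shell
# crontab entry: run the catch-up job at the top of every hour (example paths)
0 * * * * binance-auto-update --exchange Binance --path /data/prices_v3.parquet --tframe 1m >> /var/log/binance-update.log 2>&1
```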
### S3 Storage Commands (Native)

Run structurally identical commands directly against object storage via DuckDB's HTTPFS extension; no local storage volume is required.

```bash
# S3 1. Download
binance-download-price-s3 --symbol BTCUSDT --tframe 1m --bucket my-bucket

# S3 2. Merge shards (memory-optimized via DuckDB)
binance-merge-parquet-s3 --exchange Binance --symbol BTCUSDT --bucket my-bucket

# S3 3. Auto-catchup all tracked symbols
binance-auto-update-s3 --exchange Binance --bucket my-bucket --tframe 1m

# S3 4. View S3 inventory
binance-list-symbols-s3 --bucket my-bucket

# S3 5. Clean / remove
binance-remove-symbol-s3 --symbol BTCUSDT --bucket my-bucket --yes
```
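The S3 commands read credentials from the environment variables listed above. An example shell session (all values are placeholders):

```shell
# Example environment for the S3 commands (placeholder values)
export S3_ENDPOINT_URL="https://t3.storageapi.dev"
export S3_BUCKET_NAME="my-bucket"
export S3_ACCESS_KEY_ID="..."
export S3_SECRET_ACCESS_KEY="..."
```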
## Python API

```python
# Market data downloading
from datetime import datetime

from groai_fi_datastore_shared.Binance import BinanceMarketDataDownloader

logger = ...  # your logger

BinanceMarketDataDownloader.catchup_price_binance(
    symbol="BTCUSDT",
    kline_tframe="1m",
    default_download_start_date=datetime(2024, 1, 1),
    price_root_dir="/data/prices_v3.parquet",
    logger=logger,
)

# Reading config from environment
from groai_fi_datastore_shared.Binance.config import BinanceConfig

config = BinanceConfig.from_env()

# Trading client
from groai_fi_datastore_shared.Binance import BinanceClient

client = BinanceClient(config)
```
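The `kline_tframe` strings above follow Binance's kline interval notation (`1m`, `4h`, `1d`, ...). As an aside, an illustrative stdlib helper (not part of this package) converting such a string to milliseconds, the unit Binance uses for kline open times:

```python
# Milliseconds per unit of Binance's kline interval suffixes
UNIT_MS = {"m": 60_000, "h": 3_600_000, "d": 86_400_000, "w": 604_800_000}


def tframe_to_ms(tframe: str) -> int:
    """Convert a kline timeframe like '1m' or '4h' to milliseconds.

    Illustrative helper only; it is not part of the package API.
    """
    value, unit = int(tframe[:-1]), tframe[-1]
    return value * UNIT_MS[unit]


print(tframe_to_ms("1m"))  # → 60000
print(tframe_to_ms("4h"))  # → 14400000
```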
## Development

```bash
# Clone
git clone https://github.com/groai-fi/datastore.shared.git
cd datastore.shared

# Install with dev extras
make install-dev

# Run tests
make test

# Lint
make lint
```
## Publishing

```bash
# Build wheel + sdist
make build

# Publish to PyPI (requires UV_PUBLISH_TOKEN in env)
make publish

# Publish to TestPyPI first
make publish-test
```

To release a new version:

- Bump the `version` field in `pyproject.toml`
- Commit and push
- Tag: `git tag v0.2.0 && git push origin v0.2.0`
- GitHub Actions will automatically build and publish to PyPI
## License