Gapless Crypto Data
Cryptocurrency OHLCV data collection with gap-free guarantee. Retrieves microstructure-enriched kline data from Binance Public Data Repository with automatic gap detection and filling.
Installation
# UV (recommended)
uv add gapless-crypto-data
# pip
pip install gapless-crypto-data
Quick Start
import gapless_crypto_data as gcd
# Fetch historical data
df = gcd.download("BTCUSDT", timeframe="1h", start="2024-01-01", end="2024-06-30")
# Fetch recent data with limit
df = gcd.fetch_data("ETHUSDT", timeframe="4h", limit=1000)
# Get available symbols and timeframes
symbols = gcd.get_supported_symbols()
timeframes = gcd.get_supported_timeframes()
# Fill gaps in existing data directory
results = gcd.fill_gaps("./data")
Data Format
Returns pandas DataFrames with microstructure columns:
| Column | Type | Description |
|---|---|---|
| date | datetime64 | Period open timestamp |
| open, high, low, close | float64 | OHLC prices |
| volume | float64 | Base asset volume |
| close_time | datetime64 | Period close timestamp |
| quote_asset_volume | float64 | Quote asset volume |
| number_of_trades | int64 | Trade count |
| taker_buy_base_asset_volume | float64 | Taker buy volume (base) |
| taker_buy_quote_asset_volume | float64 | Taker buy volume (quote) |
See Data Format Specification for column semantics and constraints.
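The columns above suggest a few row-level sanity checks. The sketch below is illustrative only: `check_row` is a hypothetical helper, not part of the package, and the constraints it encodes (high/low bracketing open and close, taker buy volume bounded by total volume) are assumptions drawn from the usual meaning of these fields, not from the Data Format Specification.

```python
from datetime import datetime, timedelta


def check_row(row: dict) -> list[str]:
    """Return human-readable problems found in one kline row (hypothetical helper)."""
    problems = []
    # Assumed constraint: high and low bracket both open and close.
    if row["high"] < max(row["open"], row["close"]):
        problems.append("high below open/close")
    if row["low"] > min(row["open"], row["close"]):
        problems.append("low above open/close")
    # Assumed constraint: the period closes after it opens.
    if row["close_time"] <= row["date"]:
        problems.append("close_time not after date")
    # Assumed constraint: taker buys are a subset of total base volume.
    if not 0 <= row["taker_buy_base_asset_volume"] <= row["volume"]:
        problems.append("taker buy volume outside [0, volume]")
    if row["number_of_trades"] < 0:
        problems.append("negative trade count")
    return problems


opened = datetime(2024, 1, 1)
good = {
    "date": opened,
    "open": 42000.0, "high": 42500.0, "low": 41800.0, "close": 42300.0,
    "volume": 100.0,
    "close_time": opened + timedelta(hours=1) - timedelta(milliseconds=1),
    "quote_asset_volume": 4_200_000.0,
    "number_of_trades": 5000,
    "taker_buy_base_asset_volume": 55.0,
    "taker_buy_quote_asset_volume": 2_300_000.0,
}
bad = dict(good, high=41000.0)  # high is now below both open and close

print(check_row(good))
print(check_row(bad))
```

With a real DataFrame, rows in this shape can be obtained via `df.to_dict("records")`.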
Supported Timeframes
All Binance spot kline intervals. Query dynamically:
import gapless_crypto_data as gcd
print(gcd.get_supported_timeframes())
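For downstream arithmetic such as expected row counts or gap checks, it can help to convert an interval string into a duration. The following is a minimal sketch, assuming interval strings follow the Binance convention of an integer count plus a unit suffix (`s`, `m`, `h`, `d`, `w`). `interval_to_timedelta` is a hypothetical helper, not a package function, and it deliberately rejects the monthly interval (`1M`) because a month has no fixed length.

```python
from datetime import timedelta

# Seconds per unit; suffixes are case-sensitive ("m" is minutes, "M" is months).
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800}


def interval_to_timedelta(interval: str) -> timedelta:
    """Convert an interval string such as "4h" into a fixed-length timedelta."""
    count, unit = int(interval[:-1]), interval[-1]
    if unit not in _UNIT_SECONDS:
        raise ValueError(f"no fixed duration for interval {interval!r}")
    return timedelta(seconds=count * _UNIT_SECONDS[unit])


print(interval_to_timedelta("4h"))
print(interval_to_timedelta("15m"))
```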
API Reference
Function-based API
import gapless_crypto_data as gcd
# Primary collection functions
df = gcd.download(symbol, timeframe, start, end)
df = gcd.fetch_data(symbol, timeframe, limit=None, start=None, end=None)
# Gap filling
results = gcd.fill_gaps(directory, symbols=None)
# Discovery
symbols = gcd.get_supported_symbols()
timeframes = gcd.get_supported_timeframes()
Class-based API
from gapless_crypto_data import BinancePublicDataCollector, UniversalGapFiller
# Data collection with full control
collector = BinancePublicDataCollector(
    symbol="BTCUSDT",
    start_date="2024-01-01",
    end_date="2024-12-31",
)
result = collector.collect_timeframe_data("1h")
df = result["dataframe"]
# Gap detection and filling
gap_filler = UniversalGapFiller()
gaps = gap_filler.detect_all_gaps(csv_file, timeframe)
result = gap_filler.process_file(csv_file, timeframe)
Full API documentation: Python API Reference
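To make the idea of gap detection concrete, here is a minimal sketch that scans sorted period-open timestamps for missing candles. It is not `UniversalGapFiller`'s implementation; `find_gaps` is a hypothetical function, and it assumes the timestamps are already sorted and aligned to a fixed interval.

```python
from datetime import datetime, timedelta


def find_gaps(opens: list[datetime], step: timedelta) -> list[tuple[datetime, datetime]]:
    """Return (first_missing, last_missing) open times for each gap in sorted input."""
    gaps = []
    for prev, cur in zip(opens, opens[1:]):
        # Consecutive candles should be exactly one step apart.
        if cur - prev > step:
            gaps.append((prev + step, cur - step))
    return gaps


day = datetime(2024, 1, 1)
# Hourly candles with 02:00 and 03:00 missing.
opens = [day + timedelta(hours=h) for h in (0, 1, 4, 5)]
gaps = find_gaps(opens, timedelta(hours=1))
print(gaps)
```

Each returned pair bounds a contiguous run of missing candles, which is the range a filler would then request from the API.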
Data Sources
| Source | Method | Use Case |
|---|---|---|
| Binance Public Data Repository | Monthly/daily ZIP archives | Historical bulk collection |
| Binance REST API | Per-request klines | Gap filling, recent data |
Collection strategy: Repository archives for bulk historical data, API for gaps and recent periods. See Data Collection Guide.
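As an illustration of the bulk half of this strategy, the sketch below enumerates the monthly archives that would cover a date range. The URL layout is an assumption based on how the Binance Public Data Repository appears to organise spot klines, so verify it before relying on it. `monthly_archive_urls` is a hypothetical helper and does not reflect how the collector actually plans its downloads.

```python
from datetime import date

# Assumed base path for monthly spot kline archives.
BASE_URL = "https://data.binance.vision/data/spot/monthly/klines"


def monthly_archive_urls(symbol: str, timeframe: str, start: date, end: date) -> list[str]:
    """List one archive URL per calendar month touched by [start, end]."""
    urls = []
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        name = f"{symbol}-{timeframe}-{year}-{month:02d}.zip"
        urls.append(f"{BASE_URL}/{symbol}/{timeframe}/{name}")
        # Advance one month, rolling over into the next year after December.
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)
    return urls


urls = monthly_archive_urls("BTCUSDT", "1h", date(2023, 11, 15), date(2024, 2, 1))
for url in urls:
    print(url)
```

Months that are not yet published as archives, and any gaps inside published ones, are the cases the table assigns to the REST API.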
AI Agent Integration
Programmatic discovery via __probe__ module:
import gapless_crypto_data
probe = gapless_crypto_data.__probe__
# API discovery
probe.discover_api()
probe.get_capabilities()
probe.get_task_graph()
See Probe Usage for AI agent integration patterns.
Development
Setup
git clone https://github.com/terrylica/gapless-crypto-data.git
cd gapless-crypto-data
uv venv && source .venv/bin/activate
uv sync --dev
uv run pre-commit install
Commands
| Task | Command |
|---|---|
| Run tests | uv run pytest |
| Format | uv run ruff format . |
| Lint | uv run ruff check --fix . |
| Type check | uv run mypy src/ |
| Build | uv build |
Project Structure
src/gapless_crypto_data/
├── __init__.py # Package exports
├── api.py # Function-based API
├── __probe__.py # AI agent discovery
├── collectors/ # Data collection
├── gap_filling/ # Gap detection/filling
└── validation/ # Data validation
Full development guide: Development Setup
Architecture
- BinancePublicDataCollector: Bulk data retrieval from public repository
- UniversalGapFiller: Gap detection and API-based filling
- AtomicCSVOperations: Corruption-proof file operations
- ValidationStorage: DuckDB-backed validation persistence
Architecture documentation: Overview
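One common way to keep a CSV from being left half-written is to write to a temporary file in the same directory and then rename it over the target. The sketch below shows that general pattern with the standard library only; it is not the `AtomicCSVOperations` implementation, and `atomic_write_csv` is a hypothetical name.

```python
import csv
import os
import tempfile


def atomic_write_csv(path: str, header: list, rows: list) -> None:
    """Write a CSV so readers see either the old file or the complete new one."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must live on the same filesystem for the rename to be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", newline="") as handle:
            writer = csv.writer(handle)
            writer.writerow(header)
            writer.writerows(rows)
            handle.flush()
            os.fsync(handle.fileno())  # push contents to disk before renaming
        os.replace(tmp_path, path)  # replaces the target in a single step
    except BaseException:
        # Clean up the partial temp file and leave the original untouched.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


workdir = tempfile.mkdtemp()
target = os.path.join(workdir, "BTCUSDT-1h.csv")
atomic_write_csv(target, ["date", "close"], [["2024-01-01", 42000.0]])
with open(target, newline="") as handle:
    written = list(csv.reader(handle))
print(written)
```

If the process dies mid-write, only the temporary file is affected, so an existing data file is never truncated.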
License
MIT License - see LICENSE