Gapless Crypto Data

Cryptocurrency OHLCV data collection with gap-free guarantee. Retrieves microstructure-enriched kline data from Binance Public Data Repository with automatic gap detection and filling.

Installation

# uv (recommended)
uv add gapless-crypto-data

# pip
pip install gapless-crypto-data

Quick Start

import gapless_crypto_data as gcd

# Fetch historical data
df = gcd.download("BTCUSDT", timeframe="1h", start="2024-01-01", end="2024-06-30")

# Fetch recent data with limit
df = gcd.fetch_data("ETHUSDT", timeframe="4h", limit=1000)

# Get available symbols and timeframes
symbols = gcd.get_supported_symbols()
timeframes = gcd.get_supported_timeframes()

# Fill gaps in existing data directory
results = gcd.fill_gaps("./data")

Data Format

The collection functions return pandas DataFrames with the following columns, including microstructure fields:

Column                          Type        Description
date                            datetime64  Period open timestamp
open, high, low, close          float64     OHLC prices
volume                          float64     Base asset volume
close_time                      datetime64  Period close timestamp
quote_asset_volume              float64     Quote asset volume
number_of_trades                int64       Trade count
taker_buy_base_asset_volume     float64     Taker buy volume (base)
taker_buy_quote_asset_volume    float64     Taker buy volume (quote)

See Data Format Specification for column semantics and constraints.
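The microstructure columns above allow a few per-bar metrics to be derived with simple arithmetic. The sketch below is illustrative only and is not part of the package: `bar_metrics` is a hypothetical helper, and it works on one row given as a plain mapping (for example, one element of `df.to_dict("records")`). It relies on the usual kline semantics, where quote volume divided by base volume gives the volume-weighted average price.

```python
from typing import Mapping

NAN = float("nan")


def bar_metrics(bar: Mapping[str, float]) -> dict[str, float]:
    """Derive simple metrics from one kline row (hypothetical helper)."""
    volume = bar["volume"]
    trades = bar["number_of_trades"]
    if volume == 0:
        # No traded volume in the period, so every ratio is undefined.
        return {"vwap": NAN, "taker_buy_ratio": NAN, "avg_trade_size": NAN}
    return {
        # Quote volume / base volume = volume-weighted average price.
        "vwap": bar["quote_asset_volume"] / volume,
        # Share of base volume bought by takers (aggressive buyers).
        "taker_buy_ratio": bar["taker_buy_base_asset_volume"] / volume,
        # Average base-asset size per trade.
        "avg_trade_size": volume / trades if trades else NAN,
    }


# Made-up example values, not real market data.
example_bar = {
    "volume": 10.0,
    "quote_asset_volume": 425000.0,
    "number_of_trades": 200,
    "taker_buy_base_asset_volume": 6.0,
    "taker_buy_quote_asset_volume": 255600.0,
}
metrics = bar_metrics(example_bar)
```

The same divisions can be applied column-wise on the DataFrame itself; the zero-volume guard then needs a vectorised equivalent.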

Supported Timeframes

All Binance spot kline intervals are supported. Query the list at runtime:

import gapless_crypto_data as gcd
print(gcd.get_supported_timeframes())
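When working with the returned timeframes it is often useful to know the duration each one represents. The sketch below assumes the strings follow the `<number><unit>` notation seen in the Quick Start (`"1h"`, `"4h"`); that format and the helper `interval_to_timedelta` are assumptions for illustration, not part of the package API.

```python
from datetime import timedelta

# Seconds per unit for fixed-length interval suffixes.
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800}


def interval_to_timedelta(interval: str) -> timedelta:
    """Convert an interval string such as "15m" or "4h" to a timedelta."""
    if len(interval) < 2:
        raise ValueError(f"unrecognised interval: {interval!r}")
    count, unit = interval[:-1], interval[-1]
    if unit == "M":
        # Calendar months vary in length, so there is no single duration.
        raise ValueError("month intervals have no fixed duration")
    if unit not in _UNIT_SECONDS or not count.isdigit() or int(count) == 0:
        raise ValueError(f"unrecognised interval: {interval!r}")
    return timedelta(seconds=int(count) * _UNIT_SECONDS[unit])


one_hour = interval_to_timedelta("1h")
quarter_hour = interval_to_timedelta("15m")
```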

API Reference

Function-based API

import gapless_crypto_data as gcd

# Primary collection functions
df = gcd.download(symbol, timeframe, start, end)
df = gcd.fetch_data(symbol, timeframe, limit=None, start=None, end=None)

# Gap filling
results = gcd.fill_gaps(directory, symbols=None)

# Discovery
symbols = gcd.get_supported_symbols()
timeframes = gcd.get_supported_timeframes()

Class-based API

from gapless_crypto_data import BinancePublicDataCollector, UniversalGapFiller

# Data collection with full control
collector = BinancePublicDataCollector(
    symbol="BTCUSDT",
    start_date="2024-01-01",
    end_date="2024-12-31"
)
result = collector.collect_timeframe_data("1h")
df = result["dataframe"]

# Gap detection and filling
# csv_file: path to an existing CSV of collected klines; timeframe: its interval, e.g. "1h"
gap_filler = UniversalGapFiller()
gaps = gap_filler.detect_all_gaps(csv_file, timeframe)
result = gap_filler.process_file(csv_file, timeframe)

Full API documentation: Python API Reference
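To make the idea of gap detection concrete, here is a minimal sketch of the general technique: sort the period-open timestamps and flag any pair of neighbours that are more than one interval apart. It is not the `UniversalGapFiller` implementation, and `find_gaps` is a hypothetical name used only for this example.

```python
from datetime import datetime, timedelta


def find_gaps(timestamps, step):
    """Return (first_missing, last_missing) pairs for runs of missing bars."""
    ordered = sorted(timestamps)
    gaps = []
    for prev, curr in zip(ordered, ordered[1:]):
        if curr - prev > step:
            # Bars from prev + step up to curr - step are absent.
            gaps.append((prev + step, curr - step))
    return gaps


# Hourly bars with 03:00 and 04:00 missing (made-up example data).
hour = timedelta(hours=1)
opens = [datetime(2024, 1, 1, h) for h in (0, 1, 2, 5, 6)]
gaps = find_gaps(opens, hour)
```

With a DataFrame, the `date` column can be passed in directly, since the function only needs an iterable of comparable timestamps.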

Data Sources

Source                            Method                       Use Case
Binance Public Data Repository    Monthly/daily ZIP archives   Historical bulk collection
Binance REST API                  Per-request klines           Gap filling, recent data

Collection strategy: repository archives cover bulk historical data, and the REST API covers gaps and recent periods. See Data Collection Guide.
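As a rough illustration of the bulk path, the sketch below lists the monthly archive URLs that cover a date range. The URL layout reflects the publicly documented structure of the Binance data site as I understand it; it is an assumption here, not taken from this package's source, and `monthly_archive_urls` is a hypothetical helper. No network requests are made.

```python
from datetime import date

# Assumed base path for spot kline monthly archives.
BASE_URL = "https://data.binance.vision/data/spot/monthly/klines"


def monthly_archive_urls(symbol: str, interval: str, start: date, end: date) -> list[str]:
    """Build one archive URL per calendar month from start to end, inclusive."""
    urls = []
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        name = f"{symbol}-{interval}-{year}-{month:02d}.zip"
        urls.append(f"{BASE_URL}/{symbol}/{interval}/{name}")
        # Advance to the next calendar month, rolling over the year.
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return urls


urls = monthly_archive_urls("BTCUSDT", "1h", date(2023, 12, 15), date(2024, 2, 1))
```

Months that are not yet published as archives would need the daily archives or the REST API instead, which matches the split described in the table above.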

AI Agent Integration

Programmatic discovery via __probe__ module:

import gapless_crypto_data
probe = gapless_crypto_data.__probe__

# API discovery
probe.discover_api()
probe.get_capabilities()
probe.get_task_graph()

See Probe Usage for AI agent integration patterns.

Development

Setup

git clone https://github.com/terrylica/gapless-crypto-data.git
cd gapless-crypto-data
uv venv && source .venv/bin/activate
uv sync --dev
uv run pre-commit install

Commands

Task          Command
Run tests     uv run pytest
Format        uv run ruff format .
Lint          uv run ruff check --fix .
Type check    uv run mypy src/
Build         uv build

Project Structure

src/gapless_crypto_data/
├── __init__.py          # Package exports
├── api.py               # Function-based API
├── __probe__.py         # AI agent discovery
├── collectors/          # Data collection
├── gap_filling/         # Gap detection/filling
└── validation/          # Data validation

Full development guide: Development Setup

Architecture

  • BinancePublicDataCollector: Bulk data retrieval from public repository
  • UniversalGapFiller: Gap detection and API-based filling
  • AtomicCSVOperations: Corruption-proof file operations
  • ValidationStorage: DuckDB-backed validation persistence

Architecture documentation: Overview
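For readers unfamiliar with atomic file operations, a common pattern is to write to a temporary file in the same directory and then rename it over the target, so a crash never leaves a half-written CSV. The sketch below shows that general pattern with the standard library; it is not the `AtomicCSVOperations` implementation, and `write_csv_atomically` is a hypothetical name.

```python
import csv
import os
import tempfile
from pathlib import Path


def write_csv_atomically(path, header, rows):
    """Write a CSV so readers see either the old file or the complete new one."""
    path = Path(path)
    # The temp file must live on the same filesystem for the rename to be atomic.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, prefix=path.name, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", newline="") as handle:
            writer = csv.writer(handle)
            writer.writerow(header)
            writer.writerows(rows)
            handle.flush()
            os.fsync(handle.fileno())  # push data to disk before renaming
        os.replace(tmp_name, path)  # atomic swap into place
    except BaseException:
        # Clean up the partial temp file and leave the original untouched.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise


# Usage with a throwaway directory and made-up values.
with tempfile.TemporaryDirectory() as workdir:
    target = Path(workdir) / "example.csv"
    write_csv_atomically(target, ["date", "close"], [["2024-01-01 00:00:00", 42500.0]])
    written = target.read_text().splitlines()
    leftovers = [p.name for p in Path(workdir).iterdir() if p.name != "example.csv"]
```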

License

MIT License - see LICENSE
