Python library for acquiring, storing, transforming, and validating market data
Data Layer (qldata)
Data Layer (qldata) is a high-performance, production-grade Python library designed for acquiring, storing, transforming, and validating financial market data. It provides a unified interface for interacting with various data sources, including crypto exchanges (Binance, Bybit) and local storage (DuckDB, Parquet).
🚀 Key Features
- Unified Data Interface: Seamlessly switch between live exchange feeds and historical data.
- High Performance: Built on `pandas`, `numpy`, and `pyarrow` for efficient data manipulation.
- Storage Optimized: Integrated with `DuckDB` for fast, analytical SQL queries on large datasets.
- Exchange Support: Native adapters for Binance and Bybit with shared rate limiting and retry logic.
- Automatic Chunking: Transparently handles multi-year historical data requests by auto-splitting them into optimal chunks.
- Smart Error Handling: Specific exception types (`RateLimitError`, `NetworkError`, `ServerError`) with automatic retry mechanisms.
- Metadata Tracking: Automatic tracking of dataset freshness, coverage, and quality for smart caching.
- Live Streaming: Robust WebSocket support with automatic reconnection and error handling.
- Type Safe: Fully typed codebase using modern Python type hints.
- Production Ready: Comprehensive error handling, logging, and retry mechanisms via `tenacity`.
🛠️ Installation
Requires Python 3.10+.

```shell
pip install qldata
```

For a minimal install (core data structures only):

```shell
pip install qldata[minimal]
```

For development dependencies:

```shell
pip install qldata[dev]
```
⚡ Quick Start
Fetching Historical Data
The primary entry point for historical data is `qd.data()`.

```python
import qldata as qd

# Fetch the last 30 one-hour klines for BTCUSDT from Binance
df = qd.data("BTCUSDT", source="binance").last(30).resolution("1h").get()
print(df.head())

# Fetch multi-year data - automatically chunked!
# This works seamlessly even for 2+ years of 1-minute data (>1M bars)
df_long = qd.data("BTCUSDT", source="binance").between("2023-01-01", "2025-01-01").resolution("1m").get()
print(f"Fetched {len(df_long)} bars")
```
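qldata performs this chunking internally; the idea can be sketched in plain Python (the 30-day chunk size below is an illustrative assumption, not qldata's actual setting):

```python
from datetime import datetime, timedelta

def split_range(start: datetime, end: datetime, chunk: timedelta):
    """Split [start, end) into consecutive sub-ranges no longer than `chunk`."""
    bounds = []
    cursor = start
    while cursor < end:
        nxt = min(cursor + chunk, end)
        bounds.append((cursor, nxt))
        cursor = nxt
    return bounds

# Two years (731 days, 2024 is a leap year) in 30-day chunks
chunks = split_range(datetime(2023, 1, 1), datetime(2025, 1, 1), timedelta(days=30))
print(len(chunks))  # 25
```

Each chunk becomes one bounded API request, keeping individual calls within exchange limits while the results are concatenated transparently.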
Loading from Local Storage
Local stores use the `source="local"` alias and work with naive timestamps for convenience.

```python
import qldata as qd

# Point storage at a directory (Parquet by default)
qd.config(data_dir="./data", store_type="parquet")

# Load previously stored bars
local_df = qd.data("BTCUSDT", source="local").resolution("1h").last(48).get()
print(local_df.tail())
```
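The naive-timestamp convention matters when mixing local data with exchange data, which typically arrives UTC-aware. One way to normalize before writing your own files (a pandas sketch, independent of qldata's store API):

```python
import pandas as pd

# Exchange data usually carries a UTC timezone; local stores here expect naive timestamps
idx = pd.date_range("2025-01-01", periods=3, freq="h", tz="UTC")
df = pd.DataFrame({"close": [100.0, 101.0, 102.0]}, index=idx)

# Drop the timezone info, keeping the same wall-clock values
df.index = df.index.tz_localize(None)
print(df.index.tz)  # None
```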
Working with Metadata
Check dataset information and freshness:

```python
from qldata.stores.files import ParquetStore

store = ParquetStore("./data")

# List all tracked datasets
for meta in store.list_metadata():
    print(f"{meta.symbol} ({meta.timeframe}): {meta.record_count} bars")
    print(f"  Range: {meta.first_timestamp} to {meta.last_timestamp}")
    print(f"  Stale: {meta.is_stale(max_age_hours=24)}")
```
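The staleness logic is simple to reason about: a dataset is stale when its newest record is older than the allowed age. A simplified stand-in for the `is_stale` check above:

```python
from datetime import datetime, timedelta, timezone

def is_stale(last_timestamp, max_age_hours, now=None):
    """Return True when the newest record is older than the allowed age."""
    now = now or datetime.now(timezone.utc)
    return now - last_timestamp > timedelta(hours=max_age_hours)

# With a fixed "now", a dataset last updated exactly 24h ago is not yet stale
fixed_now = datetime(2025, 1, 2, tzinfo=timezone.utc)
print(is_stale(datetime(2025, 1, 1, tzinfo=timezone.utc), 24, now=fixed_now))  # False
print(is_stale(datetime(2025, 1, 1, tzinfo=timezone.utc), 12, now=fixed_now))  # True
```

A cache layer can use this to decide whether to serve stored bars or refresh from the exchange first.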
Handling Errors
```python
import qldata as qd
from qldata.errors import NetworkError, RateLimitError

try:
    df = qd.data("BTCUSDT", source="binance").last(100).resolution("1m").get()
except RateLimitError:
    print("Rate limited - automatic retries were exhausted")
except NetworkError as e:
    print(f"Network issue: {e}")
```
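The README notes that retries are handled via `tenacity`; the underlying pattern - exponential backoff on rate limits, re-raising once attempts are exhausted - looks roughly like this hand-rolled sketch with a stand-in exception class:

```python
import time

class RateLimitError(Exception):
    """Stand-in for qldata.errors.RateLimitError."""

def fetch_with_retry(fetch, max_attempts=3, base_delay=0.5):
    # Retry with exponential backoff: base_delay, 2*base_delay, 4*base_delay, ...
    for attempt in range(max_attempts):
        try:
            return fetch()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # Retries exhausted - surface the error to the caller
            time.sleep(base_delay * 2 ** attempt)

# Simulated flaky source: fails twice, then succeeds
calls = []
def flaky():
    calls.append(1)
    if len(calls) < 3:
        raise RateLimitError("429 Too Many Requests")
    return "ok"

print(fetch_with_retry(flaky, base_delay=0.01))  # ok
```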
Streaming Live Data
For real-time data, use `qd.stream()`.

```python
import asyncio

import qldata as qd

async def handler(msg):
    print(msg)

# Stream live ticks
stream = qd.stream(["BTCUSDT"], source="binance").resolution("tick").on_data(handler).get()

# Note: in a real async application, you would await the stream session:
# await stream.start()
```
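Since the handler is a coroutine, the session ultimately runs inside an asyncio event loop. The shape of that pattern, with a simulated feed standing in for the exchange connection:

```python
import asyncio

received = []

async def handler(msg):
    received.append(msg)
    print(msg)

async def fake_feed(on_data, n=3):
    # Simulated stand-in for a WebSocket session delivering tick messages
    for i in range(n):
        await on_data({"symbol": "BTCUSDT", "price": 50_000 + i})
        await asyncio.sleep(0)  # Yield control, as a real socket read would

asyncio.run(fake_feed(handler))
```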
🏗️ Architecture
`qldata` is built with a modular architecture:
- Core Models: Fundamental data structures and types.
- Adapters: Exchange-specific broker adapters (`qldata/adapters/brokers/*.py`) that share rate limiters and clients.
- Stores: Persistence layer for files/DBs with metadata sidecars and deduplication.
- API/Queries: `qd.data()` / `qd.stream()` builders that route through adapters or local stores.
- Resilience/Transforms: Retry, chunking, validation, and cleaning utilities used by adapters and queries.
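The fluent `qd.data(...).last(...).resolution(...).get()` style seen above is a builder pattern: each call records a constraint and returns the builder, and `get()` routes the finished query to an adapter or store. A toy illustration (not qldata's actual classes):

```python
class Query:
    """Toy query builder illustrating the chained-call style."""

    def __init__(self, symbol, source):
        self.symbol, self.source = symbol, source
        self.bars, self.tf = None, None

    def last(self, n):
        self.bars = n
        return self  # Returning self enables chaining

    def resolution(self, tf):
        self.tf = tf
        return self

    def get(self):
        # A real implementation would route to an adapter or local store here
        return f"{self.symbol}@{self.source}: {self.bars} x {self.tf}"

print(Query("BTCUSDT", "binance").last(30).resolution("1h").get())
# BTCUSDT@binance: 30 x 1h
```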
🤝 Development
This is an internal project. Please follow the guidelines below for development.
- Environment: Ensure you are using the correct Python version (3.10+).
- Testing: Run `pytest` before pushing any changes.
- Linting: Use `ruff` and `black` to maintain code quality.
- Documentation: Update docstrings and `mkdocs` files when modifying APIs.

See the Developer Guide for more details.
File details

Details for the file qldata-0.2.0.tar.gz (source distribution).
- Download URL: qldata-0.2.0.tar.gz
- Size: 85.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.0

File hashes

| Algorithm | Hash digest |
|---|---|
| SHA256 | `f773bc343e78bb89224ef84d0542ba9b37f7bc95512d8851d439db9a7d409b01` |
| MD5 | `9cabe5fb8fc5a2c16a30c05b871840f3` |
| BLAKE2b-256 | `c6d93b9c6fa1581f35ab55bf9abd7abce601317dd37bf4adafe45b66ed0ef1d4` |
File details

Details for the file qldata-0.2.0-py3-none-any.whl (built distribution).
- Download URL: qldata-0.2.0-py3-none-any.whl
- Size: 110.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.0

File hashes

| Algorithm | Hash digest |
|---|---|
| SHA256 | `0d2172060701fe642f9faa922d5090299c6c2d613f06baa67c90826b10a2153d` |
| MD5 | `a972f13fb7e2b7e464568e542089572e` |
| BLAKE2b-256 | `717263e017a06ea4f04a94b6dd03b660b849a9ddd8dca94412ab015a7ff7125f` |