High-performance library for fetching, storing, and streaming historical cryptocurrency market data

wickdata

High-performance Python library for fetching, storing, and streaming historical cryptocurrency market data.

Features

  • 📊 Historical Data Fetching - Fetch OHLCV candle data from 100+ cryptocurrency exchanges
  • 💾 Efficient Storage - SQLite database with automatic deduplication and indexing
  • 🚀 Streaming Capabilities - Memory-efficient streaming for large datasets
  • 🔍 Intelligent Gap Detection - Automatically identify and fill missing data periods
  • 📈 Progress Tracking - Real-time progress updates for long-running operations
  • 🔄 Retry Logic - Automatic retries with exponential backoff for failed requests
  • 🏗️ Builder Patterns - Intuitive APIs for constructing queries and requests
  • Async/Await - Full async support for high-performance operations
  • 🔌 Exchange Support - Built on CCXT for compatibility with 100+ exchanges
  • Comprehensive Testing - 90%+ test coverage ensuring reliability and stability

Installation

pip install wickdata

Or install from source:

git clone https://github.com/h2337/wickdata.git
cd wickdata
pip install -e .

Quick Start

import asyncio
from wickdata import WickData, DataRequestBuilder, create_binance_config

async def main():
    # Configure exchanges
    exchange_configs = {
        'binance': create_binance_config()  # API keys optional for public data
    }
    
    # Initialize WickData
    async with WickData(exchange_configs) as wickdata:
        # Get data manager
        data_manager = wickdata.get_data_manager()
        
        # Build request for last 7 days of BTC/USDT hourly data
        request = (DataRequestBuilder.create()
            .with_exchange('binance')
            .with_symbol('BTC/USDT')
            .with_timeframe('1h')
            .with_last_days(7)
            .build())
        
        # Fetch with progress tracking
        def on_progress(info):
            print(f"{info.stage}: {info.percentage:.1f}%")
        
        stats = await data_manager.fetch_historical_data(request, on_progress)
        print(f"Fetched {stats.total_candles} candles")

asyncio.run(main())

See the examples/ directory for more examples.

Core Components

WickData

Main entry point for the library. Manages the database and exchange connections, and provides access to data operations.

DataManager

Handles fetching, storing, and querying historical data with intelligent gap detection.

DataStreamer

Provides memory-efficient streaming of large datasets with various output options.
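
To illustrate what memory-efficient streaming means here, the following is a minimal sketch of batching with an async generator. It is not WickData's implementation: `stream_in_batches` and `collect_batch_sizes` are hypothetical names, and the in-memory sequence stands in for a paged database read.

```python
import asyncio
from typing import AsyncIterator, List, Sequence


async def stream_in_batches(
    rows: Sequence[int], batch_size: int, delay_ms: int = 0
) -> AsyncIterator[List[int]]:
    """Yield consecutive slices of `rows`, optionally pausing between batches."""
    for start in range(0, len(rows), batch_size):
        yield list(rows[start:start + batch_size])
        if delay_ms:
            # Give other tasks a chance to run and throttle the consumer.
            await asyncio.sleep(delay_ms / 1000)


async def collect_batch_sizes() -> List[int]:
    sizes = []
    # Only one batch is materialized at a time.
    async for batch in stream_in_batches(range(2500), batch_size=1000):
        sizes.append(len(batch))
    return sizes


batch_sizes = asyncio.run(collect_batch_sizes())
print(batch_sizes)  # → [1000, 1000, 500]
```

The consumer only ever holds one batch, so peak memory depends on `batch_size` rather than on the size of the full dataset.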

Builder Patterns

  • DataRequestBuilder - Build data fetch requests with convenience methods
  • CandleQueryBuilder - Construct database queries with a fluent interface
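
For readers unfamiliar with the pattern, here is a small self-contained sketch of a fluent builder. `FetchRequest` and `FetchRequestBuilder` are hypothetical names chosen for illustration; they mirror the style of the Quick Start code but are not the library's classes.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass(frozen=True)
class FetchRequest:
    exchange: str
    symbol: str
    timeframe: str
    start: datetime
    end: datetime


class FetchRequestBuilder:
    """Collects fields one call at a time; each setter returns self for chaining."""

    def __init__(self) -> None:
        self._exchange: Optional[str] = None
        self._symbol: Optional[str] = None
        self._timeframe: Optional[str] = None
        self._start: Optional[datetime] = None
        self._end: Optional[datetime] = None

    def with_exchange(self, exchange: str) -> "FetchRequestBuilder":
        self._exchange = exchange
        return self

    def with_symbol(self, symbol: str) -> "FetchRequestBuilder":
        self._symbol = symbol
        return self

    def with_timeframe(self, timeframe: str) -> "FetchRequestBuilder":
        self._timeframe = timeframe
        return self

    def with_last_days(self, days: int, now: Optional[datetime] = None) -> "FetchRequestBuilder":
        end = now or datetime.now(timezone.utc)
        self._start, self._end = end - timedelta(days=days), end
        return self

    def build(self) -> FetchRequest:
        fields = {
            "exchange": self._exchange,
            "symbol": self._symbol,
            "timeframe": self._timeframe,
            "start": self._start,
            "end": self._end,
        }
        # Validate once, at build time, so a half-configured request never escapes.
        missing = [name for name, value in fields.items() if value is None]
        if missing:
            raise ValueError(f"missing fields: {', '.join(missing)}")
        return FetchRequest(**fields)


fixed_now = datetime(2024, 1, 8, tzinfo=timezone.utc)
request = (FetchRequestBuilder()
    .with_exchange("binance")
    .with_symbol("BTC/USDT")
    .with_timeframe("1h")
    .with_last_days(7, now=fixed_now)
    .build())
print(request.start.date(), request.end.date())  # → 2024-01-01 2024-01-08
```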

Supported Timeframes

  • 1m, 3m, 5m, 15m, 30m (minutes)
  • 1h, 2h, 4h, 6h, 8h, 12h (hours)
  • 1d, 3d (days)
  • 1w (week)
  • 1M (month)
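
As a rough aid for reasoning about these timeframes, the sketch below converts a timeframe string to milliseconds and estimates how many candles a time range should contain. The helper names are hypothetical and not part of the library. `1M` is rejected on purpose because calendar months do not have a fixed length.

```python
UNIT_MS = {
    "m": 60_000,        # minute
    "h": 3_600_000,     # hour
    "d": 86_400_000,    # day
    "w": 604_800_000,   # week
}


def timeframe_to_ms(timeframe: str) -> int:
    """Convert a fixed-length timeframe such as '15m' or '4h' to milliseconds."""
    count, unit = timeframe[:-1], timeframe[-1]
    if unit not in UNIT_MS or not count.isdigit():
        # Also covers '1M': months vary in length, so no fixed duration exists.
        raise ValueError(f"no fixed duration for timeframe {timeframe!r}")
    return int(count) * UNIT_MS[unit]


def expected_candles(timeframe: str, start_ms: int, end_ms: int) -> int:
    """Number of whole candles that fit in the half-open range [start_ms, end_ms)."""
    return (end_ms - start_ms) // timeframe_to_ms(timeframe)


DAY_MS = 86_400_000
print(expected_candles("1h", 0, 7 * DAY_MS))   # → 168
print(expected_candles("4h", 0, 14 * DAY_MS))  # → 84
```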

Configuration

Exchange Configuration

from wickdata import create_binance_config, create_coinbase_config

# Binance
binance_config = create_binance_config(
    api_key='your-api-key',  # Optional for public data
    secret='your-secret',     # Optional for public data
    testnet=False
)

# Coinbase
coinbase_config = create_coinbase_config(
    api_key='your-api-key',
    secret='your-secret',
    passphrase='your-passphrase',
    sandbox=False
)

Database Configuration

from wickdata.models.config import DatabaseConfig, WickDataConfig

config = WickDataConfig(
    exchanges={'binance': binance_config},
    database=DatabaseConfig(
        provider='sqlite',
        url='sqlite:///my_data.db'
    ),
    log_level='INFO'
)

Examples

Fetching Historical Data

# Using convenience methods
request = (DataRequestBuilder.create()
    .with_exchange('binance')
    .with_symbol('ETH/USDT')
    .with_timeframe('4h')
    .with_last_weeks(2)  # Last 2 weeks
    .build())

# Or specific date range
request = (DataRequestBuilder.create()
    .with_exchange('binance')
    .with_symbol('BTC/USDT')
    .with_timeframe('1d')
    .with_date_range('2024-01-01', '2024-01-31')
    .build())
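
Exchanges generally work with millisecond timestamps, so a date range like the one above has to be converted at some point. The sketch below shows one way to do that. It assumes the dates are interpreted as midnight UTC, which the source does not confirm for `with_date_range`, and `date_to_ms` is a hypothetical helper.

```python
from datetime import datetime, timezone


def date_to_ms(date_str: str) -> int:
    """Parse 'YYYY-MM-DD' as midnight UTC and return a millisecond timestamp."""
    parsed = datetime.strptime(date_str, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    return int(parsed.timestamp() * 1000)


start_ms = date_to_ms("2024-01-01")
end_ms = date_to_ms("2024-01-31")
print(start_ms)                           # → 1704067200000
print((end_ms - start_ms) // 86_400_000)  # → 30 (whole days between the two dates)
```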

Querying Stored Data

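# Create query builder
# (`repository`, `start_date`, and `end_date` are assumed to be defined already;
# `CandleQueryBuilder` and `Timeframe` must be imported first)
query = CandleQueryBuilder(repository)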

# Get recent data with pagination
candles = await (query
    .exchange('binance')
    .symbol('BTC/USDT')
    .timeframe(Timeframe.ONE_HOUR)
    .date_range(start_date, end_date)
    .limit(100)
    .offset(0)
    .execute())

# Get statistics
stats = await query.stats()

Streaming Data

# Stream with async generator
async for batch in data_streamer.stream_candles(
    exchange='binance',
    symbol='ETH/USDT',
    timeframe=Timeframe.FIVE_MINUTES,
    start_time=start_timestamp,
    end_time=end_timestamp,
    options=StreamOptions(batch_size=1000, delay_ms=100)
):
    process_batch(batch)

# Stream to callback
await data_streamer.stream_to_callback(
    exchange='binance',
    symbol='BTC/USDT',
    timeframe=Timeframe.ONE_HOUR,
    start_date=start_date,
    end_date=end_date,
    callback=process_candles,
    options=StreamOptions(batch_size=500)
)

Gap Detection and Analysis

# Find missing data
gaps = await data_manager.find_missing_data(
    exchange='binance',
    symbol='BTC/USDT',
    timeframe=Timeframe.ONE_HOUR,
    start_date=start_date,
    end_date=end_date
)

print(f"Found {len(gaps)} gaps")
for gap in gaps:
    print(f"  Gap: {gap.get_start_datetime()} to {gap.get_end_datetime()}")
    print(f"  Missing candles: {gap.candle_count}")
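
The idea behind gap detection can be sketched independently of the library: walk the stored candle timestamps in order and record every stretch where the next expected open time is missing. The `Gap` dataclass and `find_gaps` function below are hypothetical illustrations, not WickData's implementation, and assume timestamps are sorted, aligned to the timeframe, and inside the requested range.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Gap:
    start_ms: int      # open time of the first missing candle
    end_ms: int        # open time of the last missing candle
    candle_count: int  # number of missing candles


def find_gaps(timestamps: Sequence[int], start_ms: int, end_ms: int, step_ms: int) -> List[Gap]:
    """Find missing candles in the half-open range [start_ms, end_ms)."""
    gaps: List[Gap] = []
    expected = start_ms
    for ts in timestamps:
        if ts > expected:
            gaps.append(Gap(expected, ts - step_ms, (ts - expected) // step_ms))
        # max() keeps the cursor from moving backwards on duplicate timestamps.
        expected = max(expected, ts + step_ms)
    if expected < end_ms:
        # Trailing gap between the last stored candle and the end of the range.
        gaps.append(Gap(expected, end_ms - step_ms, (end_ms - expected) // step_ms))
    return gaps


HOUR = 3_600_000
stored = [0 * HOUR, 1 * HOUR, 2 * HOUR, 5 * HOUR, 6 * HOUR]
gaps = find_gaps(stored, start_ms=0, end_ms=10 * HOUR, step_ms=HOUR)
print([(g.start_ms // HOUR, g.end_ms // HOUR, g.candle_count) for g in gaps])
# → [(3, 4, 2), (7, 9, 3)]
```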

Error Handling

WickData provides comprehensive error handling with specific exception types:

from wickdata import (
    WickDataError,      # Base error class
    ExchangeError,      # Exchange-specific errors
    ValidationError,    # Input validation errors
    RateLimitError,     # Rate limiting errors
    NetworkError,       # Network connectivity issues
    DatabaseError,      # Database operation errors
    ConfigurationError, # Configuration problems
    DataGapError,       # Gap-related errors
)

try:
    await data_manager.fetch_historical_data(request)
except RateLimitError as e:
    print(f"Rate limited. Retry after {e.retry_after}s")
except ExchangeError as e:
    print(f"Exchange error: {e.message}")
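
Since the library advertises automatic retries with exponential backoff, the following sketch shows what that technique looks like in general. It is not WickData's retry code: `TransientError` is a hypothetical stand-in for a retryable error carrying an optional `retry_after`, and `retry_with_backoff` is an illustrative helper. `max_attempts` is assumed to be at least 1.

```python
import asyncio
from typing import Awaitable, Callable, List, Optional, Tuple, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical retryable failure, e.g. a rate limit or a dropped connection."""

    def __init__(self, message: str, retry_after: Optional[float] = None) -> None:
        super().__init__(message)
        self.retry_after = retry_after


async def retry_with_backoff(
    operation: Callable[[], Awaitable[T]],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    max_delay: float = 30.0,
    sleep: Callable[[float], Awaitable[None]] = asyncio.sleep,
) -> T:
    for attempt in range(1, max_attempts + 1):
        try:
            return await operation()
        except TransientError as exc:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            # Delay doubles each attempt: base, 2*base, 4*base, ... capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            if exc.retry_after is not None:
                # Never retry sooner than the server asked us to.
                delay = max(delay, exc.retry_after)
            await sleep(delay)
    raise RuntimeError("max_attempts must be at least 1")


async def demo() -> Tuple[str, List[float]]:
    delays: List[float] = []
    calls = 0

    async def fake_sleep(seconds: float) -> None:
        delays.append(seconds)  # record instead of actually waiting

    async def flaky() -> str:
        nonlocal calls
        calls += 1
        if calls < 3:
            raise TransientError("try again")
        return "ok"

    return await retry_with_backoff(flaky, sleep=fake_sleep), delays


result, delays = asyncio.run(demo())
print(result, delays)  # → ok [0.5, 1.0]
```

Production implementations usually add random jitter to each delay so that many clients do not retry in lockstep. It is omitted here to keep the output deterministic.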

Performance

  • Batch Processing: Insert 10,000+ candles per second (SQLite)
  • Concurrent Fetching: Multiple concurrent fetchers per exchange
  • Memory Efficient: Streaming prevents memory overflow for large datasets
  • Smart Caching: Automatic deduplication and gap detection
  • Connection Pooling: Efficient database connection management
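
One common way to get batched inserts with automatic deduplication in SQLite is a composite primary key combined with `INSERT OR IGNORE` and `executemany`. The sketch below demonstrates that approach with the standard `sqlite3` module. The table layout and the `insert_candles` helper are assumptions made for illustration, not WickData's actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE candles (
        exchange  TEXT    NOT NULL,
        symbol    TEXT    NOT NULL,
        timeframe TEXT    NOT NULL,
        timestamp INTEGER NOT NULL,
        open REAL, high REAL, low REAL, close REAL, volume REAL,
        -- The composite key is what makes duplicates detectable.
        PRIMARY KEY (exchange, symbol, timeframe, timestamp)
    )
    """
)


def insert_candles(connection: sqlite3.Connection, rows: list) -> int:
    """Insert a batch in one transaction; rows with an existing key are skipped."""
    with connection:  # commits on success, rolls back on error
        connection.executemany(
            "INSERT OR IGNORE INTO candles VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)",
            rows,
        )
    return connection.execute("SELECT COUNT(*) FROM candles").fetchone()[0]


batch = [
    ("binance", "BTC/USDT", "1h", 0, 1.0, 2.0, 0.5, 1.5, 10.0),
    ("binance", "BTC/USDT", "1h", 3_600_000, 1.5, 2.5, 1.0, 2.0, 12.0),
    ("binance", "BTC/USDT", "1h", 0, 9.0, 9.0, 9.0, 9.0, 9.0),  # duplicate key
]
total_after_first = insert_candles(conn, batch)
total_after_second = insert_candles(conn, batch)  # re-inserting changes nothing
print(total_after_first, total_after_second)  # → 2 2
```

Wrapping the whole batch in a single transaction is what makes bulk inserts fast in SQLite, since each commit otherwise forces a separate write to disk.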

Development

Setup Development Environment

# Clone repository
git clone https://github.com/h2337/wickdata.git
cd wickdata

# Install in development mode with dev dependencies
pip install -e ".[dev]"

Running Tests

# Run all tests
pytest

# Run with coverage
pytest --cov=wickdata

# Run specific test file
pytest tests/test_data_manager.py

Code Quality

# Format code
black wickdata

# Lint code
ruff check wickdata

# Type checking
mypy wickdata

Architecture

WickData follows a modular architecture with clear separation of concerns:

  • Core Layer: Main WickData class, DataManager, DataStreamer
  • Database Layer: Repository pattern with SQLite implementation
  • Exchange Layer: CCXT integration with adapter pattern
  • Service Layer: Gap analysis, retry logic, validation
  • Models: Data models with validation
  • Builders: Fluent interfaces for complex object construction
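
To make the repository pattern mentioned for the database layer concrete, here is a minimal sketch: calling code depends on an interface, and a storage backend implements it. All names (`CandleRepository`, `InMemoryCandleRepository`, `count_candles`) are hypothetical, and the in-memory backend stands in for the SQLite implementation the library uses.

```python
from typing import Dict, List, Protocol, Tuple

# (timestamp_ms, open, high, low, close, volume)
Candle = Tuple[int, float, float, float, float, float]


class CandleRepository(Protocol):
    """Interface the rest of the code depends on, independent of the backend."""

    def upsert(self, key: str, candles: List[Candle]) -> None: ...

    def range(self, key: str, start_ms: int, end_ms: int) -> List[Candle]: ...


class InMemoryCandleRepository:
    """Dictionary-backed implementation, handy for tests."""

    def __init__(self) -> None:
        self._data: Dict[str, Dict[int, Candle]] = {}

    def upsert(self, key: str, candles: List[Candle]) -> None:
        bucket = self._data.setdefault(key, {})
        for candle in candles:
            bucket[candle[0]] = candle  # keyed by timestamp, so duplicates overwrite

    def range(self, key: str, start_ms: int, end_ms: int) -> List[Candle]:
        bucket = self._data.get(key, {})
        return [bucket[ts] for ts in sorted(bucket) if start_ms <= ts < end_ms]


def count_candles(repo: CandleRepository, key: str, start_ms: int, end_ms: int) -> int:
    # Works with any object that satisfies the CandleRepository protocol.
    return len(repo.range(key, start_ms, end_ms))


repo = InMemoryCandleRepository()
repo.upsert("binance:BTC/USDT:1h", [
    (0, 1.0, 2.0, 0.5, 1.5, 10.0),
    (3_600_000, 1.5, 2.5, 1.0, 2.0, 12.0),
    (0, 1.0, 2.0, 0.5, 1.6, 11.0),  # same timestamp: replaces the first candle
])
print(count_candles(repo, "binance:BTC/USDT:1h", 0, 7_200_000))  # → 2
```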

Contributing

Contributions are welcome!

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

License

This project is licensed under the MIT License - see the LICENSE file for details.

