dbn-cache

Download and cache historical market data from Databento.

Installation

As a library

uv add dbn-cache
# or
pip install dbn-cache

CLI only (global install)

uv tool install dbn-cache
# or
pipx install dbn-cache
# or
mise use -g pipx:dbn-cache

Configuration

Set your Databento API key:

export DATABENTO_API_KEY=db-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

Optionally configure cache location:

export DATABENTO_CACHE_DIR=/path/to/cache

Default cache locations:

  • Unix/Mac: ~/.databento
  • Windows: %LOCALAPPDATA%\databento
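
The lookup order described above (environment variable first, then a per-platform default) can be sketched as follows. This is only an illustration: `resolve_cache_dir` is a hypothetical helper, not part of dbn-cache, and the fallback used when LOCALAPPDATA is unset is an assumption.

```python
import os
import sys
from pathlib import Path


def resolve_cache_dir(env=None, platform=None):
    """Hypothetical helper: choose a cache directory as the README describes."""
    env = os.environ if env is None else env
    platform = sys.platform if platform is None else platform

    # An explicit DATABENTO_CACHE_DIR takes precedence over any default.
    override = env.get("DATABENTO_CACHE_DIR")
    if override:
        return Path(override)

    if platform.startswith("win"):
        # Windows default: %LOCALAPPDATA%\databento.
        # Falling back to ~/AppData/Local when the variable is unset is an assumption.
        base = env.get("LOCALAPPDATA")
        base_path = Path(base) if base else Path.home() / "AppData" / "Local"
        return base_path / "databento"

    # Unix/Mac default: ~/.databento
    return Path.home() / ".databento"
```

Passing `env` and `platform` explicitly keeps the function easy to test without touching the real environment.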

CLI Usage

The CLI is available as dbn (or dbn-cache):

# Show help
dbn -h
dbn download -h

# Download E-mini S&P 500 continuous futures (1-minute OHLCV)
dbn download ES.c.0 --schema ohlcv-1m --start 2024-01-01 --end 2024-12-01

# Download specific contract
dbn download ESZ24 --schema trades --start 2024-11-01 --end 2024-12-01

# Download from different dataset (default: GLBX.MDP3)
dbn download AAPL --schema trades --start 2024-01-01 --end 2024-01-31 -d XNAS.ITCH

# Update cached data through yesterday (historical data has a 24-hour delay)
dbn update ES.c.0                # Update all schemas for symbol
dbn update ES.c.0 -s ohlcv-1m    # Update specific schema
dbn update --all                  # Update everything in cache

# List cached data
dbn list

# Show info for specific symbol
dbn info ES.c.0 --schema ohlcv-1m

# Show data quality issues
dbn quality ES.c.0 --schema ohlcv-1m

# Estimate cost before downloading
dbn cost ES.c.0 --schema trades --start 2024-01-01 --end 2024-12-01

# Verify cache integrity (check for missing files)
dbn verify
dbn verify --fix  # Remove stale metadata for missing files

# Reference commands
dbn datasets  # List available datasets
dbn schemas   # List available schemas
dbn symbols   # Show symbol format examples

Shell Completions

# Zsh (add to .zshrc)
eval "$(dbn completions zsh)"

# Bash (add to .bashrc)
eval "$(dbn completions bash)"

# Fish
dbn completions fish > ~/.config/fish/completions/dbn.fish

# PowerShell (Windows)
dbn completions powershell >> $PROFILE

Cancellation & Error Handling

  • Press Ctrl+C to cancel gracefully; partial downloads are saved and can be resumed
  • All errors are caught and displayed with clear messages (no unhandled exceptions)
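
One way to get the resumable behavior described above is to save work one partition at a time and treat Ctrl+C as a normal stop signal. The sketch below shows that pattern in isolation. The names `download_partitions`, `fetch`, and `done` are hypothetical, and this is not necessarily how dbn-cache implements it.

```python
def download_partitions(partitions, fetch, done):
    """Fetch every partition not already in `done`; stop cleanly on Ctrl+C.

    `fetch` is any callable that downloads and saves one partition.
    `done` is a mutable set of partitions saved so far.
    Returns True if everything finished, False if the run was interrupted.
    """
    for part in partitions:
        if part in done:
            continue  # saved by an earlier run, nothing to do
        try:
            fetch(part)
        except KeyboardInterrupt:
            # Keep what is already saved; the partition in flight is
            # retried on the next run because it was never added to `done`.
            return False
        done.add(part)
    return True
```

Calling the function again with the same `done` set resumes where the interrupted run stopped.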

Library Usage

from datetime import date
from dbn_cache import DataCache

# Initialize cache (uses ~/.databento by default)
cache = DataCache()

# Download and cache data
data = cache.download("ES.c.0", "ohlcv-1m", date(2024, 1, 1), date(2024, 12, 1))

# Get as a Polars DataFrame (to_polars() returns a LazyFrame; collect() materializes it)
df = data.to_polars().collect()

# Or as Pandas DataFrame
df = data.to_pandas()

# Ensure data is cached (downloads only if missing)
data = cache.ensure("ES.c.0", "ohlcv-1m", date(2024, 1, 1), date(2024, 12, 1))

# Update cached data to yesterday (returns None if already up to date)
data = cache.update("ES.c.0", "ohlcv-1m")  # Dataset inferred from cache

# Update all cached data
result = cache.update_all()
print(f"Updated: {result.updated_count}, Up to date: {result.up_to_date_count}")
if result.has_errors:
    for item, error in result.errors:
        print(f"  {item.symbol}/{item.schema_}: {error}")

# Get cached data (raises CacheMissError if not cached)
from dbn_cache import CacheMissError

try:
    data = cache.get("ES.c.0", "ohlcv-1m", date(2024, 1, 1), date(2024, 12, 1))
except CacheMissError:
    print("Data not cached")

# Get data quality issues
issues = cache.get_quality_issues("ES.c.0", "ohlcv-1m")
for issue in issues:
    print(f"{issue.date}: {issue.issue_type}")

# Custom cache location
from pathlib import Path
cache = DataCache(cache_dir=Path("/path/to/cache"))
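
The "update to yesterday, or return None if already up to date" behavior comes down to a small date calculation. The sketch below illustrates that calculation only. `update_window` is a hypothetical helper, and treating both ends of the range as inclusive dates is an assumption rather than documented dbn-cache behavior.

```python
from datetime import date, timedelta


def update_window(last_cached, today=None):
    """Return (start, end) of the dates still to fetch, or None if up to date.

    Hypothetical helper, not part of dbn-cache. `last_cached` is the last
    date already present in the cache; both returned dates are inclusive.
    """
    today = date.today() if today is None else today
    yesterday = today - timedelta(days=1)
    if last_cached >= yesterday:
        return None  # nothing newer than the cache is available yet
    return last_cached + timedelta(days=1), yesterday
```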

Supported Symbols

Stocks

  • AAPL - Apple Inc. (use with -d XNAS.ITCH or other equity datasets)

Options

  • SPX.OPT - All SPX options (use with -d OPRA.PILLAR)

Futures (CME Globex)

  • ESZ24 - Specific contract (E-mini S&P 500, December 2024)
  • ES.c.0 - Front month by calendar (safe for backtesting)
  • ES.v.0 - Front month by volume (has look-ahead bias)
  • ES.n.0 - Front month by open interest (has look-ahead bias)
  • ES.FUT - All contracts for a product

Common products: ES (E-mini S&P 500), NQ (E-mini Nasdaq-100), CL (Crude Oil), GC (Gold), 6E (Euro FX), 6J (Japanese Yen), ZB (Treasury Bonds)
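
The symbol formats listed above follow a regular pattern, so they can be told apart with two regular expressions. This is a minimal sketch: `classify_symbol` and its return shape are hypothetical, and anything that is neither continuous nor a parent symbol is simply treated as a raw symbol.

```python
import re

# Continuous: ROOT.<rule>.<rank>, with rule c (calendar), v (volume), n (open interest).
CONTINUOUS = re.compile(r"^(?P<root>[A-Z0-9]+)\.(?P<rule>[cvn])\.(?P<rank>\d+)$")
# Parent: ROOT.FUT (all futures) or ROOT.OPT (all options).
PARENT = re.compile(r"^(?P<root>[A-Z0-9]+)\.(?P<kind>FUT|OPT)$")

ROLL_RULES = {"c": "calendar", "v": "volume", "n": "open interest"}


def classify_symbol(symbol):
    """Hypothetical helper: describe which symbol format a string uses."""
    m = CONTINUOUS.match(symbol)
    if m:
        rule = m.group("rule")
        return {
            "type": "continuous",
            "root": m.group("root"),
            "roll": ROLL_RULES[rule],
            "rank": int(m.group("rank")),
            # Per the list above, only calendar rolls are safe for backtesting.
            "look_ahead_bias": rule != "c",
        }
    m = PARENT.match(symbol)
    if m:
        return {"type": "parent", "root": m.group("root"), "kind": m.group("kind")}
    # Everything else (AAPL, ESZ24, ...) is passed through as a raw symbol.
    return {"type": "raw", "symbol": symbol}
```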

Schemas

Run dbn schemas for the full list. Common schemas:

Schema     Description               Partition
trades     Executed trades           Daily
ohlcv-1m   1-minute OHLCV bars       Monthly
ohlcv-1h   Hourly OHLCV bars         Monthly
ohlcv-1d   Daily OHLCV bars          Monthly
mbp-1      Top of book (L1)          Daily
mbp-10     10 levels of book (L2)    Daily
mbo        Full order book           Daily
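
To make the Partition column concrete, the sketch below records the table as a lookup and enumerates the monthly partitions that a date range touches. `PARTITION` and `month_partitions` are hypothetical names, and treating the end date as inclusive is an assumption.

```python
from datetime import date

# Partition granularity per schema, copied from the table above.
PARTITION = {
    "trades": "daily",
    "ohlcv-1m": "monthly",
    "ohlcv-1h": "monthly",
    "ohlcv-1d": "monthly",
    "mbp-1": "daily",
    "mbp-10": "daily",
    "mbo": "daily",
}


def month_partitions(start, end):
    """Yield (year, month) for every month touched by [start, end]."""
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        yield year, month
        month += 1
        if month > 12:
            year, month = year + 1, 1
```

For example, the range 2024-01-01 to 2024-12-01 used throughout this README touches twelve monthly partitions.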

Cache Structure

~/.databento/
└── GLBX.MDP3/
    └── ES_c_0/
        └── ohlcv-1m/
            ├── meta.json
            └── 2024/
                ├── 01.parquet
                ├── 02.parquet
                └── ...
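
The tree above suggests a simple mapping from (dataset, symbol, schema, year, month) to a file path. The sketch below reproduces it for monthly partitions only. `partition_path` is a hypothetical helper, and replacing dots with underscores in the symbol is inferred from the ES_c_0 directory name, not taken from the library's code. The layout for daily partitions is not shown in this README, so it is left out.

```python
from pathlib import Path


def partition_path(cache_dir, dataset, symbol, schema, year, month):
    """Hypothetical helper: path of one monthly Parquet partition."""
    # "ES.c.0" appears as "ES_c_0" in the tree above (inferred rule).
    symbol_dir = symbol.replace(".", "_")
    return (
        Path(cache_dir)
        / dataset
        / symbol_dir
        / schema
        / f"{year:04d}"
        / f"{month:02d}.parquet"
    )
```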

Look-Ahead Bias Warning

When using continuous futures for backtesting:

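  • ES.c.0 (calendar) - Roll dates follow a fixed calendar rule, so they are safe for backtesting
  • ⚠️ ES.v.0 (volume) - Roll dates are determined by future volume data, which introduces look-ahead bias
  • ⚠️ ES.n.0 (open interest) - Roll dates are determined by future open interest data, which introduces look-ahead bias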

For accurate backtesting, use calendar-based continuous contracts (.c.) or download individual contracts and implement your own roll logic.
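
If you implement your own roll logic over individual contracts, a purely calendar-based rule is one starting point. The sketch below picks the front quarterly contract for a given date. It assumes quarterly contract months (H, M, U, Z) and a roll on the third Friday of the contract month. Both are simplifying assumptions for illustration, `front_contract` is a hypothetical helper, and real roll conventions vary.

```python
from datetime import date

# Quarterly contract months and their month codes (H=Mar, M=Jun, U=Sep, Z=Dec).
QUARTERLY = {3: "H", 6: "M", 9: "U", 12: "Z"}


def third_friday(year, month):
    """Date of the third Friday of the given month."""
    first_weekday = date(year, month, 1).weekday()  # Monday=0 ... Friday=4
    first_friday = 1 + (4 - first_weekday) % 7
    return date(year, month, first_friday + 14)


def front_contract(root, day):
    """Hypothetical calendar roll: hold each contract until its third Friday."""
    year, month = day.year, day.month
    while True:
        # The first quarterly contract that has not yet reached its roll date.
        if month in QUARTERLY and day < third_friday(year, month):
            return f"{root}{QUARTERLY[month]}{year % 100:02d}"
        month += 1
        if month > 12:
            year, month = year + 1, 1
```

Because the rule depends only on the calendar, it uses no information that was unavailable on the date being evaluated.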

Development

uv sync
uv run pytest
uv run ruff check .
uv run pyright
