dbn-cache
Download and cache historical market data from Databento.
Installation
As a library
uv add dbn-cache
# or
pip install dbn-cache
CLI only (global install)
uv tool install dbn-cache
# or
pipx install dbn-cache
# or
mise use -g pipx:dbn-cache
Configuration
Set your Databento API key:
export DATABENTO_API_KEY=db-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Optionally configure cache location:
export DATABENTO_CACHE_DIR=/path/to/cache
Default cache locations:
- Unix/Mac: ~/.databento
- Windows: %LOCALAPPDATA%\databento
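The lookup order described above (environment variable first, then a per-platform default) can be sketched as follows. This is an illustration only: `resolve_cache_dir` is a hypothetical helper, not part of dbn-cache, and the fallback used when `LOCALAPPDATA` is unset is an assumption.

```python
import os
import sys
from pathlib import Path


def resolve_cache_dir(env=None, platform=None):
    """Return DATABENTO_CACHE_DIR if set, else the platform default.

    Hypothetical helper for illustration; not part of the dbn-cache API.
    """
    env = os.environ if env is None else env
    platform = sys.platform if platform is None else platform

    # An explicit override always wins.
    override = env.get("DATABENTO_CACHE_DIR")
    if override:
        return Path(override)

    if platform.startswith("win"):
        # Assumed fallback when LOCALAPPDATA is unset (not from the README).
        local = env.get("LOCALAPPDATA") or str(Path.home() / "AppData" / "Local")
        return Path(local) / "databento"

    # Unix/Mac default.
    return Path.home() / ".databento"
```

Passing `env` and `platform` explicitly keeps the function easy to test without touching the real environment.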
CLI Usage
The CLI is available as dbn (or dbn-cache):
# Show help
dbn -h
dbn download -h
# Download E-mini S&P 500 continuous futures (1-minute OHLCV)
dbn download ES.c.0 --schema ohlcv-1m --start 2024-01-01 --end 2024-12-01
# Download specific contract
dbn download ESZ24 --schema trades --start 2024-11-01 --end 2024-12-01
# Download from different dataset (default: GLBX.MDP3)
dbn download AAPL --schema trades --start 2024-01-01 --end 2024-01-31 -d XNAS.ITCH
# Update cached data to yesterday (historical data has 24h delay)
dbn update ES.c.0 # Update all schemas for symbol
dbn update ES.c.0 -s ohlcv-1m # Update specific schema
dbn update --all # Update everything in cache
# List cached data
dbn list
# Show info for specific symbol
dbn info ES.c.0 --schema ohlcv-1m
# Show data quality issues
dbn quality ES.c.0 --schema ohlcv-1m
# Estimate cost before downloading
dbn cost ES.c.0 --schema trades --start 2024-01-01 --end 2024-12-01
# Verify cache integrity (check for missing files)
dbn verify
dbn verify --fix # Rebuild missing metadata and remove stale entries
# Reference commands
dbn datasets # List available datasets
dbn schemas # List available schemas
dbn symbols # Show symbol format examples
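The update commands above extend a cached series up to yesterday. A minimal sketch of that date arithmetic, assuming the cache knows the last date it holds and that ranges are inclusive; `pending_update_range` is a hypothetical name, not a dbn-cache function:

```python
from datetime import date, timedelta


def pending_update_range(last_cached, today):
    """Return (start, end) of dates still to fetch, or None if up to date.

    The range ends at yesterday, reflecting the 24h delay on historical data.
    Both ends are inclusive in this sketch (an assumption).
    """
    yesterday = today - timedelta(days=1)
    start = last_cached + timedelta(days=1)
    if start > yesterday:
        return None  # nothing new to download
    return start, yesterday
```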
Shell Completions
# Zsh (add to .zshrc)
eval "$(dbn completions zsh)"
# Bash (add to .bashrc)
eval "$(dbn completions bash)"
# Fish
dbn completions fish > ~/.config/fish/completions/dbn.fish
# PowerShell (Windows)
dbn completions powershell >> $PROFILE
Cancellation & Error Handling
- Press Ctrl+C to cancel gracefully; partial downloads are saved and can be resumed
- All errors are caught and displayed with clear messages (no unhandled exceptions)
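One common way to get this kind of resumable cancellation is to download in chunks and keep whatever finished before the interrupt. The sketch below shows that general pattern; it is not taken from dbn-cache's source, and `download_chunks` and its `fetch` callback are hypothetical.

```python
def download_chunks(chunks, fetch):
    """Fetch chunks in order; on Ctrl+C, stop and report what finished.

    Returns (completed, cancelled). Completed chunks are kept, so a later
    run can skip them and resume from the first missing one.
    """
    completed = []
    try:
        for chunk in chunks:
            fetch(chunk)             # may be interrupted by Ctrl+C
            completed.append(chunk)  # recorded only after a full fetch
    except KeyboardInterrupt:
        return completed, True
    return completed, False
```

A chunk is recorded only after `fetch` returns, so an interrupted chunk is never counted as done.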
Library Usage
from datetime import date
from dbn_cache import DataCache
# Initialize cache (uses ~/.databento by default)
cache = DataCache()
# Download and cache data
data = cache.download("ES.c.0", "ohlcv-1m", date(2024, 1, 1), date(2024, 12, 1))
# Get as a Polars DataFrame (to_polars() returns a LazyFrame; collect() materializes it)
df = data.to_polars().collect()
# Or as Pandas DataFrame
df = data.to_pandas()
# Ensure data is cached (downloads only if missing)
data = cache.ensure("ES.c.0", "ohlcv-1m", date(2024, 1, 1), date(2024, 12, 1))
# Update cached data to yesterday (returns None if already up to date)
data = cache.update("ES.c.0", "ohlcv-1m") # Dataset inferred from cache
# Update all cached data
result = cache.update_all()
print(f"Updated: {result.updated_count}, Up to date: {result.up_to_date_count}")
if result.has_errors:
    for item, error in result.errors:
        print(f" {item.symbol}/{item.schema_}: {error}")
# Get cached data (raises CacheMissError if not cached)
from dbn_cache import CacheMissError
try:
    data = cache.get("ES.c.0", "ohlcv-1m", date(2024, 1, 1), date(2024, 12, 1))
except CacheMissError:
    print("Data not cached")
# Get data quality issues
issues = cache.get_quality_issues("ES.c.0", "ohlcv-1m")
for issue in issues:
    print(f"{issue.date}: {issue.issue_type}")
# Repair orphaned parquet files (missing metadata)
repaired = cache.repair_metadata()
for dataset, symbol, schema in repaired:
    print(f"Rebuilt metadata for {symbol}/{schema}")
# Custom cache location
from pathlib import Path
cache = DataCache(cache_dir=Path("/path/to/cache"))
Supported Symbols
Stocks
- AAPL - Apple Inc. (use with -d XNAS.ITCH or other equity datasets)
Options
- SPX.OPT - All SPX options (use with -d OPRA.PILLAR)
Futures (CME Globex)
- ESZ24 - Specific contract (E-mini S&P 500, December 2024)
- ES.c.0 - Front month by calendar (safe for backtesting)
- ES.v.0 - Front month by volume (has look-ahead bias)
- ES.n.0 - Front month by open interest (has look-ahead bias)
- ES.FUT - All contracts for a product
Common products: ES (S&P 500), NQ (Nasdaq), CL (Crude Oil), GC (Gold), 6E (Euro FX), 6J (Yen), ZB (Treasury Bonds)
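The symbol formats listed above can be told apart with a few regular expressions. The following is a rough sketch, not dbn-cache's own parser: `classify_symbol` is a hypothetical helper, two-digit years are assumed to mean 20xx, and tickers that happen to end in a month letter plus two digits would be misread as contracts.

```python
import re

# Futures month codes, January through December.
MONTH_CODES = "FGHJKMNQUVXZ"
ROLL_RULES = {"c": "calendar", "v": "volume", "n": "open interest"}

CONTINUOUS = re.compile(r"^([A-Z0-9]+)\.([cvn])\.(\d+)$")
PARENT = re.compile(r"^([A-Z0-9]+)\.(FUT|OPT)$")
CONTRACT = re.compile(r"^([A-Z0-9]+?)([FGHJKMNQUVXZ])(\d{2})$")


def classify_symbol(symbol):
    """Classify a symbol string into one of the formats listed above."""
    if m := CONTINUOUS.match(symbol):
        root, rule, rank = m.groups()
        return {"kind": "continuous", "root": root,
                "roll": ROLL_RULES[rule], "rank": int(rank)}
    if m := PARENT.match(symbol):
        root, asset = m.groups()
        return {"kind": "parent", "root": root, "asset": asset}
    if m := CONTRACT.match(symbol):
        root, code, year = m.groups()
        # Assumption: two-digit years are in the 2000s.
        return {"kind": "contract", "root": root,
                "month": MONTH_CODES.index(code) + 1, "year": 2000 + int(year)}
    # Anything else (for example an equity ticker) is passed through.
    return {"kind": "other", "symbol": symbol}
```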
Schemas
Run dbn schemas for the full list. Common schemas:
| Schema | Description | Partition |
|---|---|---|
| trades | Executed trades | Daily |
| ohlcv-1m | 1-minute OHLCV bars | Monthly |
| ohlcv-1h | Hourly OHLCV bars | Monthly |
| ohlcv-1d | Daily OHLCV bars | Monthly |
| mbp-1 | Top of book (L1) | Daily |
| mbp-10 | 10 levels of book (L2) | Daily |
| mbo | Full order book | Daily |
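To make the Partition column concrete, the sketch below enumerates the partitions a date range would touch for a given schema. The schema-to-granularity mapping comes from the table; the function name `partition_keys`, the key format, and the inclusive end date are assumptions made for illustration.

```python
from datetime import date, timedelta

# Partition granularity per schema, as listed in the table above.
PARTITION = {
    "trades": "daily", "mbp-1": "daily", "mbp-10": "daily", "mbo": "daily",
    "ohlcv-1m": "monthly", "ohlcv-1h": "monthly", "ohlcv-1d": "monthly",
}


def partition_keys(schema, start, end):
    """List partition keys covering start..end (inclusive, an assumption)."""
    if PARTITION[schema] == "monthly":
        keys = []
        year, month = start.year, start.month
        while (year, month) <= (end.year, end.month):
            keys.append(f"{year:04d}-{month:02d}")
            # Advance one month, rolling over into the next year after December.
            year, month = (year + 1, 1) if month == 12 else (year, month + 1)
        return keys
    # Daily schemas: one key per calendar day.
    days = (end - start).days
    return [(start + timedelta(days=i)).isoformat() for i in range(days + 1)]
```

Tick-level schemas produce one partition per day, which is why long ranges of trades or order book data touch far more files than OHLCV bars over the same period.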
Cache Structure
~/.databento/
└── GLBX.MDP3/
└── ES_c_0/
└── ohlcv-1m/
├── meta.json
└── 2024/
├── 01.parquet
├── 02.parquet
└── ...
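The tree above implies a simple mapping from dataset, symbol, schema, and month to a file path. A minimal sketch of that mapping for monthly partitions follows; `monthly_partition_path` is a hypothetical helper, and the layout for daily-partitioned schemas is not shown in the tree, so it is not covered here.

```python
from pathlib import Path


def monthly_partition_path(cache_dir, dataset, symbol, schema, year, month):
    """Build the parquet path for one monthly partition, following the tree above."""
    # Dots in the symbol become underscores (ES.c.0 -> ES_c_0), as in the tree.
    symbol_dir = symbol.replace(".", "_")
    return (Path(cache_dir) / dataset / symbol_dir / schema
            / f"{year:04d}" / f"{month:02d}.parquet")
```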
Look-Ahead Bias Warning
When using continuous futures for backtesting:
- ✅ ES.c.0 (calendar) - Roll dates are fixed, safe for backtesting
- ⚠️ ES.v.0 (volume) - Roll dates determined by future volume data
- ⚠️ ES.n.0 (open interest) - Roll dates determined by future OI data
For accurate backtesting, use calendar-based continuous contracts (.c.) or download individual contracts and implement your own roll logic.
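If you download individual contracts and roll them yourself, the rule should depend only on information known on the day in question. Below is a minimal sketch of a calendar-based rule driven by a fixed expiration schedule. `front_contract` and `roll_days_before` are hypothetical names, and any expiration dates you pass in should come from the exchange's published schedule.

```python
from datetime import date, timedelta


def front_contract(on, expirations, roll_days_before=0):
    """Pick the front contract on a date from a fixed expiration schedule.

    expirations: list of (symbol, expiration_date) pairs, in any order.
    A contract is eligible while `on` is earlier than its expiration minus
    roll_days_before. No volume or open interest data is consulted, so the
    choice cannot leak future information.
    """
    for symbol, expiry in sorted(expirations, key=lambda pair: pair[1]):
        if on < expiry - timedelta(days=roll_days_before):
            return symbol
    return None  # every contract in the schedule has expired
```

Setting `roll_days_before` above zero rolls ahead of expiration, which avoids the thin trading often seen in a contract's final days.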
Market Calendar Integration
Downloads automatically skip market holidays and non-trading days using exchange calendars:
| Dataset | Calendar | Holiday Behavior |
|---|---|---|
| GLBX.MDP3 | CME | Open most holidays with early close |
| OPRA.PILLAR | NYSE | Closed on federal holidays |
| XNAS.ITCH | NYSE | Closed on federal holidays |
| DBEQ.BASIC | NYSE | Closed on federal holidays |
This prevents API errors when downloading tick data on days when markets are closed.
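The idea of skipping closed days can be illustrated with a plain weekday filter plus a holiday set. This is a simplified sketch, not the calendar logic dbn-cache uses: `trading_days` is a hypothetical helper, the holiday set is supplied by hand, and real exchange calendars also handle early closes and sessions that start on Sunday evening.

```python
from datetime import date, timedelta


def trading_days(start, end, holidays=frozenset()):
    """Yield dates in start..end (inclusive) that are weekdays and not holidays."""
    day = start
    while day <= end:
        # weekday() is 0 for Monday through 6 for Sunday.
        if day.weekday() < 5 and day not in holidays:
            yield day
        day += timedelta(days=1)
```

A downloader built on this would request daily partitions only for the dates yielded, so closed days never reach the API.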
Development
uv sync
uv run pytest
uv run ruff check .
uv run pyright