
Reproducible market data downloader


███╗   ███╗ █████╗ ██████╗ ██╗  ██╗███████╗████████╗██████╗ ██╗     
████╗ ████║██╔══██╗██╔══██╗██║ ██╔╝██╔════╝╚══██╔══╝██╔══██╗██║     
██╔████╔██║███████║██████╔╝█████╔╝ █████╗     ██║   ██║  ██║██║     
██║╚██╔╝██║██╔══██║██╔══██╗██╔═██╗ ██╔══╝     ██║   ██║  ██║██║     
██║ ╚═╝ ██║██║  ██║██║  ██║██║  ██╗███████╗   ██║   ██████╔╝███████╗
╚═╝     ╚═╝╚═╝  ╚═╝╚═╝  ╚═╝╚═╝  ╚═╝╚══════╝   ╚═╝   ╚═════╝ ╚══════╝
-------------------------------------------------------------------


Want a simple and reproducible way to download market data? Try marketdl - a CLI program that downloads trading data based on a YAML configuration file.

This project evolved from a large collection of Python scripts I used for downloading and managing market data. Although I developed it primarily for my own needs, the tool is designed to be useful to anyone. It is architected to be extensible with more data providers and data formats in the future.

Features

  • ⚡ Asynchronous concurrent downloads
  • 📦 Chunks large downloads automatically
  • 📊 Multiple data types (aggregates, quotes, trades)
  • 💾 Smart downloads - Only downloads missing files
  • 📝 Flexible storage formats (Parquet, CSV)
  • 📈 Progress tracking and detailed logging
  • 🔄 Configurable retry logic and rate limiting
  • ⚙️ YAML-based configuration

For now, the only supported data provider is Polygon.io.
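
The concurrency and retry features listed above can be illustrated with a minimal sketch. This is not marketdl's actual implementation: `fetch`, `fetch_with_retry`, and `download_all` are hypothetical names, and the linear backoff policy is an assumption. The parameter names simply echo the `max_concurrent`, `max_retries`, and `retry_delay` settings from the configuration file.

```python
import asyncio


async def fetch_with_retry(fetch, key, max_retries=3, retry_delay=1.0):
    """Call the hypothetical coroutine `fetch(key)`, retrying on failure."""
    for attempt in range(max_retries + 1):
        try:
            return await fetch(key)
        except Exception:
            if attempt == max_retries:
                raise  # out of retries: surface the last error
            # Linear backoff; this policy is an assumption, not marketdl's.
            await asyncio.sleep(retry_delay * (attempt + 1))


async def download_all(fetch, keys, max_concurrent=5, **retry_opts):
    """Run fetches concurrently, never more than `max_concurrent` at once."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def worker(key):
        # The semaphore slot is held for the whole fetch, retries included.
        async with semaphore:
            return await fetch_with_retry(fetch, key, **retry_opts)

    # gather() returns results in the same order as `keys`.
    return await asyncio.gather(*(worker(k) for k in keys))
```

A caller would run it with something like `asyncio.run(download_all(my_fetch, keys, max_concurrent=5))`, where `my_fetch` is whatever coroutine performs a single request.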

Installation

Use pip to install marketdl:

pip install marketdl

Usage

marketdl supports the following commands:

  • init: Generate sample configuration
  • validate: Validate configuration file
  • download: Download data based on configuration

Quick Start

  1. Get an API key from your data provider (currently Polygon.io)
  2. Generate a config file:
marketdl init
  3. Edit the generated config.yaml with your symbols and date ranges
  4. Run the downloader, passing the key on the command line or through the POLYGON_API_KEY environment variable:
marketdl download --api-key YOUR_API_KEY
marketdl download  # reads the key from POLYGON_API_KEY
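
The two ways of supplying the key can be pictured with a small sketch. `resolve_api_key` is a hypothetical helper, not part of marketdl, and the precedence shown (command-line value first, then the environment) is an assumption.

```python
import os


def resolve_api_key(cli_value=None, env=None):
    """Return the API key from the CLI value, else from POLYGON_API_KEY."""
    env = os.environ if env is None else env
    # Assumed precedence: an explicit --api-key wins over the environment.
    key = cli_value or env.get("POLYGON_API_KEY")
    if not key:
        raise SystemExit("No API key: pass --api-key or set POLYGON_API_KEY")
    return key
```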

Examples

# Generate config
python -m marketdl init -o my_config.yaml

# Validate config
python -m marketdl validate my_config.yaml

# Download with specific config
python -m marketdl download -c my_config.yaml -k YOUR_API_KEY

# Dry run to see what would be downloaded
python -m marketdl download -c my_config.yaml -k YOUR_API_KEY --dry-run

Configuration

Example config.yaml:

api:
  service: polygon
  timeout: 30
  max_retries: 3
  retry_delay: 1.0

storage:
  base_path: data
  format: parquet
  compress: true

downloads:
  - symbols:
      - AAPL
      - MSFT
    data_types:
      - aggregates
      - quotes
      - trades
    frequencies:
      - 1minute
    start_date: '2024-01-01'
    end_date: '2024-01-31'
  - symbols:
      - C:EURUSD
      - X:BTCUSD
    data_types:
      - aggregates
    frequencies:
      - 1week
      - 1month
    start_date: '2018-01-01'
    end_date: '2023-12-31'

max_concurrent: 5

Symbol names must match those used by the data provider. For Polygon.io, see its Screener.
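
One way to read the `downloads` section is as a list of entries that each expand into every combination of symbol, data type, and frequency. The sketch below shows that reading; it is an assumption about how the entries are interpreted, and `expand_jobs` is a hypothetical helper. The `config` dict mirrors the example above as a YAML parser would load it, with the first entry trimmed to `aggregates` to keep it short.

```python
from itertools import product

# Trimmed copy of the example config, as the dict a YAML parser would produce.
config = {
    "downloads": [
        {
            "symbols": ["AAPL", "MSFT"],
            "data_types": ["aggregates"],
            "frequencies": ["1minute"],
            "start_date": "2024-01-01",
            "end_date": "2024-01-31",
        },
        {
            "symbols": ["C:EURUSD", "X:BTCUSD"],
            "data_types": ["aggregates"],
            "frequencies": ["1week", "1month"],
            "start_date": "2018-01-01",
            "end_date": "2023-12-31",
        },
    ]
}


def expand_jobs(config):
    """Yield one (symbol, data_type, frequency, start, end) tuple per combination."""
    for entry in config["downloads"]:
        for symbol, data_type, frequency in product(
            entry["symbols"], entry["data_types"], entry["frequencies"]
        ):
            yield symbol, data_type, frequency, entry["start_date"], entry["end_date"]


jobs = list(expand_jobs(config))
print(len(jobs))  # 2*1*1 + 2*1*2 = 6 combinations
```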

Data Storage

Data is stored in a hierarchical structure by symbol, data type, and frequency. Second-level and minute-level data is automatically split into one file per day, while hourly and longer intervals can span multiple days in a single file. File names contain the date or date range of the data.

data/
├── C:EURUSD/
│   └── aggregates/
│       ├── 4hour/
│       │   └── 2023-12-26_2023-12-31.csv.gz     # Multi-day data for lower frequencies
│       └── 5minute/
│           ├── 2023-12-26.csv.gz                # One file per day for minute data
│           ├── 2023-12-27.csv.gz
│           └── 2023-12-28.csv.gz
└── X:BTCUSD/
    └── aggregates/
        └── 5minute/
            └── 2023-12-26.csv.gz
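
The layout above, together with the "only downloads missing files" feature, can be sketched as follows. This is an illustration under assumptions, not marketdl's code: `expected_files` and `missing_files` are hypothetical helpers, the daily-split rule is inferred from the frequency name, and every calendar day gets a file here even though a real run may have no data for weekends or holidays.

```python
from datetime import date, timedelta
from pathlib import Path

# Assumption: frequencies ending in these units are split into one file per day.
DAILY_SPLIT_UNITS = ("second", "minute")


def expected_files(base, symbol, data_type, frequency, start, end, ext="csv.gz"):
    """List the file paths the layout above implies for one download."""
    folder = Path(base) / symbol / data_type / frequency
    if frequency.endswith(DAILY_SPLIT_UNITS):
        days = (end - start).days + 1
        # One file per calendar day, named by its ISO date.
        return [folder / f"{start + timedelta(days=i)}.{ext}" for i in range(days)]
    # Hourly and longer intervals: a single file named by the date range.
    return [folder / f"{start}_{end}.{ext}"]


def missing_files(paths):
    """Keep only the paths that do not exist yet."""
    return [p for p in paths if not p.exists()]
```

For example, `missing_files(expected_files("data", "C:EURUSD", "aggregates", "5minute", date(2023, 12, 26), date(2023, 12, 28)))` would return only the daily files that still need downloading.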

License

MIT License
