Reproducible market data downloader

███╗   ███╗ █████╗ ██████╗ ██╗  ██╗███████╗████████╗██████╗ ██╗     
████╗ ████║██╔══██╗██╔══██╗██║ ██╔╝██╔════╝╚══██╔══╝██╔══██╗██║     
██╔████╔██║███████║██████╔╝█████╔╝ █████╗     ██║   ██║  ██║██║     
██║╚██╔╝██║██╔══██║██╔══██╗██╔═██╗ ██╔══╝     ██║   ██║  ██║██║     
██║ ╚═╝ ██║██║  ██║██║  ██║██║  ██╗███████╗   ██║   ██████╔╝███████╗
╚═╝     ╚═╝╚═╝  ╚═╝╚═╝  ╚═╝╚═╝  ╚═╝╚══════╝   ╚═╝   ╚═════╝ ╚══════╝
-------------------------------------------------------------------

Want a simple and reproducible way to download market data? Try marketdl - a CLI program that downloads trading data based on a YAML configuration file.

This project evolved from a large collection of Python scripts I used for downloading and managing market data. Although I developed it primarily for my own needs, the tool is designed to be useful to anyone. It is architected so that more data providers and data formats can be added in the future.

Features

  • ⚡ Asynchronous concurrent downloads
  • 📦 Chunks large downloads automatically
  • 📊 Multiple data types (aggregates, quotes, trades)
  • 💾 Smart downloads - Only downloads missing files
  • 📝 Flexible storage formats (Parquet, CSV)
  • 📈 Progress tracking and detailed logging
  • 🔄 Configurable retry logic and rate limiting
  • ⚙️ YAML-based configuration

For now, the only supported data provider is Polygon.io.
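
The concurrency and retry features listed above follow a common asyncio pattern. The sketch below is not marketdl's actual implementation. It only illustrates, with a hypothetical `fetch` coroutine, how a semaphore can cap the number of simultaneous downloads and how a failed attempt can be retried after a fixed delay. The function names and the reading of `max_retries` as the total number of attempts are assumptions.

```python
import asyncio


async def download_with_retry(fetch, item, semaphore, max_retries=3, retry_delay=1.0):
    """Run the hypothetical `fetch(item)` coroutine, retrying on failure.

    Assumption: `max_retries` counts the total number of attempts.
    """
    async with semaphore:  # at most N downloads hold the semaphore at once
        for attempt in range(1, max_retries + 1):
            try:
                return await fetch(item)
            except Exception:
                if attempt == max_retries:
                    raise  # give up after the last attempt
                await asyncio.sleep(retry_delay)


async def download_all(fetch, items, max_concurrent=5, **retry_options):
    """Download all items concurrently; results keep the order of `items`."""
    semaphore = asyncio.Semaphore(max_concurrent)
    tasks = [
        download_with_retry(fetch, item, semaphore, **retry_options)
        for item in items
    ]
    return await asyncio.gather(*tasks)


if __name__ == "__main__":
    async def fake_fetch(symbol):
        # Stand-in for a real network call (hypothetical).
        await asyncio.sleep(0.01)
        return f"{symbol}: ok"

    print(asyncio.run(download_all(fake_fetch, ["AAPL", "MSFT"])))
```

The default values mirror `max_concurrent`, `max_retries`, and `retry_delay` from the example configuration, but how marketdl interprets those settings internally may differ.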

Installation

Use pip to install marketdl:

pip install marketdl

Usage

marketdl supports the following commands:

  • init: Generate sample configuration
  • validate: Validate configuration file
  • download: Download data based on configuration

Quick Start

  1. Get an API key from your data provider (currently Polygon.io)
  2. Generate a config file:
marketdl init
  3. Edit the generated config.yaml with your symbols and date ranges
  4. Run the downloader, passing the key on the command line or through the POLYGON_API_KEY environment variable:
marketdl download --api-key YOUR_API_KEY
marketdl download # (Reads POLYGON_API_KEY)
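
The two download commands above differ only in where the API key comes from. As a rough illustration, a lookup of that kind could be written as follows. The helper name `resolve_api_key` is hypothetical, and the rule that a command-line value wins over the environment variable is an assumption, not documented marketdl behavior.

```python
import os


def resolve_api_key(cli_value=None, environ=None):
    """Return the key passed on the command line, else POLYGON_API_KEY.

    Assumption: an explicit command-line value takes precedence.
    """
    environ = os.environ if environ is None else environ
    key = cli_value or environ.get("POLYGON_API_KEY")
    if not key:
        raise ValueError("No API key: pass --api-key or set POLYGON_API_KEY")
    return key
```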

Examples

# Generate config
python -m marketdl init -o my_config.yaml

# Validate config
python -m marketdl validate my_config.yaml

# Download with specific config
python -m marketdl download -c my_config.yaml -k YOUR_API_KEY

# Dry run to see what would be downloaded
python -m marketdl download -c my_config.yaml -k YOUR_API_KEY --dry-run
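
If you drive marketdl from a Python script, the validate, dry-run, and download steps above can be chained with `subprocess`. This is a minimal sketch that uses only the commands and flags shown in this section. The helper names are made up for illustration, and it assumes marketdl exits with a non-zero status when a step fails.

```python
import subprocess
import sys


def build_commands(config_path, api_key):
    """Return the validate, dry-run, and download commands as argument lists."""
    base = [sys.executable, "-m", "marketdl"]
    download = base + ["download", "-c", config_path, "-k", api_key]
    return [
        base + ["validate", config_path],
        download + ["--dry-run"],
        download,
    ]


def run_all(config_path, api_key):
    """Run each step in order and stop at the first one that fails."""
    for command in build_commands(config_path, api_key):
        # check=True raises CalledProcessError on a non-zero exit status
        # (assumed to signal failure here).
        subprocess.run(command, check=True)
```

Calling `run_all("my_config.yaml", "YOUR_API_KEY")` would run the three steps one after another.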

Configuration

Example config.yaml:

api:
  service: polygon
  timeout: 30
  max_retries: 3
  retry_delay: 1.0

storage:
  base_path: data
  format: parquet
  compress: true

downloads:
  - symbols:
      - AAPL
      - MSFT
    data_types:
      - aggregates
      - quotes
      - trades
    frequencies:
      - 1minute
    start_date: '2024-01-01'
    end_date: '2024-01-31'
  - symbols:
      - C:EURUSD
      - X:BTCUSD
    data_types:
      - aggregates
    frequencies:
      - 1week
      - 1month
    start_date: '2018-01-01'
    end_date: '2023-12-31'

max_concurrent: 5

Symbol names must match those used by the data provider. For Polygon.io, see Screener.
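
To get a feel for how much work a configuration describes, it can help to expand each `downloads` entry into individual combinations. The sketch below copies the example entries into a Python literal, so no YAML parser is needed, and assumes that every symbol, data type, and frequency combination is a separate job. Whether marketdl schedules work this way internally, for example for quotes and trades, is an assumption. `expand_jobs` is a hypothetical helper.

```python
from itertools import product

# The `downloads` section of the example config, written as a Python literal.
DOWNLOADS = [
    {
        "symbols": ["AAPL", "MSFT"],
        "data_types": ["aggregates", "quotes", "trades"],
        "frequencies": ["1minute"],
        "start_date": "2024-01-01",
        "end_date": "2024-01-31",
    },
    {
        "symbols": ["C:EURUSD", "X:BTCUSD"],
        "data_types": ["aggregates"],
        "frequencies": ["1week", "1month"],
        "start_date": "2018-01-01",
        "end_date": "2023-12-31",
    },
]


def expand_jobs(downloads):
    """Yield one (symbol, data_type, frequency, start, end) tuple per combination."""
    for entry in downloads:
        for symbol, data_type, frequency in product(
            entry["symbols"], entry["data_types"], entry["frequencies"]
        ):
            yield symbol, data_type, frequency, entry["start_date"], entry["end_date"]


jobs = list(expand_jobs(DOWNLOADS))
print(len(jobs))  # 6 combinations from the first entry + 4 from the second = 10
```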

Data Storage

Data is stored in a hierarchical structure by symbol, data type, and frequency. Second-level and minute-level data is automatically split into daily files, while hourly and higher frequencies can span multiple days in a single file. File names contain the date or date range of the data.

data/
├── C:EURUSD/
│   └── aggregates/
│       ├── 4hour/
│       │   └── 2023-12-26_2023-12-31.csv.gz     # Multi-day data for lower frequencies
│       └── 5minute/
│           ├── 2023-12-26.csv.gz                # One file per day for minute data
│           ├── 2023-12-27.csv.gz
│           └── 2023-12-28.csv.gz
└── X:BTCUSD/
    └── aggregates/
        └── 5minute/
            └── 2023-12-26.csv.gz
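
The layout above can be expressed as a small path-building routine, which also shows how a "smart download" might decide what is still missing. This is a sketch under assumptions, not marketdl's code: the function names are hypothetical, the frequency check relies on names ending in "second" or "minute", and a multi-day range is assumed to map to exactly one file.

```python
from datetime import date, timedelta
from pathlib import Path


def is_daily_split(frequency):
    """Second- and minute-level frequencies are stored as one file per day."""
    return frequency.endswith(("second", "minute"))


def expected_files(base, symbol, data_type, frequency, start, end, suffix=".csv.gz"):
    """Return the file paths expected for the inclusive range start..end."""
    folder = Path(base) / symbol / data_type / frequency
    if not is_daily_split(frequency):
        # Assumption: the whole range lands in a single multi-day file.
        return [folder / f"{start.isoformat()}_{end.isoformat()}{suffix}"]
    days = (end - start).days + 1
    return [
        folder / f"{(start + timedelta(days=n)).isoformat()}{suffix}"
        for n in range(days)
    ]


def missing_files(paths):
    """Keep only the paths that do not exist on disk yet."""
    return [path for path in paths if not path.exists()]


if __name__ == "__main__":
    paths = expected_files(
        "data", "C:EURUSD", "aggregates", "5minute",
        date(2023, 12, 26), date(2023, 12, 28),
    )
    for path in missing_files(paths):
        print(path)
```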

License

MIT License

Download files

Source Distribution

marketdl-0.1.1.tar.gz (22.2 kB)

Built Distribution

marketdl-0.1.1-py3-none-any.whl (19.5 kB)

File details

Details for the file marketdl-0.1.1.tar.gz.

File metadata

  • Download URL: marketdl-0.1.1.tar.gz
  • Size: 22.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.12.3

File hashes

Hashes for marketdl-0.1.1.tar.gz

Algorithm    Hash digest
SHA256       240cf6e821f945b4185c2bad3391bde788f4c757797cfdb0e41408ad25f425fd
MD5          b26932644c555ff3e496c6499695a9cc
BLAKE2b-256  00dc7546889e5c78bfb8063c4258b5cc408f644d77472ce98432a1391acfb8c5

File details

Details for the file marketdl-0.1.1-py3-none-any.whl.

File metadata

  • Download URL: marketdl-0.1.1-py3-none-any.whl
  • Size: 19.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.12.3

File hashes

Hashes for marketdl-0.1.1-py3-none-any.whl

Algorithm    Hash digest
SHA256       41199c1a71422575516dba043cd3b07e716c85222f5e16c4d26e50e2d90a4903
MD5          b7639c075099672fefa7ca2260628530
BLAKE2b-256  3fd6ad26b7bcdddcb59f660367d587cc400fd38e6c112c5f82034cdb57b3c3a2
