Download Polygon (Massive) options flat files from S3 and store as compressed Parquet
polygon-options-puller
Download Polygon / Massive US options (OPRA) flat files from their S3 bucket and store them locally as Snappy-compressed, dictionary-encoded Parquet files, filtered by symbol prefix.
How it works
Polygon ships daily .csv.gz files containing all option tickers for an
entire trading day. The quote files alone are ~120 GB compressed each.
This tool streams each file directly from S3, filters to your symbol prefix
in-flight, and writes only matching rows to Parquet: no copy of the full CSV
on disk, and no storing 120 GB just to keep 500 MB.
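The streaming filter described above can be sketched with the standard library alone. This is a minimal illustration, not the package's actual code: the function name `iter_matching_rows`, the `ticker` column, and the `O:` ticker format are assumptions, and an in-memory buffer stands in for the S3 response body.

```python
import csv
import gzip
import io
from typing import BinaryIO, Iterator


def iter_matching_rows(raw: BinaryIO, symbol_prefix: str) -> Iterator[dict]:
    """Yield CSV rows whose ticker starts with the given symbol prefix."""
    # Assumption: tickers look like "O:AAPL250321C00200000" in a "ticker" column.
    wanted = f"O:{symbol_prefix}"
    # GzipFile decompresses lazily, only as the CSV reader pulls lines from it.
    with gzip.GzipFile(fileobj=raw) as gz:
        text = io.TextIOWrapper(gz, encoding="utf-8", newline="")
        for row in csv.DictReader(text):
            if row["ticker"].startswith(wanted):
                yield row


# Stand-in for an S3 object body: a tiny gzip-compressed CSV held in memory.
sample = (
    "ticker,bid_price,ask_price\n"
    "O:AAPL250321C00200000,1.10,1.15\n"
    "O:MSFT250321C00400000,2.00,2.05\n"
)
stream = io.BytesIO(gzip.compress(sample.encode("utf-8")))
rows = list(iter_matching_rows(stream, "AAPL"))
```

Because the generator holds one row at a time, memory use stays flat regardless of the size of the compressed file. Note that every byte is still read from the source; only the non-matching rows are discarded before anything is written.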
Key features:
- Streaming: decompresses and filters in-flight, never writes the full CSV to disk
- Parallel: uses a thread pool to download multiple days concurrently
- NYSE-aware: uses pandas_market_calendars to skip holidays and weekends
- Idempotent: re-running skips days that already have valid Parquet files
- Atomic writes: uses temp files + os.replace() to prevent corrupt output
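The idempotent and atomic-write behaviour listed above follows a common pattern, sketched here under assumptions. The helper names `write_atomically` and `needs_download` are hypothetical, and the "exists and is non-empty" check is only a stand-in, since the README does not say how the tool decides a Parquet file is valid.

```python
import os
import tempfile
from pathlib import Path


def write_atomically(target: Path, payload: bytes) -> None:
    """Write payload to target so readers never see a half-written file."""
    target.parent.mkdir(parents=True, exist_ok=True)
    # The temp file lives in the target directory so os.replace() stays on
    # one filesystem, where the rename is atomic.
    fd, tmp_name = tempfile.mkstemp(dir=target.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(payload)
        os.replace(tmp_name, target)
    except BaseException:
        # Remove the partial temp file before re-raising.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise


def needs_download(target: Path) -> bool:
    """Return True if the day's output is missing or empty (assumed check)."""
    return not (target.is_file() and target.stat().st_size > 0)
```

A run that is interrupted mid-write leaves at most a stray temp file, never a truncated `.parquet` file, so a later run can safely skip any day whose output already exists.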
Installation
pip install .
# or in editable mode for development:
pip install -e ".[dev]"
Credentials
You need Polygon / Massive S3 credentials. Get them from your Massive dashboard.
export POLYGON_S3_ACCESS_KEY="your-access-key"
export POLYGON_S3_SECRET_KEY="your-secret-key"
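If you call the Python API rather than the CLI, the same two environment variables can be read explicitly instead of hard-coding keys. A small sketch; `load_credentials` is a hypothetical helper, not part of the package.

```python
import os
from typing import Mapping


def load_credentials(env: Mapping[str, str] = os.environ) -> tuple[str, str]:
    """Return (access_key, secret_key) or raise if either variable is unset."""
    names = ("POLYGON_S3_ACCESS_KEY", "POLYGON_S3_SECRET_KEY")
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return env[names[0]], env[names[1]]
```

The returned pair can then be passed as the `access_key` and `secret_key` arguments shown in the Python API section.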
Usage
Download data
# Download AAPL option quotes for a date range
polygon-options-puller download \
  --symbol-prefix AAPL \
  -t quotes \
  --start-date 2025-03-17 \
  --end-date 2025-03-21 \
  -o ./data/aapl
# Download SPXW trades with 16 workers
polygon-options-puller download \
  --symbol-prefix SPXW \
  -t trades \
  --start-date 2025-04-01 \
  --end-date 2025-04-30 \
  -o ./data/spxw \
  --workers 16
# Download both trades and quotes
polygon-options-puller download \
  --symbol-prefix SPY \
  -t both \
  --start-date 2025-04-01 \
  --end-date 2025-04-02 \
  -o ./data/spy
# Download minute aggregates
polygon-options-puller download \
  --symbol-prefix AAPL \
  -t minute_aggs \
  --start-date 2025-01-02 \
  --end-date 2025-01-02 \
  -o ./data/aapl
List available dates
# List all available quote files
polygon-options-puller list-dates
# List files for a specific year/month
polygon-options-puller list-dates --year 2024 --month 3
Python API
from datetime import date

from polygon_options_puller.downloader import pull

written = pull(
    access_key="your-key",
    secret_key="your-secret",
    output_dir="data/aapl",
    data_types=["quotes"],
    symbol_prefix="AAPL",
    start_date=date(2025, 3, 17),
    end_date=date(2025, 3, 21),
    workers=8,
)
Output layout
data/aapl/
├── quotes/
│   ├── 2025-03-17.parquet
│   ├── 2025-03-18.parquet
│   ├── 2025-03-19.parquet
│   ├── 2025-03-20.parquet
│   └── 2025-03-21.parquet
└── trades/
    ├── 2025-03-17.parquet
    └── ...
Each Parquet file contains only rows matching the --symbol-prefix you
specified. Namespace different underlyings by using different --output-dir
paths.
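Because each file is named after its trading day, the layout above can be inspected without opening any Parquet data. The sketch below lists the days already present for one data type; `downloaded_days` is a hypothetical helper that relies only on the directory structure shown.

```python
from datetime import date
from pathlib import Path


def downloaded_days(output_dir: Path, data_type: str) -> list[date]:
    """Return the sorted trading days that have a Parquet file on disk."""
    days = []
    for path in (output_dir / data_type).glob("*.parquet"):
        try:
            # File stems follow the YYYY-MM-DD pattern shown in the layout.
            days.append(date.fromisoformat(path.stem))
        except ValueError:
            continue  # ignore files that are not named after a date
    return sorted(days)
```

For example, `downloaded_days(Path("data/aapl"), "quotes")` would return the five dates from 2025-03-17 to 2025-03-21 for the tree above.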
Data types
| Type | S3 prefix | Description |
|---|---|---|
| quotes | us_options_opra/quotes_v1 | Top-of-book quotes, nanosecond timestamps |
| trades | us_options_opra/trades_v1 | Tick-level trades, nanosecond timestamps |
| day_aggs | us_options_opra/day_aggs_v1 | Daily OHLCV candles |
| minute_aggs | us_options_opra/minute_aggs_v1 | Minute OHLCV candles |
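The table maps each data type to an S3 prefix; turning a type and a date into an object key might look like the sketch below. The `<prefix>/<YYYY>/<MM>/<YYYY-MM-DD>.csv.gz` layout is an assumption about how the flat files are organised, not something this README states, and `flat_file_key` is a hypothetical helper.

```python
from datetime import date

# Prefixes copied from the data types table.
S3_PREFIXES = {
    "quotes": "us_options_opra/quotes_v1",
    "trades": "us_options_opra/trades_v1",
    "day_aggs": "us_options_opra/day_aggs_v1",
    "minute_aggs": "us_options_opra/minute_aggs_v1",
}


def flat_file_key(data_type: str, day: date) -> str:
    """Build the object key for one day's flat file (assumed key layout)."""
    prefix = S3_PREFIXES[data_type]  # raises KeyError for unknown types
    return f"{prefix}/{day:%Y}/{day:%m}/{day.isoformat()}.csv.gz"
```

Running `polygon-options-puller list-dates` is the way to confirm which files actually exist before relying on a constructed key.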
License
MIT
File details
Details for the file polygon_options_puller-0.2.0.tar.gz.
File metadata
- Download URL: polygon_options_puller-0.2.0.tar.gz
- Upload date:
- Size: 50.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | e7a61073298c39f22dfc0f219f89fab785e613ff86219c82c2ea33429b70966a |
| MD5 | d8d024feb34d1af273f06d81d279a652 |
| BLAKE2b-256 | 86dd3b915686bec9d18aa26b5a200462072b1079e8c241d267893fd98afb98c5 |
Provenance
The following attestation bundles were made for polygon_options_puller-0.2.0.tar.gz:
Publisher: release.yml on marwinsteiner/polygon-options-puller
Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: polygon_options_puller-0.2.0.tar.gz
- Subject digest: e7a61073298c39f22dfc0f219f89fab785e613ff86219c82c2ea33429b70966a
- Sigstore transparency entry: 1082587747
- Permalink: marwinsteiner/polygon-options-puller@85363fa5cfc0a7ced26fc3e8e91c845f09c61bb9
- Branch / Tag: refs/tags/v0.2.0
- Owner: https://github.com/marwinsteiner
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@85363fa5cfc0a7ced26fc3e8e91c845f09c61bb9
- Trigger Event: push
File details
Details for the file polygon_options_puller-0.2.0-py3-none-any.whl.
File metadata
- Download URL: polygon_options_puller-0.2.0-py3-none-any.whl
- Upload date:
- Size: 11.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 7e48b8f94f797e4e11ebdd8e4775d73aa5b76ec9655b5e3c7e7f4b944d34cd5a |
| MD5 | 3d8ccc22934e9d10d3b49b6567dccbaa |
| BLAKE2b-256 | de2e22bc9071db61f54d307e311f018945993dec9e978916099b9f6d36dc8205 |
Provenance
The following attestation bundles were made for polygon_options_puller-0.2.0-py3-none-any.whl:
Publisher: release.yml on marwinsteiner/polygon-options-puller
Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: polygon_options_puller-0.2.0-py3-none-any.whl
- Subject digest: 7e48b8f94f797e4e11ebdd8e4775d73aa5b76ec9655b5e3c7e7f4b944d34cd5a
- Sigstore transparency entry: 1082587804
- Permalink: marwinsteiner/polygon-options-puller@85363fa5cfc0a7ced26fc3e8e91c845f09c61bb9
- Branch / Tag: refs/tags/v0.2.0
- Owner: https://github.com/marwinsteiner
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@85363fa5cfc0a7ced26fc3e8e91c845f09c61bb9
- Trigger Event: push