Process Shopify bulk operation JSONL exports: trigger, download, parse parent/child, flatten to CSV/JSON.
shopify-bulk
A Python CLI and library for processing Shopify bulk operation JSONL exports. Handles the full lifecycle: trigger a bulk operation, poll for completion, download the result, parse the nested parent/child JSONL, and flatten it into clean CSV, JSON, or JSONL.
Why this exists
Shopify's Bulk Operations API returns data as JSONL where child records (variants, inventory levels, images) reference parents via __parentId. Every developer who uses bulk operations has to write custom code to reassemble this tree. There is no reusable library for it in Python, and Shopify's own Ruby SDK has had an open issue requesting this utility since 2023.
This tool fills that gap.
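To make the problem concrete, here is a minimal sketch of the reassembly a bulk export calls for. The sample records and the `group_children` helper are illustrative assumptions, not this library's API, and this naive version holds the whole file in memory.

```python
import json
from collections import defaultdict

# Hypothetical sample: one product line followed by its two variant lines.
SAMPLE = """\
{"id": "gid://shopify/Product/1", "title": "T-Shirt"}
{"id": "gid://shopify/ProductVariant/10", "sku": "TS-S", "__parentId": "gid://shopify/Product/1"}
{"id": "gid://shopify/ProductVariant/11", "sku": "TS-M", "__parentId": "gid://shopify/Product/1"}
"""


def group_children(lines):
    """Return (parents, children_by_parent_id) from an iterable of JSONL lines."""
    parents = {}
    children = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue
        node = json.loads(line)
        parent_id = node.get("__parentId")
        if parent_id is None:
            parents[node["id"]] = node  # top-level record
        else:
            children[parent_id].append(node)  # attach to its parent's id
    return parents, dict(children)


parents, children = group_children(SAMPLE.splitlines())
for pid, product in parents.items():
    skus = [child["sku"] for child in children.get(pid, [])]
    print(product["title"], skus)
```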
Install
pip install shopify-bulk
Or run directly from source:
git clone https://github.com/snowthen-o7/shopify-bulk.git
cd shopify-bulk
pip install -e .
Quick start
Fetch + process (full workflow)
# 1. Trigger a bulk operation on Shopify, poll until done, download the JSONL
shopify-bulk fetch --shop mystore.myshopify.com --token shpat_xxxxx -o export.jsonl
# 2. Parse the JSONL and flatten to CSV
shopify-bulk process export.jsonl -o catalog.csv
Process an existing JSONL file
# CSV output (default)
shopify-bulk process export.jsonl -o products.csv
# JSON output
shopify-bulk process export.jsonl -f json -o products.json
# Use a preset config for curated field selection
shopify-bulk process export.jsonl -c products -o catalog.csv
shopify-bulk process export.jsonl -c inventory -o stock.csv
# Select specific fields only
shopify-bulk process export.jsonl --fields title,handle,variant_sku,variant_price,inventory_total
Use as a Python library
from shopify_jsonl.parser import parse_jsonl_stream
from shopify_jsonl.expander import expand_products

with open("export.jsonl") as f:
    for row in expand_products(parse_jsonl_stream(f)):
        print(row["title"], row.get("variant_sku"), row.get("inventory_total"))
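If you consume rows yourself and want CSV, keep in mind that rows may not all share the same keys, since option columns depend on the store. The sketch below assumes rows are plain dicts (as the example above suggests) and builds the header from the union of keys. `write_rows_csv` and the sample rows are hypothetical and not part of the library.

```python
import csv
import io


def write_rows_csv(rows, out):
    """Write dict rows to CSV; the header is the union of keys in first-seen order."""
    rows = list(rows)  # buffered for simplicity; a two-pass approach would avoid this
    fieldnames = []
    for row in rows:
        for key in row:
            if key not in fieldnames:
                fieldnames.append(key)
    writer = csv.DictWriter(out, fieldnames=fieldnames, restval="")
    writer.writeheader()
    writer.writerows(rows)
    return fieldnames


# Hypothetical rows standing in for what a flattener might yield.
rows = [
    {"title": "T-Shirt", "variant_sku": "TS-S", "size": "S"},
    {"title": "Mug", "variant_sku": "MUG-1", "color": "Blue"},
]
buffer = io.StringIO()
header = write_rows_csv(rows, buffer)
print(buffer.getvalue())
```

Buffering the rows trades the streaming property for a stable header. For very large exports, the CLI's own CSV output is likely the better choice.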
What it handles
- Streaming parser. Processes JSONL line by line. Never loads the full file into memory. Handles 50K+ product catalogs in seconds with constant memory usage.
- Parent/child assembly. Products, variants, inventory levels, and images are buffered one product at a time and flattened into output rows. The __parentId reassembly that everyone writes from scratch is built in.
- Dynamic option columns. Shopify's selectedOptions (Size, Color, Material, or whatever a store uses) become their own columns automatically.
- Inventory aggregation by location. Each warehouse/location gets its own column plus a total.
- Both inventory API formats. The legacy available field (pre-2024-04) and the modern quantities array (2024-04+) are normalized transparently.
- Image fallback. Missing variant images fall back to the product's featured image, so every row has a usable image URL.
- Structural inference. Handles exports that lack __typename by inferring node types from their fields. Works with both modern and older Shopify API versions.
- Preset configs. Built-in YAML presets for common use cases (products, inventory). Write your own for custom field selection.
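The "buffer one product at a time" idea can be sketched as a generator. This is an illustration rather than the library's implementation: `iter_products` is a hypothetical name, and it assumes every child line follows its top-level product in the file, so a new line without __parentId means the previous product is complete.

```python
import json

# Hypothetical sample: two products; the inventory level's parent is a variant.
SAMPLE = """\
{"id": "gid://shopify/Product/1", "title": "T-Shirt"}
{"id": "gid://shopify/ProductVariant/10", "sku": "TS-S", "__parentId": "gid://shopify/Product/1"}
{"id": "gid://shopify/InventoryLevel/100", "available": 3, "__parentId": "gid://shopify/ProductVariant/10"}
{"id": "gid://shopify/Product/2", "title": "Mug"}
{"id": "gid://shopify/ProductVariant/20", "sku": "MUG-1", "__parentId": "gid://shopify/Product/2"}
"""


def iter_products(lines):
    """Yield (product, descendants) pairs, holding only one product in memory."""
    product, descendants = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        node = json.loads(line)
        if "__parentId" not in node:
            # A new top-level record closes the previous product.
            if product is not None:
                yield product, descendants
            product, descendants = node, []
        elif product is not None:
            # Variants and their own children are buffered together here.
            descendants.append(node)
    if product is not None:
        yield product, descendants


for product, descendants in iter_products(SAMPLE.splitlines()):
    print(product["title"], len(descendants))
```

Because the generator yields as soon as the next product starts, memory use depends on the largest single product rather than on file size.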
Why not just use jsonlines?
The jsonlines Python package reads JSONL, but it gives you flat dictionaries with no awareness of Shopify's parent/child structure. You still have to:
- Detect whether a line is a Product, Variant, InventoryLevel, or Image
- Buffer children until the parent product completes
- Assemble variants under their parent product
- Aggregate inventory levels by location per variant
- Extract dynamic option names into columns
- Handle the two different inventory quantity formats
- Fall back to parent images when variants have none
That is what this tool does. jsonlines is step 0. This tool is steps 1 through 7.
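Two of those steps, option columns and image fallback, might look roughly like this for a single variant. The helper names are hypothetical, and the input shapes (selectedOptions as name/value pairs, image and featuredImage objects with a url key) are assumptions about how the export query was written.

```python
import re


def option_columns(selected_options):
    """Turn [{"name": "Size", "value": "M"}, ...] into {"size": "M", ...}."""
    columns = {}
    for option in selected_options or []:
        key = re.sub(r"[^a-z0-9]+", "_", option["name"].lower()).strip("_")
        columns[key] = option["value"]
    return columns


def flatten_variant(product, variant):
    """Build one output row from a product and one of its variants."""
    row = {"title": product.get("title"), "variant_sku": variant.get("sku")}
    row.update(option_columns(variant.get("selectedOptions")))
    variant_image = (variant.get("image") or {}).get("url")
    featured_image = (product.get("featuredImage") or {}).get("url")
    # Fall back to the product image when the variant has none.
    row["variant_image_url"] = variant_image or featured_image
    return row


product = {"title": "T-Shirt", "featuredImage": {"url": "https://example.com/tee.jpg"}}
variant = {
    "sku": "TS-M-BLU",
    "image": None,
    "selectedOptions": [
        {"name": "Size", "value": "M"},
        {"name": "Shirt Color", "value": "Blue"},
    ],
}
row = flatten_variant(product, variant)
print(row)
```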
Commands
shopify-bulk fetch
Trigger a Shopify bulk operation, poll until done, download the result JSONL.
Options:
--shop TEXT Shopify store domain (required)
--token TEXT Admin API access token, shpat_... (required)
-o, --output PATH Output JSONL path (default: export.jsonl)
--no-inventory Skip inventory levels (faster for large catalogs)
--api-version TEXT Shopify API version (default: 2026-01)
--max-wait FLOAT Max seconds to wait (default: 1200)
--poll-interval FLOAT Seconds between status polls (default: 5)
-v, --verbose Debug logging
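How --max-wait and --poll-interval interact can be pictured with a small polling loop. This is a sketch under stated assumptions, not the CLI's code: `wait_for_completion` and its `get_status` callable are hypothetical, and the status strings follow Shopify's bulk operation status values.

```python
import time


def wait_for_completion(get_status, max_wait=1200.0, poll_interval=5.0,
                        sleep=time.sleep, clock=time.monotonic):
    """Call get_status() until it returns a terminal status or max_wait elapses."""
    terminal = {"COMPLETED", "FAILED", "CANCELED", "EXPIRED"}
    deadline = clock() + max_wait
    while True:
        status = get_status()
        if status in terminal:
            return status
        # Stop if the next poll would land past the deadline.
        if clock() + poll_interval > deadline:
            raise TimeoutError(f"bulk operation still {status} after {max_wait}s")
        sleep(poll_interval)


# Simulated status sequence; a real get_status would query the Admin API.
statuses = iter(["CREATED", "RUNNING", "RUNNING", "COMPLETED"])
result = wait_for_completion(lambda: next(statuses), sleep=lambda seconds: None)
print(result)
```

Injecting `sleep` and `clock` keeps the loop testable without real waiting.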
shopify-bulk process
Parse a local JSONL file into CSV, JSON, or JSONL.
Options:
-o, --output PATH Output file path (default: stdout)
-f, --format [csv|json|jsonl] Output format (default: csv)
-c, --config NAME Preset config or path to .yaml
--no-variants One row per product instead of per variant
--no-inventory Skip per-location inventory breakdown
--fields TEXT Comma-separated fields (overrides config)
-v, --verbose Debug logging
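Conceptually, --fields projects every row onto a fixed, ordered set of columns. A minimal sketch of that idea follows. `parse_fields` and `select_fields` are hypothetical helpers, and filling missing fields with empty strings is an assumption rather than documented behavior.

```python
def parse_fields(spec):
    """Split a comma-separated field spec, ignoring blanks and stray spaces."""
    return [name.strip() for name in spec.split(",") if name.strip()]


def select_fields(rows, fields):
    """Yield rows restricted to `fields`, in that order; missing keys become ""."""
    for row in rows:
        yield {name: row.get(name, "") for name in fields}


fields = parse_fields("title, handle,variant_sku")
rows = [{"title": "T-Shirt", "variant_sku": "TS-S", "vendor": "Acme"}]
selected = list(select_fields(rows, fields))
print(selected)
```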
shopify-bulk configs
List available preset configs.
Output fields
Each output row includes:
Product-level: id, title, body_html, vendor, product_type, handle, status, tags, image_url, product_url, product_category, seo_title, seo_description, published_at, total_inventory, additional_image_links, created_at, updated_at
Variant-level: variant_id, variant_title, variant_sku, variant_barcode, variant_price, compare_at_price, variant_image_url, weight, weight_unit, variant_position, variant_taxable, variant_available_for_sale, variant_inventory_policy
Dynamic options: size, color, material (or whatever option names the store uses)
Inventory: inventory_total, variant_inventory_quantity, inventory_location_{name}
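The inventory columns could be derived along these lines. This sketch is not the library's code: the function names, the slug rule for turning a location name into a column suffix, and the exact input shapes (a legacy available number versus a quantities list of name/quantity entries) are assumptions for illustration.

```python
import re


def available_quantity(level):
    """Read the 'available' count from either inventory level shape."""
    if "quantities" in level:
        # Modern shape: [{"name": "available", "quantity": 4}, ...]
        for entry in level["quantities"] or []:
            if entry.get("name") == "available":
                return entry.get("quantity") or 0
        return 0
    # Legacy shape: {"available": 3}
    return level.get("available") or 0


def inventory_columns(levels):
    """Return one column per location plus inventory_total."""
    columns = {}
    total = 0
    for level in levels:
        name = (level.get("location") or {}).get("name", "unknown")
        slug = re.sub(r"[^a-z0-9]+", "_", name.lower()).strip("_")
        quantity = available_quantity(level)
        key = f"inventory_location_{slug}"
        columns[key] = columns.get(key, 0) + quantity
        total += quantity
    columns["inventory_total"] = total
    return columns


levels = [
    {"location": {"name": "Main Warehouse"}, "available": 3},
    {"location": {"name": "NYC Store"},
     "quantities": [{"name": "on_hand", "quantity": 6},
                    {"name": "available", "quantity": 4}]},
]
columns = inventory_columns(levels)
print(columns)
```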
Getting a Shopify access token
The fetch command needs a Shopify Admin API access token (shpat_...):
- In your Shopify admin, go to Settings > Apps and sales channels > Develop apps
- Create a custom app
- Under Admin API access scopes, enable read_products and read_inventory
- Install the app and copy the Admin API access token
The process command works entirely offline and does not need a token.
Contributing
Issues and pull requests welcome. To set up a dev environment:
git clone https://github.com/snowthen-o7/shopify-bulk.git
cd shopify-bulk
pip install -e ".[dev]"
pytest
License
MIT. See LICENSE.
Built by SnowForge. For automated feed pipelines with scheduling, transforms, and multi-destination pushes, see SnowPipe.
File details
Details for the file shopify_bulk-0.1.1.tar.gz.
File metadata
- Download URL: shopify_bulk-0.1.1.tar.gz
- Upload date:
- Size: 22.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | be0fe114076d77b9defa1edfce84f8a6992ef2aa2cb5f92b3ad634d834481d0a |
| MD5 | 7eb5200690aaee1fac7dcd449c65b54f |
| BLAKE2b-256 | c2ec67f44c1d3b62e48297523de7744280efcf7e1763187c398a08466a8678ff |
File details
Details for the file shopify_bulk-0.1.1-py3-none-any.whl.
File metadata
- Download URL: shopify_bulk-0.1.1-py3-none-any.whl
- Upload date:
- Size: 23.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | b61f7fe83eaee9450680335a99f875bd35066e93a1c2ebed277a6a2872070c6c |
| MD5 | d677f846b0adea25b4f1fe7bff9c06dd |
| BLAKE2b-256 | 24883512ae115f193247544c13a303c9acc676a37b39c8800366578b74fc474e |