
Process Shopify bulk operation JSONL exports: trigger, download, parse parent/child, flatten to CSV/JSON.


shopify-bulk

Requires Python 3.10+. License: MIT.

A Python CLI and library for processing Shopify bulk operation JSONL exports. Handles the full lifecycle: trigger a bulk operation, poll for completion, download the result, parse the nested parent/child JSONL, and flatten it into clean CSV, JSON, or JSONL.

Why this exists

Shopify's Bulk Operations API returns data as JSONL where child records (variants, inventory levels, images) reference parents via __parentId. Every developer who uses bulk operations has to write custom code to reassemble this tree. There is no reusable library for it in Python, and Shopify's own Ruby SDK has had an open issue requesting this utility since 2023.

This tool fills that gap.

Install

pip install shopify-bulk

Or run directly from source:

git clone https://github.com/snowthen-o7/shopify-bulk.git
cd shopify-bulk
pip install -e .

Quick start

Fetch + process (full workflow)

# 1. Trigger a bulk operation on Shopify, poll until done, download the JSONL
shopify-bulk fetch --shop mystore.myshopify.com --token shpat_xxxxx -o export.jsonl

# 2. Parse the JSONL and flatten to CSV
shopify-bulk process export.jsonl -o catalog.csv

Process an existing JSONL file

# CSV output (default)
shopify-bulk process export.jsonl -o products.csv

# JSON output
shopify-bulk process export.jsonl -f json -o products.json

# Use a preset config for curated field selection
shopify-bulk process export.jsonl -c products -o catalog.csv
shopify-bulk process export.jsonl -c inventory -o stock.csv

# Select specific fields only
shopify-bulk process export.jsonl --fields title,handle,variant_sku,variant_price,inventory_total

Use as a Python library

from shopify_jsonl.parser import parse_jsonl_stream
from shopify_jsonl.expander import expand_products

with open("export.jsonl") as f:
    for row in expand_products(parse_jsonl_stream(f)):
        print(row["title"], row.get("variant_sku"), row.get("inventory_total"))
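Because option and location columns differ from store to store, rows may not all carry the same keys. The sketch below shows one way to write such rows to CSV with the standard library, using the union of keys as the header. The `write_rows_csv` helper and the sample rows are hypothetical stand-ins, not part of this package; in practice the rows would come from `expand_products(...)`.

```python
import csv
import io
from typing import Iterable, List, Mapping


def write_rows_csv(rows: Iterable[Mapping[str, object]], out) -> List[str]:
    """Write dict rows to CSV; the header is the union of keys in first-seen order."""
    buffered = list(rows)  # the header needs every key, so rows are held in memory here
    fieldnames: List[str] = []
    seen = set()
    for row in buffered:
        for key in row:
            if key not in seen:
                seen.add(key)
                fieldnames.append(key)
    # restval fills columns that a given row does not have
    writer = csv.DictWriter(out, fieldnames=fieldnames, restval="")
    writer.writeheader()
    writer.writerows(buffered)
    return fieldnames


# Hypothetical rows standing in for expand_products(...) output.
sample_rows = [
    {"title": "Tee", "variant_sku": "TEE-S", "size": "S"},
    {"title": "Mug", "variant_sku": "MUG-1", "material": "Ceramic"},
]
buf = io.StringIO(newline="")
header = write_rows_csv(sample_rows, buf)
```

Collecting the header first trades streaming for a stable column set. For very large exports, the CLI's CSV output is likely the better fit.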

What it handles

  • Streaming parser. Processes JSONL line by line and never loads the full file into memory. Handles 50K+ product catalogs in seconds, and memory use is bounded by the largest single product rather than by file size.
  • Parent/child assembly. Products, variants, inventory levels, and images are buffered one product at a time and flattened into output rows. The __parentId reassembly that everyone writes from scratch is built in.
  • Dynamic option columns. Shopify's selectedOptions (Size, Color, Material, or whatever a store uses) become their own columns automatically.
  • Inventory aggregation by location. Each warehouse/location gets its own column plus a total.
  • Both inventory API formats. Legacy available field (pre-2024-04) and modern quantities array (2024-04+) are normalized transparently.
  • Image fallback. Missing variant images fall back to the product's featured image, so every row has a usable image URL.
  • Structural inference. Handles exports that lack __typename by inferring node types from their fields. Works with both modern and older Shopify API versions.
  • Preset configs. Built-in YAML presets for common use cases (products, inventory). Write your own for custom field selection.
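To illustrate the parent/child assembly described above, here is a minimal sketch of streaming `__parentId` reassembly. It is not this package's implementation. It assumes that every child line appears after its parent and that a line without `__parentId` starts a new top-level record; `group_by_root` and the sample data are hypothetical.

```python
import json
from typing import Dict, Iterable, Iterator, Optional


def group_by_root(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one {"node": ..., "children": [...]} tree per top-level record."""
    current: Optional[dict] = None   # tree for the product being buffered
    index: Dict[str, dict] = {}      # id -> tree entry, for the current product only
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        parent_id = record.pop("__parentId", None)
        entry = {"node": record, "children": []}
        if parent_id is None:
            # A new top-level record closes the previous tree.
            if current is not None:
                yield current
            current, index = entry, {}
        elif parent_id in index:
            index[parent_id]["children"].append(entry)
        else:
            raise ValueError(f"child appeared before its parent: {parent_id}")
        if "id" in record:
            index[record["id"]] = entry
    if current is not None:
        yield current


sample = """
{"id": "gid://shopify/Product/1", "title": "Tee"}
{"id": "gid://shopify/ProductVariant/11", "sku": "TEE-S", "__parentId": "gid://shopify/Product/1"}
{"id": "gid://shopify/InventoryLevel/111?inventory_item_id=1", "__parentId": "gid://shopify/ProductVariant/11"}
{"id": "gid://shopify/Product/2", "title": "Mug"}
"""
trees = list(group_by_root(sample.splitlines()))
```

Only one product's records are indexed at a time, which is what keeps memory independent of file length under the ordering assumption.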

Why not just use jsonlines?

The jsonlines Python package reads JSONL, but it gives you flat dictionaries with no awareness of Shopify's parent/child structure. You still have to:

  1. Detect whether a line is a Product, Variant, InventoryLevel, or Image
  2. Buffer children until the parent product completes
  3. Assemble variants under their parent product
  4. Aggregate inventory levels by location per variant
  5. Extract dynamic option names into columns
  6. Handle the two different inventory quantity formats
  7. Fall back to parent images when variants have none

That is what this tool does. jsonlines is step 0. This tool is steps 1 through 7.
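As a rough illustration of step 1, the sketch below detects a node's type from `__typename` when present and otherwise from the type segment of its Shopify global ID (`gid://shopify/<Type>/<id>`). This is a different signal from the field-based inference this tool describes, and `node_type` is a hypothetical helper, not part of the package.

```python
import re
from typing import Optional

# Matches the type segment of a Shopify global ID, e.g. gid://shopify/Product/1
_GID_TYPE = re.compile(r"^gid://shopify/([A-Za-z]+)/")


def node_type(record: dict) -> Optional[str]:
    """Return the node type from __typename, else from the GID, else None."""
    typename = record.get("__typename")
    if typename:
        return typename
    match = _GID_TYPE.match(str(record.get("id", "")))
    return match.group(1) if match else None


kinds = [
    node_type({"__typename": "Product", "id": "gid://shopify/Product/1"}),
    node_type({"id": "gid://shopify/ProductVariant/11"}),
    node_type({"id": "gid://shopify/InventoryLevel/111?inventory_item_id=1"}),
    node_type({"url": "https://example.com/image.png"}),  # no id, no __typename
]
```

Records that carry neither an `id` nor `__typename` return `None` here, which is the case where inspecting the record's fields becomes necessary.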

Commands

shopify-bulk fetch

Trigger a Shopify bulk operation, poll until it completes, and download the resulting JSONL.

Options:
  --shop TEXT            Shopify store domain (required)
  --token TEXT           Admin API access token, shpat_... (required)
  -o, --output PATH      Output JSONL path (default: export.jsonl)
  --no-inventory         Skip inventory levels (faster for large catalogs)
  --api-version TEXT     Shopify API version (default: 2026-01)
  --max-wait FLOAT       Max seconds to wait (default: 1200)
  --poll-interval FLOAT  Seconds between status polls (default: 5)
  -v, --verbose          Debug logging
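
How --max-wait and --poll-interval could interact is sketched below as a generic polling loop. This is not the tool's actual code: `poll_until_complete` and its `get_status` callback are hypothetical, and the status names follow Shopify's bulk operation status values. The `sleep` and `clock` parameters are injectable so the loop can be exercised without waiting.

```python
import time
from typing import Callable

# Terminal statuses that will never turn into COMPLETED.
TERMINAL_FAILURES = {"FAILED", "CANCELED", "EXPIRED"}


def poll_until_complete(
    get_status: Callable[[], str],
    max_wait: float = 1200.0,
    poll_interval: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Call get_status() until it returns COMPLETED, fails, or max_wait elapses."""
    deadline = clock() + max_wait
    while True:
        status = get_status()
        if status == "COMPLETED":
            return status
        if status in TERMINAL_FAILURES:
            raise RuntimeError(f"bulk operation ended with status {status}")
        # Give up if the next poll would land past the deadline.
        if clock() + poll_interval > deadline:
            raise TimeoutError(f"still {status} after {max_wait} seconds")
        sleep(poll_interval)


# Simulated status sequence; a real get_status would query the Admin API.
statuses = iter(["CREATED", "RUNNING", "COMPLETED"])
slept = []
result = poll_until_complete(lambda: next(statuses), sleep=slept.append)
```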

shopify-bulk process

Parse a local JSONL file into CSV, JSON, or JSONL.

Options:
  -o, --output PATH              Output file path (default: stdout)
  -f, --format [csv|json|jsonl]  Output format (default: csv)
  -c, --config NAME              Preset config or path to .yaml
  --no-variants                  One row per product instead of per variant
  --no-inventory                 Skip per-location inventory breakdown
  --fields TEXT                  Comma-separated fields (overrides config)
  -v, --verbose                  Debug logging

shopify-bulk configs

List available preset configs.

Output fields

Each output row includes:

Product-level: id, title, body_html, vendor, product_type, handle, status, tags, image_url, product_url, product_category, seo_title, seo_description, published_at, total_inventory, additional_image_links, created_at, updated_at

Variant-level: variant_id, variant_title, variant_sku, variant_barcode, variant_price, compare_at_price, variant_image_url, weight, weight_unit, variant_position, variant_taxable, variant_available_for_sale, variant_inventory_policy

Dynamic options: size, color, material (or whatever option names the store uses)

Inventory: inventory_total, variant_inventory_quantity, inventory_location_{name}
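
The inventory columns can be pictured with the sketch below, which normalizes both inventory formats and builds one column per location plus a total. The record shapes (a `location.name` field, a legacy `available` integer, a modern `quantities` list of name/quantity pairs) and the lowercase-underscore column naming are assumptions for illustration; the tool's real column names may differ.

```python
import re
from typing import Dict, Iterable


def available_quantity(level: dict) -> int:
    """Read 'available' from the modern quantities list or the legacy field."""
    quantities = level.get("quantities")
    if quantities is not None:
        for entry in quantities:
            if entry.get("name") == "available":
                return int(entry.get("quantity") or 0)
        return 0
    return int(level.get("available") or 0)


def location_column(name: str) -> str:
    """Build a column name from a location name (assumed naming scheme)."""
    slug = re.sub(r"[^a-z0-9]+", "_", name.lower()).strip("_")
    return f"inventory_location_{slug}"


def inventory_columns(levels: Iterable[dict]) -> Dict[str, int]:
    """Sum available stock per location and overall."""
    columns: Dict[str, int] = {}
    total = 0
    for level in levels:
        qty = available_quantity(level)
        column = location_column(level["location"]["name"])
        columns[column] = columns.get(column, 0) + qty
        total += qty
    columns["inventory_total"] = total
    return columns


# One legacy-format level and one modern-format level (hypothetical data).
levels = [
    {"location": {"name": "Main Warehouse"}, "available": 5},
    {"location": {"name": "NYC Store"},
     "quantities": [{"name": "available", "quantity": 3},
                    {"name": "on_hand", "quantity": 4}]},
]
columns = inventory_columns(levels)
```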

Getting a Shopify access token

The fetch command needs a Shopify Admin API access token (shpat_...):

  1. In your Shopify admin, go to Settings > Apps and sales channels > Develop apps
  2. Create a custom app
  3. Under Admin API access scopes, enable read_products and read_inventory
  4. Install the app and copy the Admin API access token

The process command works entirely offline and does not need a token.

Contributing

Issues and pull requests welcome. To set up a dev environment:

git clone https://github.com/snowthen-o7/shopify-bulk.git
cd shopify-bulk
pip install -e ".[dev]"
pytest

License

MIT. See LICENSE.


Built by SnowForge. For automated feed pipelines with scheduling, transforms, and multi-destination pushes, see SnowPipe.
