Resumable, cursor-based, CDN-safe HTTP downloads for Python

pyhaul

License: MIT

Resumable HTTP downloads for Python. Bring your own client: pyhaul borrows your existing session and handles byte-range negotiation, crash-safe checkpointing, and validation.

Supported clients: httpx, niquests, aiohttp, requests, urllib3

pip install "pyhaul[httpx]"   # or: niquests, requests, urllib3, aiohttp (quotes keep shells like zsh from globbing the brackets)

import httpx
from pathlib import Path
from pyhaul import haul, PartialHaulError

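dest = Path("big.zip")
with httpx.Client() as client:
    for _ in range(10):
        try:
            result = haul("https://example.com/big.zip", client, dest=dest)
            break
        except PartialHaulError:
            pass  # only retryable error; others propagate
    else:
        # every attempt ended in PartialHaulError, so the download is still incomplete
        raise RuntimeError("download did not complete after 10 attempts")

print(f"done: {dest.stat().st_size:,} bytes")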

What is it?

A small, pure-Python library that makes HTTP downloads resumable. To download a file, call haul() with a URL, your existing HTTP client, and a destination path. pyhaul handles byte-range negotiation for resume, ETag validation, crash-safe checkpointing, and atomic file completion. It supports both sync and async use across multiple HTTP client libraries.
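To make "byte-range negotiation" and "ETag validation" concrete, here is a minimal sketch of the standard HTTP mechanism (Range plus If-Range), using only the standard library. It is not pyhaul's implementation: resume_headers and classify_response are hypothetical helper names, and the choice to fall back to a full download when no strong ETag is available is an assumption of this sketch.

```python
from typing import Optional


def resume_headers(offset: int, etag: Optional[str]) -> dict:
    """Build headers to resume at `offset` bytes (hypothetical helper).

    Assumption of this sketch: without a strong ETag there is no safe way
    to validate the partial data, so a plain full request is made instead.
    """
    if offset <= 0 or not etag or etag.startswith("W/"):
        # Weak entity tags ("W/...") must not be sent in If-Range.
        return {}
    # If-Range asks the server to honor the Range only if the ETag still
    # matches; otherwise it replies 200 with the complete body.
    return {"Range": f"bytes={offset}-", "If-Range": etag}


def classify_response(status: int, sent_range: bool) -> str:
    """Decide how to treat the response body (hypothetical helper)."""
    if status == 206 and sent_range:
        return "append"   # server honored the range: append to partial data
    if status == 200:
        return "restart"  # full body: discard partial data, write from byte 0
    return "error"
```

A 200 reply to a ranged request is the signal that the resource changed (or that ranges are unsupported), which is why it maps to a clean restart rather than an append.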

Each call to haul() upholds these guarantees:

  • One haul() makes one request. You are responsible for retry loops, but a retry just means calling haul() again.
  • The destination file will not exist until the download is complete. There is no state where a partially-written file sits at the final path. Incomplete data lives in a temporary .part file; on completion it is atomically moved into place.
  • Interrupted downloads resume when possible. Checkpoint state lives on disk, not in memory. Kill the process, lose the network, get a 503: the next haul() picks up from the last durable byte. No data is re-downloaded if the resource hasn't changed.
  • If the remote resource changes, a retry will not corrupt the file. pyhaul detects the mismatch between attempts via ETag (a server-side fingerprint) and starts over cleanly instead of gluing mismatched halves together.
  • Your HTTP client is borrowed, not owned. pyhaul sets per-request headers and returns your session untouched. It never creates, configures, or closes sessions.
  • Transport errors pass through unwrapped. httpx.ReadTimeout stays httpx.ReadTimeout. You catch the types you already know.
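The "no partial file at the final path" guarantee rests on a common pattern: write to a sibling temporary file, then rename it into place. The sketch below illustrates that pattern with the standard library only. It is not pyhaul's code; part_path, append_chunk, and finalize are hypothetical names, and the exact naming of the .part file is an assumption.

```python
import os
from pathlib import Path


def part_path(dest: Path) -> Path:
    """Sibling temp path for incomplete data (naming scheme is assumed)."""
    return dest.with_name(dest.name + ".part")


def append_chunk(part: Path, chunk: bytes) -> int:
    """Append bytes, force them to disk, and return the durable size."""
    with open(part, "ab") as f:
        f.write(chunk)
        f.flush()
        os.fsync(f.fileno())  # data survives a crash once this returns
    return part.stat().st_size


def finalize(part: Path, dest: Path) -> None:
    """Move the finished file into place in one step."""
    # os.replace is atomic when both paths are on the same filesystem,
    # which is why the temp file lives next to the destination.
    os.replace(part, dest)
```

The size returned by append_chunk is the offset a later attempt could resume from, since everything up to that byte has been flushed to disk.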

How it fits into your code

One haul() = one HTTP request. It either succeeds and returns CompleteHaul, or it raises, possibly after saving progress to a .part file that allows the next call to resume. pyhaul never creates sessions, connections, or clients. Your HTTP library's native exceptions propagate through unwrapped, so you can drop haul() into existing code without changing your error handling. Retries are your call: a for-loop, tenacity, or nothing. Concurrency limiting (e.g. asyncio.Semaphore) is also yours: pyhaul downloads one file per call and doesn't manage parallelism.
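Since retries are left to the caller, one option is a small generic wrapper with exponential backoff. This is a sketch under stated assumptions, not part of pyhaul: call_with_retries and its parameters are hypothetical, and the backoff schedule is just one reasonable choice.

```python
import time
from typing import Callable, Type, TypeVar

T = TypeVar("T")


def call_with_retries(
    attempt: Callable[[], T],
    retryable: Type[BaseException],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `attempt` until it succeeds or `max_attempts` is exhausted.

    Only `retryable` is caught; any other exception propagates at once.
    """
    for n in range(max_attempts):
        try:
            return attempt()
        except retryable:
            if n == max_attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** n)  # 0.5 s, 1 s, 2 s, ...
    raise ValueError("max_attempts must be at least 1")
```

With the names from the quick-start example, usage might look like call_with_retries(lambda: haul(url, client, dest=dest), PartialHaulError). Because progress is checkpointed on disk, each retry would resume rather than start over.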

def haul(url, client, *, dest) -> CompleteHaul: ...
async def haul_async(url, client, *, dest) -> CompleteHaul: ...
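Because each call downloads one file and parallelism is left to the caller, a bounded fan-out can be built with asyncio.Semaphore. The sketch below is illustrative only: run_limited is a hypothetical helper, and each job is any zero-argument callable returning an awaitable.

```python
import asyncio
from typing import Awaitable, Callable, Iterable


async def run_limited(
    jobs: Iterable[Callable[[], Awaitable[object]]],
    limit: int,
) -> list:
    """Run all jobs, with at most `limit` in flight at any moment."""
    sem = asyncio.Semaphore(limit)

    async def guarded(job: Callable[[], Awaitable[object]]) -> object:
        async with sem:  # wait here while `limit` jobs are already running
            return await job()

    # gather preserves input order in its result list
    return await asyncio.gather(*(guarded(job) for job in jobs))
```

With the signature above, a job could be written as lambda: haul_async(url, client, dest=dest), one per file, all sharing the same client.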

Optional HaulState (progress bag, updated in-place) and other keyword-only options (extra headers, progress hooks, buffer sizing) are documented on the site. See docs/DESIGN.md for the exception hierarchy, transport adapters, and download lifecycle.

Documentation

Full documentation →

  • docs/DESIGN.md — Transport adapters, checkpoint state, and the download lifecycle.
  • docs/WHY.md — Silent failure modes in HTTP range/resume, and how pyhaul compares to curl, wget, and aria2c.
  • docs/SPEC.md — Control file and checkpoint format (implementers / compatible tools).

Download files

Source Distribution

pyhaul-0.6.1.tar.gz (144.3 kB)

Uploaded Source

Built Distribution

pyhaul-0.6.1-py3-none-any.whl (68.7 kB)

Uploaded Python 3

File details

Details for the file pyhaul-0.6.1.tar.gz.

File metadata

  • Download URL: pyhaul-0.6.1.tar.gz
  • Size: 144.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.13

File hashes

Hashes for pyhaul-0.6.1.tar.gz
Algorithm Hash digest
SHA256 3fb12e64603b47d98362b2a522fbc875915d66cf49763131f740014260e7862d
MD5 da2292818537fb66edf7d691033e0384
BLAKE2b-256 87a546951cbce5d83dff98a8174f78039217d0f2c946a85c2313700622a30a00

Provenance

The following attestation bundles were made for pyhaul-0.6.1.tar.gz:

Publisher: release.yml on chad-loder/pyhaul

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file pyhaul-0.6.1-py3-none-any.whl.

File metadata

  • Download URL: pyhaul-0.6.1-py3-none-any.whl
  • Size: 68.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.13

File hashes

Hashes for pyhaul-0.6.1-py3-none-any.whl
Algorithm Hash digest
SHA256 4cd1229a7c7f73730269a480f3ef06b8be744cc4b7ad5f51e4e94012a2e3e18e
MD5 8c7edd1a4650220453459c38eee7c357
BLAKE2b-256 37c218e636e63544ef900f8b83d22b0af693c85bb30678015d97b5f2d1737a5f

Provenance

The following attestation bundles were made for pyhaul-0.6.1-py3-none-any.whl:

Publisher: release.yml on chad-loder/pyhaul

