
Resumable, cursor-based, CDN-safe HTTP downloads for Python

Project description

pyhaul


Resumable HTTP downloads for Python. Bring your own client: pyhaul borrows your existing session and handles byte-range negotiation, crash-safe checkpointing, and validation.


pip install "pyhaul[httpx]"   # or: niquests, requests, urllib3, aiohttp

import httpx  # or: requests, niquests, urllib3, aiohttp
from pyhaul import haul

with httpx.Client() as client:
    result = haul("https://example.com/big.zip", client, dest="big.zip")
    print(f"done: sha256={result.sha256[:16]}…")

What is it?

A small, pure-Python library that makes HTTP downloads resumable. Call haul() with your existing HTTP client, a URL, and a destination path — it handles byte-range negotiation, ETag validation, crash-safe checkpointing, and atomic file completion. Sync and async; works with requests, httpx, niquests, urllib3, and aiohttp (async).

Each call to haul() upholds these guarantees:

  • The destination file is either complete or absent. There is no state where a partially-written file sits at the final path. Incomplete data lives in a temporary .part file; on completion it is atomically moved into place.
  • Interrupted downloads resume, not restart. Checkpoint state lives on disk, not in memory. Kill the process, lose the network, get a 503 — the next haul() picks up from the last durable byte. Zero re-downloaded data if the resource hasn't changed.
  • Changed resources are detected, not silently corrupted. If the remote file changes between attempts, pyhaul detects the mismatch via ETag (a server-side fingerprint) and starts over cleanly instead of gluing mismatched halves together.
  • Your HTTP client is borrowed, not owned. pyhaul sets per-request headers and returns the session untouched. It never creates, configures, or closes sessions.
  • Transport errors pass through unwrapped. httpx.ReadTimeout stays httpx.ReadTimeout. You catch the types you already know.
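The first three guarantees rest on standard HTTP and filesystem mechanics. The sketch below is not pyhaul's implementation; it is a minimal, standard-library illustration of the general pattern, assuming a `.part` file next to the destination. The helper names `resume_headers`, `append_durably`, and `finish` are hypothetical.

```python
import os
from pathlib import Path


def resume_headers(part_path, etag=None):
    """Return (offset, headers) for continuing from the bytes already on disk."""
    part_path = Path(part_path)
    offset = part_path.stat().st_size if part_path.exists() else 0
    headers = {}
    if offset > 0:
        # Ask the server for everything from the first missing byte onward.
        headers["Range"] = f"bytes={offset}-"
        if etag is not None:
            # If-Range: the server honors the Range only if the ETag still
            # matches; otherwise it replies with the full body (status 200).
            headers["If-Range"] = etag
    return offset, headers


def append_durably(part_path, chunk):
    """Append a chunk to the .part file and force it to stable storage."""
    with open(part_path, "ab") as f:
        f.write(chunk)
        f.flush()
        os.fsync(f.fileno())


def finish(part_path, dest):
    """Move the completed .part file to its final path in one step."""
    # os.replace is atomic when both paths are on the same filesystem.
    os.replace(part_path, dest)
```

A 206 response to such a request means the server accepted the range and the body can be appended; a 200 means the resource changed (or ranges are unsupported) and the partial data should be discarded before writing.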

How it fits into your code

One haul() call makes one HTTP request. It either succeeds and returns CompleteHaul, or it raises, possibly after saving progress to a .part file so that the next call can resume. pyhaul never creates sessions, connections, or clients. Your HTTP library's native exceptions propagate unwrapped, so you can drop haul() into existing code without changing your error handling. Retries are your call: a for-loop, tenacity, or nothing. Concurrency limiting (e.g. asyncio.Semaphore) is also yours; pyhaul downloads one file per call and doesn't manage parallelism.
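Since retries are left to the caller, one possible shape is a small loop with exponential backoff. This is a sketch, not part of pyhaul: `with_retries` and its parameters are hypothetical, and which exception types are worth retrying depends on your HTTP client.

```python
import time


def with_retries(attempt, *, retry_on, attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call attempt() until it returns, retrying only on the given exception types."""
    for n in range(1, attempts + 1):
        try:
            return attempt()
        except retry_on:
            if n == attempts:
                raise  # out of attempts: let the native exception propagate
            sleep(base_delay * 2 ** (n - 1))  # 1x, 2x, 4x, ... base_delay


# Usage with pyhaul and httpx (commented out so the sketch runs without them):
#
# import httpx
# from pyhaul import haul
#
# with httpx.Client() as client:
#     result = with_retries(
#         lambda: haul("https://example.com/big.zip", client, dest="big.zip"),
#         retry_on=httpx.TransportError,
#     )
```

Because progress is checkpointed on disk, each retry of haul() is expected to resume from the .part file rather than start from byte zero.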

def haul(url, client, *, dest, state=None) -> CompleteHaul: ...
async def haul_async(url, client, *, dest, state=None) -> CompleteHaul: ...

state is an optional HaulState object, updated in place as bytes land on disk; it works identically in sync and async. See DESIGN.md for the exception hierarchy, transport adapters, and download lifecycle.
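Because haul_async() downloads one file per call, fanning out over many URLs and capping parallelism is up to the caller. The following is a minimal sketch using asyncio.Semaphore; `gather_bounded` is a hypothetical helper, not a pyhaul function.

```python
import asyncio


async def gather_bounded(jobs, *, limit):
    """Run zero-argument coroutine functions with at most `limit` in flight."""
    sem = asyncio.Semaphore(limit)

    async def run(job):
        async with sem:
            return await job()

    # return_exceptions=True keeps one failed download from cancelling the rest;
    # failures come back as exception objects in the result list.
    return await asyncio.gather(*(run(job) for job in jobs), return_exceptions=True)


# Usage with pyhaul (commented out so the sketch runs without it). `client` is
# an async client you already own; the URLs and filenames are placeholders.
#
# from functools import partial
# from pyhaul import haul_async
#
# jobs = [
#     partial(haul_async, url, client, dest=name)
#     for url, name in [("https://example.com/a.zip", "a.zip"),
#                       ("https://example.com/b.zip", "b.zip")]
# ]
# results = await gather_bounded(jobs, limit=4)
```

Entries in the result list that are exceptions can be fed back into a later pass; their .part files let those downloads resume.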

Documentation

  • Design — transport adapters, checkpoint state, download lifecycle
  • Why this exists — failure modes and comparison with curl / wget / aria2c
  • Specification — control file format


Download files


Source Distribution

pyhaul-0.4.0.tar.gz (39.4 kB)

Uploaded Source

Built Distribution


pyhaul-0.4.0-py3-none-any.whl (49.4 kB)

Uploaded Python 3

File details

Details for the file pyhaul-0.4.0.tar.gz.

File metadata

  • Download URL: pyhaul-0.4.0.tar.gz
  • Size: 39.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.13

File hashes

Hashes for pyhaul-0.4.0.tar.gz
Algorithm    Hash digest
SHA256       75b4caa6d95beeeb2362de120e97a98dd383b1690a2ee960d5bbfa5ffae42c18
MD5          f00ff78098b08818c391bd623b84429b
BLAKE2b-256  f3a5d79bda1db73ec6d99b3bcc19481e382d3204d16813aa0aa0c7b951bf9ec9


Provenance

The following attestation bundles were made for pyhaul-0.4.0.tar.gz:

Publisher: release.yml on chad-loder/pyhaul

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file pyhaul-0.4.0-py3-none-any.whl.

File metadata

  • Download URL: pyhaul-0.4.0-py3-none-any.whl
  • Size: 49.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.13

File hashes

Hashes for pyhaul-0.4.0-py3-none-any.whl
Algorithm    Hash digest
SHA256       301c77cbf34584bd16e3fdae9c3cf637a56d05d650d069590ea82241ed70edae
MD5          2e8cd5ae53b0ae401f3a836bf3cd6d71
BLAKE2b-256  cfdac02a4ec79c6a287b63cf8068a908a8049d44b3c477e7f7af8dcf1aadfd6e


Provenance

The following attestation bundles were made for pyhaul-0.4.0-py3-none-any.whl:

Publisher: release.yml on chad-loder/pyhaul

