S3impleClient
A simple, fast, and robust async S3/HTTP downloader and uploader with pipelined parallel transfers.
Features
- Pipelined Parallel I/O: Download/upload large chunks while writing/reading the previous one
- Two-Level Chunking: Large chunks for disk I/O, small chunks for network requests
- Async/Sync Support: Use in both async and synchronous contexts
- HuggingFace Hub Integration: Patch
huggingface_hubfor faster model downloads/uploads - Progress Tracking: Built-in tqdm progress bars with
[S3C]prefix - Configurable Logging: Debug upload/download operations with
configure_logging() - Automatic Fallback: Falls back to single-stream for servers without range support
- Retry Logic: Exponential backoff retry for failed chunks
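The retry schedule itself isn't documented here; a minimal sketch of what exponential backoff for failed chunks typically looks like (the base delay and cap below are illustrative assumptions, not the library's actual values):

```python
def backoff_delays(max_retries: int, base: float = 1.0, cap: float = 30.0) -> list[float]:
    # Delay before each retry doubles, capped so a persistently
    # failing chunk never waits unboundedly long.
    return [min(base * 2 ** attempt, cap) for attempt in range(max_retries)]

print(backoff_delays(5))  # [1.0, 2.0, 4.0, 8.0, 16.0]
```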
Installation
pip install s3impleclient
Quick Start
Download
import s3impleclient as s3c
# Synchronous download
result = s3c.download(
    url="https://example.com/large-file.bin",
    dest="./downloads/file.bin",
)
if result.success:
    print(f"Downloaded {result.total_bytes:,} bytes")
Upload (Multipart)
import s3impleclient as s3c
# Upload with pre-signed multipart URLs (from S3 or similar)
result = s3c.upload(
    file_path="./large-file.bin",
    part_urls=["https://s3.../part1", "https://s3.../part2", ...],
    chunk_size=64 * 1024 * 1024,  # 64MB per part (from server)
    completion_url="https://s3.../complete",  # optional
)
if result.success:
    print(f"Uploaded {result.total_bytes:,} bytes in {len(result.parts)} parts")
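The list of pre-signed part URLs must cover the whole file; a quick way to check how many parts are needed (the file size here is hypothetical, the 64MB part size matches the example above):

```python
import math

file_size = 200 * 1024 * 1024   # e.g. a 200MB file (hypothetical)
chunk_size = 64 * 1024 * 1024   # 64MB per part, matching the example above
num_parts = math.ceil(file_size / chunk_size)
print(num_parts)  # 4 part URLs needed
```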
HuggingFace Hub Integration
import logging
import s3impleclient as s3c
from huggingface_hub import hf_hub_download, upload_folder
# Enable logging to see transfer details
s3c.configure_logging(logging.INFO)
# Patch both download and upload
s3c.patch_all()
# Downloads now use S3impleClient (look for [S3C] in progress bar)
path = hf_hub_download(
    repo_id="username/model",
    filename="model.safetensors",
)
# Uploads also use parallel multipart
upload_folder(
    folder_path="./my-model",
    repo_id="username/model",
)
# Restore original behavior
s3c.unpatch_all()
CLI Usage
# Download
s3c download https://example.com/file.bin
s3c download https://example.com/file.bin -o ./myfile.bin
s3c download https://example.com/file.bin -w 16 -c 20 # workers, chunk MB
# Upload (requires pre-signed URLs in JSON file)
s3c upload ./file.bin --url https://s3.../upload # single part
s3c upload ./file.bin --part-urls parts.json --chunk-size 67108864 # multipart
How It Works
Download Pipeline
S3impleClient uses a pipelined approach for maximum throughput:
Time ->
┌─────────────────────────────────────────────────────────────┐
│ Download Large Chunk 0 (parallel HTTP range requests) │
│ │ Write Chunk 0 │ Download Chunk 1 │
│ │ Write 1 │ Download │
│ │ Write... │
└─────────────────────────────────────────────────────────────┘
Two-level chunking:
- Large chunks (128MB default): Units for disk writes - fits in memory, efficient I/O
- Small chunks (4MB default): Units for HTTP range requests - parallel within large chunk
Large Chunk 0 (128MB)
├── HTTP Range 0-4MB ─┐
├── HTTP Range 4-8MB │
├── HTTP Range 8-12MB ├── Parallel (8 workers)
├── ... │
└── HTTP Range 124-128MB ─┘
│
▼
Write to disk (while downloading next large chunk)
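The splitting above can be sketched in plain Python (illustrative only, not the library's internal code):

```python
def range_headers(chunk_start: int, chunk_size: int, request_size: int) -> list[str]:
    # Split one large chunk into the inclusive byte ranges used
    # for parallel HTTP Range requests.
    end = chunk_start + chunk_size
    return [
        f"bytes={start}-{min(start + request_size, end) - 1}"
        for start in range(chunk_start, end, request_size)
    ]

MB = 1024 * 1024
ranges = range_headers(0, 128 * MB, 4 * MB)
print(len(ranges))  # 32 range requests per 128MB large chunk
print(ranges[0])    # bytes=0-4194303
```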
Upload Pipeline
Similar pipelining for uploads with prefetch:
Time ->
┌─────────────────────────────────────────────────────────────┐
│ Read Large Chunk 0 (32 parts) │
│ │ Upload Parts 0-7 (parallel) │
│ │ Upload Parts 8-15 (parallel) │
│ │ Upload Parts 16-23 (parallel) │
│ │ Upload Parts 24-31 │ Read Chunk 1│
│ │ Upload... │
└─────────────────────────────────────────────────────────────┘
Upload chunking:
- Large chunk: `max_workers_per_file * prefetch_factor * part_size` bytes read at once
- Part size: Defined by server (e.g., 64MB for HuggingFace)
- Parallel uploads: Limited by `max_workers_per_file` semaphore
With defaults (8 workers, 4 prefetch, 64MB parts):
- Large chunk = 8 * 4 * 64MB = 2GB read into memory
- 8 parts upload in parallel at any time
- While uploading, next 2GB is being read
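The arithmetic above, spelled out with the stated default values:

```python
MB = 1024 * 1024
max_workers_per_file = 8   # parallel uploads
prefetch_factor = 4        # read-ahead multiplier
part_size = 64 * MB        # server-defined part size

large_chunk = max_workers_per_file * prefetch_factor * part_size
print(large_chunk // MB)          # 2048 MB (2GB) read into memory per cycle
print(large_chunk // part_size)   # 32 parts buffered at once
```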
Configuration
Download Config
import s3impleclient as s3c
s3c.configure_download(s3c.DownloadConfig(
    chunk_size=4 * 1024 * 1024,          # 4MB per HTTP request
    write_chunk_size=128 * 1024 * 1024,  # 128MB per disk write
    max_workers=8,                       # Parallel HTTP requests
    timeout=30.0,
    max_retries=5,
))
Upload Config
s3c.configure_upload(s3c.UploadConfig(
    max_workers_per_file=8,   # Parallel uploads per file
    max_file_concurrency=4,   # Parallel files (for multi-file upload)
    prefetch_factor=4,        # Read 8*4=32 parts at once
    timeout=60.0,
    max_retries=5,
))
Logging
import logging
import s3impleclient as s3c
# See upload/download configuration
s3c.configure_logging(logging.INFO)
# See per-chunk progress details
s3c.configure_logging(logging.DEBUG)
API Reference
Download
| Function | Description |
|---|---|
| `download(url, dest, ...)` | Sync download to file |
| `download_async(url, dest, ...)` | Async download to file |
| `configure_download(config)` | Set default download config |
| `Downloader(config)` | Create custom downloader instance |
Upload
| Function | Description |
|---|---|
| `upload(file_path, ...)` | Sync upload single file |
| `upload_async(file_path, ...)` | Async upload single file |
| `upload_files(files, ...)` | Sync upload multiple files |
| `upload_files_async(files, ...)` | Async upload multiple files |
| `configure_upload(config)` | Set default upload config |
| `Uploader(config)` | Create custom uploader instance |
HuggingFace Patching
| Function | Description |
|---|---|
| `patch_huggingface_hub(config)` | Patch downloads only |
| `patch_huggingface_hub_upload(config)` | Patch uploads only |
| `patch_all(dl_config, ul_config)` | Patch both |
| `unpatch_huggingface_hub()` | Restore original download |
| `unpatch_huggingface_hub_upload()` | Restore original upload |
| `unpatch_all()` | Restore both |
| `is_patched()` | Check download patch status |
| `is_upload_patched()` | Check upload patch status |
Logging
| Function | Description |
|---|---|
| `configure_logging(level)` | Set logging level (default: WARNING) |
Documentation
See the docs/ directory for detailed documentation:
Concepts
- Parallel Range Downloads - How parallel downloads work
- Parallel Multipart Uploads - How parallel uploads work
- HuggingFace Hub Download Flow - Download integration details
- HuggingFace Hub Upload Flow - Upload integration details
Implementation
- Architecture - Code structure and design
- API Reference - Full API documentation
Examples
See the examples/ directory:
- `basic_download.py` - Sync and async download usage
- `huggingface_download.py` - HuggingFace Hub download integration
- `huggingface_patch.py` - Patching details
- `progress_callback.py` - Custom progress tracking
- `huggingface_upload.py` - HuggingFace Hub upload integration
License
Apache-2.0