Download manager for Python
A Download Manager Python Module
A flexible, cache-aware download manager for Python, supporting multiple backends (requests, pycurl), with integrated caching and metadata management.
Features
- Multiple Backends: Choose between `requests` and `pycurl` for downloads.
- Cache Integration: Seamless integration with `cachedir` for efficient file reuse and metadata tracking.
- Flexible Destinations: Download to disk, in-memory buffer, or cache.
- Automatic Metadata: Tracks download status, timestamps, HTTP headers, file hashes, and more.
- Configurable: Supports configuration via Python dict or config file.
- Pre-commit, Linting, and CI: Ready for robust development workflows.
Installation
pip install git+https://github.com/saezlab/dlmachine.git
If you are developing:
git clone https://github.com/saezlab/dlmachine.git
cd dlmachine
poetry install
Usage
import dlmachine as dm
# Basic download to buffer
manager = dm.DownloadManager(backend='requests')
data = manager.download('https://www.google.com', dest=False)
print(data.read())
# Download to a file
manager = dm.DownloadManager(path='/tmp')
filepath = manager.download('https://www.google.com', dest='/tmp/google.html')
print(f"Downloaded to {filepath}")
# Download with cache integration
manager = dm.DownloadManager(path='/tmp')
filepath = manager.download('https://www.google.com')
print(f"Cached at {filepath}")
Architecture and Internals
The package is built around four core components:
- `DownloadManager`: orchestrates cache lookup, backend selection, retries, and metadata updates.
- `Descriptor`: normalizes request parameters (URL, query, headers, JSON, multipart, TLS CA path).
- `RequestsDownloader` and `CurlDownloader`: backend-specific implementations of the download workflow.
- `cachedir`: optional persistence layer for file reuse and download metadata.
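To illustrate what "normalizing request parameters" can mean in practice, here is a minimal sketch using only the standard library. `RequestSpec`, `full_url`, and `cache_key` are hypothetical names chosen for this example; they are not dlmachine's `Descriptor` API, and the hashing scheme is an assumption.

```python
import hashlib
import json
from dataclasses import dataclass, field
from urllib.parse import urlencode


@dataclass(frozen=True)
class RequestSpec:
    """Hypothetical stand-in for a request descriptor (not dlmachine's class)."""

    url: str
    query: dict = field(default_factory=dict)
    headers: dict = field(default_factory=dict)
    post: bool = False

    def full_url(self) -> str:
        # For GET requests the query is appended to the URL;
        # for POST it would travel in the request body instead.
        if self.query and not self.post:
            sep = '&' if '?' in self.url else '?'
            return f'{self.url}{sep}{urlencode(sorted(self.query.items()))}'
        return self.url

    def cache_key(self) -> str:
        # Serialize with sorted keys so that equal requests hash equally,
        # regardless of the order in which parameters were given.
        payload = json.dumps(
            {
                'url': self.url,
                'query': self.query,
                'headers': self.headers,
                'post': self.post,
            },
            sort_keys=True,
        )
        return hashlib.sha256(payload.encode()).hexdigest()
```

Two specs that differ only in parameter order produce the same key, which is the property a cache lookup by "URI + relevant download params" would need.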
Component Diagram
flowchart LR
U[User code] --> M[DownloadManager]
M --> D[Descriptor]
M --> C[(cachedir Cache)]
M --> B{backend}
B --> R[RequestsDownloader]
B --> P[CurlDownloader]
D --> R
D --> P
R --> OUT[Path or BytesIO]
P --> OUT
M --> OUT
Runtime Flow
- Build or accept a `Descriptor`.
- Resolve backend from config (`requests` by default).
- Resolve destination policy:
  - `dest='/path/file'`: force download to that path.
  - `dest=None` or `dest=True`: use cache path if cache is configured, otherwise memory buffer.
  - `dest=False`: force memory buffer.
- If cache is enabled, look up best matching item with URI + relevant download params.
- If no valid cached item exists, perform download and update cache metadata (status, timestamps, response headers, checksum, size, HTTP code).
- Return either path or `io.BytesIO`.
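The destination policy above can be sketched as a small function. This is an illustration of the three rules as described, not the package's actual code; `resolve_destination` and its `cache_path` parameter are hypothetical names.

```python
import io
from pathlib import Path
from typing import Optional, Union


def resolve_destination(
    dest: Union[str, Path, bool, None],
    cache_path: Optional[str] = None,
) -> Union[Path, io.BytesIO]:
    """Pick a target following the dest rules listed above (hypothetical helper)."""
    # An explicit path always wins.
    if isinstance(dest, (str, Path)):
        return Path(dest)
    # dest=False forces an in-memory buffer, even if a cache exists.
    if dest is False:
        return io.BytesIO()
    # dest is None or True: prefer the cache path when a cache is configured.
    if cache_path is not None:
        return Path(cache_path)
    return io.BytesIO()
```

Checking `dest is False` by identity matters here: `None` is also falsy, but it must fall through to the cache branch rather than force a buffer.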
sequenceDiagram
participant U as User
participant M as DownloadManager
participant C as Cache
participant X as Backend Downloader
U->>M: download(url, dest, kwargs)
M->>M: Build Descriptor
M->>C: best_or_new(...) if cache enabled
alt cache hit
M-->>U: return cached path
else cache miss/uninitialized
M->>X: instantiate(desc, path_or_none)
M->>X: download()
X-->>M: headers, status, bytes/file
M->>C: update metadata
M-->>U: return path or BytesIO
end
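The cache-hit and cache-miss branches of the diagram can be approximated with a dictionary standing in for the cache. This is a simplified sketch: `cached_download`, the `fetch` callable, and the metadata field names are assumptions for illustration, and the real package keys its lookup on more than the URL.

```python
import hashlib
import time
from typing import Callable, Dict, Tuple

# In-memory stand-in for the cache: key -> (payload, metadata).
_cache: Dict[str, Tuple[bytes, dict]] = {}


def cached_download(url: str, fetch: Callable[[str], bytes]) -> Tuple[bytes, dict]:
    """Return cached content if present, otherwise fetch and record metadata."""
    # Cache hit: return the stored item without calling the backend.
    if url in _cache:
        return _cache[url]
    # Cache miss: delegate to the backend (here, any callable returning bytes).
    payload = fetch(url)
    metadata = {
        'status': 'ready',
        'downloaded_at': time.time(),
        'size': len(payload),
        'sha256': hashlib.sha256(payload).hexdigest(),
    }
    _cache[url] = (payload, metadata)
    return payload, metadata
```

Passing the backend as a callable keeps the sketch testable without network access; a second call with the same URL never reaches `fetch`.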
Practical Usage Patterns
- In-memory processing: use `dest=False` to get `io.BytesIO`.
- Forced file output: pass explicit `dest='/tmp/file.ext'`.
- Cache-first retrieval: initialize `DownloadManager(path='/tmp/cache')` and call `download(url)` without `dest`.
- POST/JSON: pass `query={...}` with `post=True` or `json=True`.
- Multipart uploads: pass `multipart={...}` with file paths included in the mapping.
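Because a download may come back either as a file path or as an `io.BytesIO`, calling code often needs to handle both. The helper below is a hypothetical convenience, not part of dlmachine, showing one way to read the content uniformly.

```python
import io
from pathlib import Path
from typing import Union


def read_bytes(result: Union[str, Path, io.BytesIO]) -> bytes:
    """Read the full content of a download result (hypothetical helper)."""
    if isinstance(result, io.BytesIO):
        # getvalue() returns the whole buffer regardless of the read position.
        return result.getvalue()
    # Otherwise treat the result as a filesystem path.
    return Path(result).read_bytes()
```

With this in place, code such as `read_bytes(manager.download(url, dest=False))` and `read_bytes(manager.download(url))` would behave the same way, assuming `download` returns the types described above.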
API Overview
- `DownloadManager`: Main interface for downloads and cache management.
- `Descriptor`: Describes a download (URL, headers, POST/GET, etc).
- `CurlDownloader`: PyCurl-based downloader.
- `RequestsDownloader`: Requests-based downloader.
Configuration
You can configure the download manager via keyword arguments or a config file:
dm.DownloadManager(
path='/my/cache/dir',
backend='curl', # or 'requests'
# ...other options
)
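The config file format is not specified here, so the following is only a sketch of how file-based settings and keyword arguments could be combined. It assumes a JSON file whose keys match the keyword arguments; `load_settings` and `DEFAULTS` are hypothetical names, and the precedence (keywords over file over defaults) is likewise an assumption.

```python
import json
from pathlib import Path
from typing import Any, Dict, Optional

# Assumed defaults; 'requests' is the default backend per the runtime flow.
DEFAULTS: Dict[str, Any] = {'backend': 'requests', 'path': None}


def load_settings(config_file: Optional[str] = None, **overrides: Any) -> Dict[str, Any]:
    """Merge defaults, an optional JSON config file, and keyword overrides."""
    settings = dict(DEFAULTS)
    if config_file is not None:
        # Values from the file replace the defaults.
        settings.update(json.loads(Path(config_file).read_text()))
    # Explicit keyword arguments take precedence over the file.
    settings.update(overrides)
    return settings
```

The resulting dict could then be unpacked into the constructor, for example `dm.DownloadManager(**load_settings('conf.json'))`, provided the keys correspond to accepted keyword arguments.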
Development
- Linting: `poetry run flake8 dlmachine`
- Tests: `poetry run pytest`
- Coverage: `poetry run pytest --cov`
- Pre-commit: Install with `pre-commit install`
License
BSD 3-Clause License
Acknowledgements
Developed by the OmniPath team at Heidelberg University Hospital.
Citation
If you use this software, please cite the repository and the OmniPath team.