A synchronous wrapper for AIOHTTP
TinyRetriever: HTTP Requests Made Easy
TinyRetriever is a lightweight synchronous wrapper for AIOHTTP that abstracts away the complexities of making asynchronous HTTP requests. It is designed to be simple, easy to use, and efficient. TinyRetriever is built on top of AIOHTTP and AIOFiles, which are popular asynchronous HTTP client and file management libraries for Python.
📚 Full documentation is available here.
Features
TinyRetriever provides the following features:
- Concurrent Downloads: Efficiently download multiple files simultaneously
- Flexible Response Types: Get responses as text, JSON, or binary data
- Rate Limiting: Built-in per-host connection limiting to respect server constraints
- Streaming Support: Stream large files efficiently with customizable chunk sizes
- Unique Filenames: Generate unique filenames based on query parameters
- Works in Jupyter Notebooks: Easily use TinyRetriever in Jupyter notebooks without any additional setup or dependencies
- Robust Error Handling: Optional status raising and comprehensive error messages
- Performance Optimized: Uses orjson when available for up to 14x faster JSON parsing
TinyRetriever does not use nest-asyncio. Instead, it creates and manages a dedicated
thread for running the event loop. This allows you to use TinyRetriever in Jupyter
notebooks and other environments where an event loop is already running.
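The dedicated-thread approach can be illustrated with a minimal sketch built only on the standard library. This is not TinyRetriever's actual implementation; the `BackgroundLoop` class and the `double` coroutine are hypothetical names used for illustration. The idea is that a private event loop runs forever in a daemon thread, and synchronous callers hand it coroutines and block on the result, so it does not matter whether the calling thread already has a running loop.

```python
import asyncio
import threading
from collections.abc import Coroutine
from typing import Any, TypeVar

T = TypeVar("T")


class BackgroundLoop:
    """Hypothetical helper: an event loop that lives in its own daemon thread."""

    def __init__(self) -> None:
        self._loop = asyncio.new_event_loop()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self) -> None:
        # Bind the private loop to this thread and keep it alive.
        asyncio.set_event_loop(self._loop)
        self._loop.run_forever()

    def run(self, coro: Coroutine[Any, Any, T]) -> T:
        """Submit a coroutine from synchronous code and block until it finishes."""
        future = asyncio.run_coroutine_threadsafe(coro, self._loop)
        return future.result()

    def close(self) -> None:
        # Stop the loop from its own thread, then wait for the thread to exit.
        self._loop.call_soon_threadsafe(self._loop.stop)
        self._thread.join()
        self._loop.close()


async def double(x: int) -> int:
    await asyncio.sleep(0.01)  # stand-in for asynchronous I/O
    return 2 * x


loop = BackgroundLoop()
result = loop.run(double(21))
loop.close()
```

Because the coroutine executes on the background loop, `loop.run(...)` never calls `asyncio.run` in the caller's thread, which is the call that fails inside a notebook cell.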
There are four main functions in TinyRetriever:
- download: Download files concurrently;
- check_downloads: Validate existing downloaded files against remote file sizes;
- fetch: Fetch queries concurrently and return responses as text, JSON, or binary;
- unique_filename: Generate unique filenames based on query parameters.
Installation
Choose your preferred installation method:
Using pip
pip install tiny-retriever
Using micromamba
micromamba install -c conda-forge tiny-retriever
Alternatively, you can use conda or mamba.
Quick Start Guide
Please refer to the documentation for detailed usage instructions and more elaborate examples.
Downloading Files
from pathlib import Path
import tiny_retriever as terry
urls = ["https://example.com/file1.pdf", "https://example.com/file2.pdf"]
paths = [Path("downloads/file1.pdf"), Path("downloads/file2.pdf")]
# or generate unique filenames
paths = (terry.unique_filename(u) for u in urls)
paths = [Path("downloads", p) for p in paths]
# Download files concurrently
terry.download(urls, paths)
Fetching Data
urls = ["https://api.example.com/data1", "https://api.example.com/data2"]
# Get JSON responses
json_responses = terry.fetch(urls, "json")
# Get text responses
text_responses = terry.fetch(urls, "text")
# Get binary responses
binary_responses = terry.fetch(urls, "binary")
Validating Downloads
# Check if previously downloaded files match remote sizes
invalid = terry.check_downloads(urls, paths)
if invalid:
    for path, expected_size in invalid.items():
        print(f"{path}: local={path.stat().st_size}, expected={expected_size}")
else:
    print("All files are valid!")
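To make the idea of size-based validation concrete, here is a small local sketch. It is not how check_downloads is implemented: the `find_size_mismatches` helper is hypothetical, and the expected sizes are passed in directly rather than being read from the remote server. It returns the same kind of mapping used above, from path to expected size, for every file that is missing or has a different size.

```python
import tempfile
from pathlib import Path


def find_size_mismatches(expected: dict[Path, int]) -> dict[Path, int]:
    """Return the entries whose local file is missing or has a different size."""
    invalid = {}
    for path, size in expected.items():
        if not path.is_file() or path.stat().st_size != size:
            invalid[path] = size
    return invalid


with tempfile.TemporaryDirectory() as tmp:
    good = Path(tmp, "good.bin")
    good.write_bytes(b"12345")  # 5 bytes, matches the expected size
    truncated = Path(tmp, "truncated.bin")
    truncated.write_bytes(b"123")  # 3 bytes, simulates an interrupted download
    mismatches = find_size_mismatches({good: 5, truncated: 5})
    bad_names = sorted(p.name for p in mismatches)
```

A size check is cheap but only catches truncated or missing files; it cannot detect corrupted content of the correct length.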
Generate Unique Filenames
url = "https://api.example.com/data"
params = {"key": "value"}
# Generate unique filename based on URL and parameters
filename = terry.unique_filename(url, params=params, file_extension=".json")
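One common way to derive a stable filename from a URL and its parameters is to hash a canonical serialization of both. The sketch below shows that idea only; the `hashed_filename` helper and its SHA-256 scheme are assumptions for illustration and may differ from what unique_filename actually produces.

```python
import hashlib
import json


def hashed_filename(url, params=None, file_extension=""):
    """Hypothetical helper: hash the URL and sorted parameters into a filename."""
    # sort_keys makes the serialization independent of parameter order.
    payload = json.dumps({"url": url, "params": params or {}}, sort_keys=True)
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    return f"{digest}{file_extension}"


base = "https://api.example.com/data"
a = hashed_filename(base, {"key": "value", "page": 1}, ".json")
b = hashed_filename(base, {"page": 1, "key": "value"}, ".json")
c = hashed_filename(base, {"key": "other"}, ".json")
```

Here `a` and `b` are identical because only the parameter order differs, while `c` differs because a parameter value changed. This makes such names usable as cache keys for repeated queries.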
Advanced Usage
Custom Request Parameters
Note that you can also pass a single url and a dictionary of request parameters to the
fetch function. The default network related parameters are conservative and can be
modified as needed.
urls = "https://api.example.com/data"
kwargs = {"headers": {"Authorization": "Bearer token"}}
responses = terry.fetch(
    urls,
    return_type="json",
    request_method="post",
    request_kwargs=kwargs,
    limit_per_host=2,
    timeout=30,
)
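The effect of a per-host limit can be sketched without any network access. The example below is an illustration of the concept, not TinyRetriever's internals: `run_with_host_limit` and `fake_request` are hypothetical names, and an `asyncio.Semaphore` per host stands in for whatever connection limiting the library performs. It records the peak number of simultaneous "requests" per host to show that the limit holds.

```python
import asyncio
from collections import defaultdict
from urllib.parse import urlsplit


async def run_with_host_limit(urls, limit_per_host=2):
    """Simulate requests with at most `limit_per_host` in flight per host."""
    semaphores = defaultdict(lambda: asyncio.Semaphore(limit_per_host))
    in_flight = defaultdict(int)
    peak = defaultdict(int)

    async def fake_request(url):
        host = urlsplit(url).netloc
        async with semaphores[host]:
            in_flight[host] += 1
            peak[host] = max(peak[host], in_flight[host])
            await asyncio.sleep(0.01)  # stand-in for network I/O
            in_flight[host] -= 1
        return url

    # gather preserves the order of the input URLs in its results.
    results = await asyncio.gather(*(fake_request(u) for u in urls))
    return results, dict(peak)


demo_urls = [f"https://api.example.com/data{i}" for i in range(6)]
results, peak = asyncio.run(run_with_host_limit(demo_urls, limit_per_host=2))
```

Six requests to the same host are issued at once, but no more than two are ever active together, which is the behavior a lower `limit_per_host` is meant to give a rate-sensitive server.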
Error Handling
from tiny_retriever import fetch, ServiceError
try:
    responses = fetch(urls, return_type="json", raise_status=True)
except ServiceError as e:
    print(f"Request failed: {e}")
Configuration
TinyRetriever can be configured through environment variables:
- MAX_CONCURRENT_CALLS: Maximum number of concurrent requests (default: 10)
- Default chunk size for downloads: 1 MB
- Default timeout: 5 minutes
- Default connections per host: 4
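As a rough sketch of how an environment-driven setting like this is typically consumed, the snippet below reads MAX_CONCURRENT_CALLS and falls back to the documented default of 10. The `read_max_concurrent_calls` helper is hypothetical and is not part of TinyRetriever. When the library itself reads the variable (at import time or per call) is not stated here, so setting it before starting Python is the safest assumption.

```python
import os


def read_max_concurrent_calls(default: int = 10) -> int:
    """Hypothetical helper: parse MAX_CONCURRENT_CALLS, falling back to a default."""
    raw = os.environ.get("MAX_CONCURRENT_CALLS", "")
    try:
        value = int(raw)
    except ValueError:
        # Unset or non-numeric values fall back to the default.
        return default
    return value if value > 0 else default


os.environ["MAX_CONCURRENT_CALLS"] = "4"
configured = read_max_concurrent_calls()

os.environ.pop("MAX_CONCURRENT_CALLS")
fallback = read_max_concurrent_calls()
```

With the variable set to "4" the helper returns 4; once it is removed, the helper returns the default of 10.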
Contributing
We welcome contributions! Please see the contributing section for guidelines and instructions.
License
This project is licensed under the terms of the MIT license.