A generic library for concurrent/parallel task execution with idempotent caching.
Pararun
Pararun is a lightweight, fault-tolerant Python library for concurrent and parallel task execution. It simplifies running tasks using asyncio, multiprocessing, or threading, with built-in support for persistent caching (idempotency), progress bars, and streaming large datasets.
Features
- 🚀 Unified API: Simple pr.map for parallel processing and pr.aio_map for async tasks.
- 💾 Idempotent Caching: Automatically skips processed items by checking a JSONL cache file. Perfect for resumable long-running jobs.
- 🌊 Streaming Support: Handles large datasets (generators) with controlled memory usage using backpressure.
- 📊 Progress Monitoring: Integrated tqdm progress bars.
- 🛡️ Fault Tolerance: Safely handles crashes by flushing results to disk periodically.
Installation
pip install pararun
Quick Start
1. Parallel Processing (CPU/IO Bound)
Use pr.map for blocking functions. It is built on concurrent.futures.
import pararun as pr
import time

def process_file(filename):
    time.sleep(0.1)  # Simulate blocking work
    return {"id": filename, "status": "done"}

# Works with Lists or Generators
files = (f"data_{i}.txt" for i in range(100))

# Result is saved to 'results.jsonl' automatically
pr.map(
    func=process_file,
    iterable=files,
    n_workers=4,
    cache_path="results.jsonl"
)
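Once a run has finished, the cache file can be inspected with the standard library alone. The sketch below assumes that each line of the file holds one JSON object returned by func, which is the usual JSONL convention; load_results is a hypothetical helper and not part of pararun.

```python
import json
from pathlib import Path


def load_results(path):
    """Read a JSONL file into a list of records, skipping blank lines.

    Hypothetical helper: assumes one JSON value per non-empty line.
    """
    results = []
    with Path(path).open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line:
                results.append(json.loads(line))
    return results
```

For the example above, load_results("results.jsonl") would be expected to yield dicts carrying the "id" and "status" fields returned by process_file.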
2. Async Processing (AsyncIO)
Use pr.aio_map for native async functions.
import pararun as pr
import asyncio

async def fetch_url(item):
    await asyncio.sleep(0.1)  # Simulate network request
    return {"id": item["url"], "status": 200}

async def main():
    urls = [{"url": f"https://example.com/{i}"} for i in range(100)]
    await pr.aio_map(
        func=fetch_url,
        iterable=urls,
        n_workers=10,
        cache_path="async_results.jsonl"
    )

if __name__ == "__main__":
    asyncio.run(main())
Advanced Usage
Idempotency & Resuming
When cache_path is provided, pararun reads the file (if it exists) to determine which items have already been processed.
By default, it assumes the output items contain an "id" field. You can customize this field using the key_field parameter:
pr.map(..., cache_path="cache.jsonl", key_field="filename")
- Run 1: Process 50% of items -> Crash.
- Run 2: Point to the same cache_path. pararun skips the first 50% and resumes from where it left off.
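The resume behaviour described above can be illustrated with a small, sequential sketch. It is not pararun's implementation: load_done_keys, run_resumable, and the key_of parameter (a function mapping an input item to its cache key) are hypothetical names introduced here, and concurrency is left out to keep the idea visible.

```python
import json
import os


def load_done_keys(cache_path, key_field="id"):
    """Return the set of key values already recorded in a JSONL cache."""
    done = set()
    if not os.path.exists(cache_path):
        return done
    with open(cache_path, encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                continue  # skip lines that fail to parse (e.g. a partial write)
            if isinstance(record, dict) and key_field in record:
                done.add(record[key_field])
    return done


def run_resumable(func, items, cache_path, key_field="id",
                  key_of=lambda item: item):
    """Process only items whose key is absent from the cache.

    Each new result is appended to the cache as one JSON line.
    Returns the number of items processed in this run.
    """
    done = load_done_keys(cache_path, key_field)
    processed = 0
    with open(cache_path, "a", encoding="utf-8") as fh:
        for item in items:
            if key_of(item) in done:
                continue  # already handled in an earlier run
            result = func(item)
            fh.write(json.dumps(result) + "\n")
            fh.flush()  # hand the record to the OS before moving on
            processed += 1
    return processed
```

Calling run_resumable twice with the same cache_path processes each item only once. A production version would also need to repair a truncated final line before appending, which this sketch does not attempt.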
Streaming Large Datasets
pararun is designed to be memory efficient. It uses bounded queues (semaphores) to ensure that even if you pass a generator with 100M items, only n_workers * 2 items are held in memory at any time.
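One way to get this kind of backpressure with the standard library is to guard submissions with a semaphore. The following is a minimal sketch under that assumption, not pararun's actual code; bounded_map and its on_result callback are hypothetical names. The producer loop blocks whenever n_workers * 2 items are already submitted and unfinished, so the generator is consumed only as fast as the workers drain it.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def bounded_map(func, iterable, n_workers, on_result):
    """Apply func to every item with at most n_workers * 2 items in flight.

    on_result is called with each result (under a lock) instead of
    collecting results in a list, so memory use stays bounded.
    Note: exceptions raised by func are not reported in this sketch.
    """
    slots = threading.BoundedSemaphore(n_workers * 2)
    lock = threading.Lock()

    def task(item):
        try:
            value = func(item)
            with lock:
                on_result(value)
        finally:
            slots.release()  # free a slot so the producer can continue

    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        for item in iterable:
            slots.acquire()  # blocks while n_workers * 2 items are pending
            pool.submit(task, item)
    # Leaving the with-block waits for all submitted tasks to finish.
```

In a resumable setup, on_result would typically append the result to the JSONL cache rather than keep it in memory.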
Development
Install dependencies and run tests:
# Install package in editable mode
pip install -e .
# Run tests
python -m pytest
License
MIT
File details
Details for the file pararun-0.1.0.tar.gz.
File metadata
- Download URL: pararun-0.1.0.tar.gz
- Upload date:
- Size: 6.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 636e721acb7b7927d08e60fc58ac86b4ec55bedab3f2cb943940bf6a155cacb0 |
| MD5 | 6aae6c3b19ea6d8403987c4774c80696 |
| BLAKE2b-256 | c20fa8e8b4b105573b73ec24fbc011f4256a57d1e0c10556107d78887dcc7b0d |
File details
Details for the file pararun-0.1.0-py3-none-any.whl.
File metadata
- Download URL: pararun-0.1.0-py3-none-any.whl
- Upload date:
- Size: 7.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 7f0227eadfeba080edadeaff2fa27a99a4582ed72c191b896a94b4136272167e |
| MD5 | b95555a078b53073e32032c90f0c9b7a |
| BLAKE2b-256 | 80dd82af70c578bef471d5219b01bedc037f343c5332918f891bc8aa6f89da25 |