A library built on top of Ray to make embarrassingly parallel tasks embarrassingly easy
R-3PO -- Richard's Parallel Processing Pipeline
Introduction
A library built on top of Ray to make embarrassingly parallel problems embarrassingly easy.
Suppose you have many data files that all need to be processed by the same function, and you want to save the results of that processing to a CSV file. This is an embarrassingly parallel problem, so it should be embarrassingly easy.
And that's what R3PO aims to deliver: R3PO lets you do it with a config.yaml
file and three lines of code.
config.yaml:
job_name: count_produce
output_path: /home/lieu/dev/r3po/sample/output_dir
processes: 2
source_file_part: .json
source_path: /home/lieu/dev/r3po/sample/produce_log
working_dir: /home/lieu/dev/r3po/sample/working_dir
main.py:
from r3po import jobbuilder, jobrunner
# Import the function that will be called by your processes
from count_fruits import count_fruits
CONFIG_YAML_FP = './config.yaml'
# Build jobs
jobbuilder.build_jobs(CONFIG_YAML_FP)
# Run jobs
jobrunner.run_jobs(CONFIG_YAML_FP, count_fruits)
This will run the function count_fruits on all the .json files
in source_path, and save the results as CSVs in output_path
(one row per JSON file).
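The source does not show count_fruits itself, so the following is only a minimal sketch of what such a worker function might look like. It assumes each JSON file holds a flat list of item names, that the function receives a single file path, and that it returns a flat dict which can become one CSV row. All three are assumptions for illustration, not R3PO's documented contract; check the sample directory for the real function.

```python
import json
from collections import Counter


def count_fruits(json_path):
    """Hypothetical worker: tally the items listed in one JSON file.

    Assumes the file contains a flat JSON list of strings, such as
    ["apple", "pear", "apple"]. The input layout, the single-path
    argument, and the dict return value are all assumptions.
    """
    with open(json_path, encoding="utf-8") as f:
        items = json.load(f)
    # Count how many times each item name appears in this one file.
    counts = Counter(items)
    # A flat dict maps naturally onto one CSV row: keys become columns.
    return dict(counts)
```

Because the function touches only the one file it is given and shares no state, calls on different files are independent, which is what makes the job embarrassingly parallel.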
That's it! R3PO automatically handles the distribution of tasks to processes, saves your progress so you can stop and restart the job anytime, and logs all errors automatically.
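To make those three behaviours concrete, here is a rough standard-library sketch of the general pattern: hand tasks to a pool of workers, record each finished file so a restart can skip it, and log failures without stopping the job. This is not R3PO's implementation. R3PO is built on Ray and uses processes, whereas this sketch uses a thread pool to stay short, and the names run_resumable, out_csv, and done_log are hypothetical.

```python
import csv
import logging
from concurrent.futures import ThreadPoolExecutor, as_completed
from pathlib import Path


def run_resumable(paths, func, out_csv, done_log, workers=2):
    """Apply func to each path, skipping paths already listed in done_log.

    Returns the number of paths attempted in this run.
    """
    done_log = Path(done_log)
    # Paths recorded by an earlier run are treated as finished.
    done = set(done_log.read_text().splitlines()) if done_log.exists() else set()
    todo = [str(p) for p in paths if str(p) not in done]

    with ThreadPoolExecutor(max_workers=workers) as pool, \
            open(out_csv, "a", newline="") as out, \
            open(done_log, "a") as progress:
        writer = csv.writer(out)
        futures = {pool.submit(func, p): p for p in todo}
        for fut in as_completed(futures):
            path = futures[fut]
            try:
                result = fut.result()
            except Exception:
                # Log and move on so one bad file does not stop the job.
                logging.exception("failed on %s", path)
                continue
            writer.writerow([path, result])
            # Record progress only after the row has been written.
            progress.write(path + "\n")
            progress.flush()
    return len(todo)
```

In this sketch a file that raises is never written to the progress log, so it is retried on the next run, while files that succeeded are skipped.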
Quickstart (worked example)
[TODO] -- but check the sample directory
Installation
pip3 install r3po