
A library built on top of Ray to make embarrassingly parallel tasks embarrassingly easy


R-3PO -- Richard's Parallel Processing Pipeline

Introduction

A library built on top of Ray to make embarrassingly parallel problems embarrassingly easy.

Suppose you have many data files that all need to be processed in exactly the same way, with the same function, and you want to save the results to a CSV file. This is an embarrassingly parallel problem: it should be embarrassingly easy.

And that's what R3PO aims to deliver: R3PO lets you do it with a config.yaml file and three lines of code.

config.yaml:

job_name: count_produce
output_path: /home/lieu/dev/r3po/sample/output_dir
processes: 2
source_file_part: .json
source_path: /home/lieu/dev/r3po/sample/produce_log
working_dir: /home/lieu/dev/r3po/sample/working_dir

main.py:

from r3po import jobbuilder, jobrunner
# Import the function that will be called by your processes
from count_fruits import count_fruits

CONFIG_YAML_FP = './config.yaml'

# Build jobs
jobbuilder.build_jobs(CONFIG_YAML_FP)

# Run jobs
jobrunner.run_jobs(CONFIG_YAML_FP, count_fruits)

This will run the function count_fruits on all the .json files in source_path, and save the results as CSVs in output_path (one row per JSON file).
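
The README does not show count_fruits itself. As a minimal sketch, assuming the function receives the path of one source file and returns a dict that maps onto one CSV row, it might look like the following. Both of those conventions are assumptions, as is the JSON layout (a list of records with a "type" field); check the sample directory for the signature R3PO actually expects.

```python
import json
from collections import Counter


def count_fruits(json_fp):
    """Hypothetical worker: count produce entries by type in one JSON file.

    Assumes the file holds a list of objects such as {"type": "apple"}.
    The argument and return conventions are assumptions, not R3PO's
    documented API.
    """
    with open(json_fp, encoding="utf-8") as f:
        records = json.load(f)
    counts = Counter(record["type"] for record in records)
    # One dict per input file, intended to become one CSV row.
    return {"file": json_fp, **counts}
```

Keeping the worker a plain function of one file path means it can be tested on a single file before it is handed to a parallel runner.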

That's it! R3PO automatically handles the distribution of tasks to processes, saves your progress so you can stop and restart the job anytime, and logs all errors automatically.
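
To make the three features above concrete, here is a rough sketch of the general pattern: distribute files to workers, record finished files so a restart skips them, and log failures instead of crashing. This is not R3PO's implementation. It uses the standard library's concurrent.futures thread pool rather than Ray so that it runs without extra dependencies, and the names run_resumable, file_part, and done_log are hypothetical.

```python
import csv
import logging
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def run_resumable(source_dir, file_part, output_csv, done_log, worker, processes=2):
    """Apply `worker` to each matching file, skipping files already marked done.

    `worker` takes a file path and returns a list of values (one CSV row).
    Returns the number of files attempted in this run.
    """
    done_path = Path(done_log)
    done = set(done_path.read_text().splitlines()) if done_path.exists() else set()
    pending = sorted(
        str(p) for p in Path(source_dir).iterdir()
        if file_part in p.name and str(p) not in done
    )

    def safe(fp):
        # Log the error and carry on, so one bad file does not stop the job.
        try:
            return fp, worker(fp)
        except Exception:
            logging.exception("failed on %s", fp)
            return fp, None

    with ThreadPoolExecutor(max_workers=processes) as pool, \
            open(output_csv, "a", newline="") as out, \
            open(done_path, "a") as done_f:
        writer = csv.writer(out)
        for fp, row in pool.map(safe, pending):
            if row is None:
                continue  # failed files stay unmarked and are retried next run
            writer.writerow(row)
            done_f.write(fp + "\n")
            done_f.flush()  # persist progress as soon as a file completes
    return len(pending)
```

Calling the function a second time with the same arguments only processes files that are missing from the done log, which is what makes stopping and restarting safe in this sketch.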

Quickstart (worked example)

[TODO] -- but check the sample directory

Installation

pip3 install r3po
