pluribus·PyPI

A pure-python highly-distributed MapReduce cluster.

These details have not been verified by PyPI

Project links

Homepage

Development Status
- 2 - Pre-Alpha
Environment
- Console
Intended Audience
- Developers
- Science/Research
License
- OSI Approved :: Apache Software License
Operating System
- OS Independent
Programming Language
- Python
Topic
- Utilities

Project description

Having just finished reading the original Google MapReduce paper, I obviously felt the need to try to implement such a system in Python.

My goal is to implement enough of the functionality described in the paper to be usable, though I strongly warn against ever using this code for anything real.

Since one of the goals (see Goals, below) is simplicity from an end-user standpoint, I am following some of Kenneth Reitz’s advice and starting with a readme and documentation.

Examples

The canonical word-count example:

# myjob.py
from pluribus import job


@job.map_
def emit_words(key, value):
    # key: document name
    # value: document contents
    for word in value.split():
        yield word, 1


@job.reduce_
def sum_occurrences(key, values):
    # key: a word
    # values: a list of counts
    return sum(values)
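
To see what this job computes without a cluster, here is a minimal single-process sketch of the map, shuffle, and reduce phases. It does not import pluribus; the two functions are plain copies of the ones above, and `run_local` is a hypothetical helper invented for illustration, not part of the package.

```python
from collections import defaultdict


def emit_words(key, value):
    # key: document name, value: document contents
    for word in value.split():
        yield word, 1


def sum_occurrences(key, values):
    # key: a word, values: a list of counts
    return sum(values)


def run_local(documents, map_fn, reduce_fn):
    """Hypothetical helper: run map, shuffle, and reduce in one process."""
    grouped = defaultdict(list)
    # Map phase: call map_fn once per input record.
    for name, contents in documents.items():
        for out_key, out_value in map_fn(name, contents):
            # Shuffle: group intermediate values by their key.
            grouped[out_key].append(out_value)
    # Reduce phase: call reduce_fn once per distinct key.
    return {key: reduce_fn(key, values) for key, values in grouped.items()}


docs = {"a.txt": "the quick fox", "b.txt": "the lazy dog the end"}
counts = run_local(docs, emit_words, sum_occurrences)
print(counts["the"])  # 3
```

A real cluster would split the map calls across workers and partition the intermediate keys among reducers; the grouping step here stands in for that shuffle.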

Assuming you’re running everything on one host, you can ignore the network connection information.

Start a pluribus master:

$ pluribus master

Start a pluribus worker (or several hundred):

$ pluribus worker

Submit the job from the master, or from another machine that can talk to the master:

$ pluribus job myjob
# ... wait
<results>

Goals

Explicit goals are:

- Simple to use, both as an administrator and as an end user.
- Well documented.
- Robust to worker failure.
- Fast enough.
- Use only the Python (2.7+) standard library (at least to run).

Explicit non-goals are:

- Be a filesystem.
- Robust to master failure.
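
As an illustration of the "robust to worker failure" goal, here is a rough sketch of how a master might requeue tasks whose worker has gone silent, loosely following the approach in the MapReduce paper (the master tracks task state and reschedules work from unresponsive workers). Everything here is an assumption made for illustration: `TaskTable`, its methods, and the timeout value are hypothetical and are not pluribus's actual API.

```python
import time


class TaskTable(object):
    """Hypothetical master-side bookkeeping; not pluribus's actual API."""

    def __init__(self, task_ids, timeout=10.0):
        self.timeout = timeout          # seconds of silence before requeueing
        self.idle = list(task_ids)      # tasks waiting for a worker
        self.in_progress = {}           # task_id -> (worker_id, last_seen)
        self.done = set()

    def assign(self, worker_id, now=None):
        """Hand the next idle task to a worker, or None if nothing is idle."""
        now = time.time() if now is None else now
        if not self.idle:
            return None
        task_id = self.idle.pop(0)
        self.in_progress[task_id] = (worker_id, now)
        return task_id

    def heartbeat(self, worker_id, now=None):
        """Record that a worker is still alive."""
        now = time.time() if now is None else now
        for task_id, (owner, _) in list(self.in_progress.items()):
            if owner == worker_id:
                self.in_progress[task_id] = (owner, now)

    def complete(self, task_id):
        """Mark an in-progress task as finished."""
        if self.in_progress.pop(task_id, None) is not None:
            self.done.add(task_id)

    def reap(self, now=None):
        """Move tasks from silent workers back to the idle queue."""
        now = time.time() if now is None else now
        expired = [task_id
                   for task_id, (_, seen) in self.in_progress.items()
                   if now - seen > self.timeout]
        for task_id in expired:
            del self.in_progress[task_id]
            self.idle.append(task_id)
        return expired


table = TaskTable(["map-0", "map-1"], timeout=10.0)
table.assign("worker-a", now=0.0)
table.assign("worker-b", now=0.0)
table.heartbeat("worker-b", now=8.0)   # worker-a never reports back
requeued = table.reap(now=12.0)
print(requeued)
```

The sketch only requeues in-progress tasks. The paper also re-runs completed map tasks from a failed worker, because their output lives on that worker's local disk; that case is left out here. Consistent with the non-goals, nothing protects the table itself if the master dies.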

Release history

0.0.1 (May 22, 2013)
