DPark

Python clone of Spark, a MapReduce-like computing framework supporting iterative algorithms.

Project description

DPark is a Python clone of Spark, a MapReduce-like computing framework that supports iterative computation.

Example for word counting (wc.py):

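import dpark

# Read the input file as a dataset of lines.
lines = dpark.textFile("/tmp/words.txt")
# Split each line into words and pair every word with a count of 1.
words = lines.flatMap(lambda x: x.split()).map(lambda x: (x, 1))
# Sum the counts per word and collect the result as a dict.
wc = words.reduceByKey(lambda x, y: x + y).collectAsMap()
print(wc)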
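
To make the data flow in wc.py concrete, here is a minimal pure-Python sketch of what the flatMap, map and reduceByKey steps compute on a small in-memory list. It does not use DPark and says nothing about how DPark partitions or distributes the work; the function name word_count is illustrative only.

```python
from collections import defaultdict


def word_count(lines):
    """Single-process illustration of flatMap -> map -> reduceByKey."""
    # flatMap: split every line into words and flatten into one sequence
    words = [word for line in lines for word in line.split()]
    # map: pair every word with an initial count of 1
    pairs = [(word, 1) for word in words]
    # reduceByKey: combine the counts of identical keys with x + y
    counts = defaultdict(int)
    for word, n in pairs:
        counts[word] += n
    # collectAsMap: hand the result back as a plain dict
    return dict(counts)


if __name__ == "__main__":
    print(word_count(["a b a", "b c"]))
```

The DPark version expresses the same three steps as chained calls, so the work can be split across processes or cluster nodes.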

This script can run locally or on a Mesos cluster without modification; only the command-line arguments differ:

$ python wc.py
$ python wc.py -m process
$ python wc.py -m host[:port]

See examples/ for more use cases.

Some more docs (in Chinese): https://github.com/jackfengji/test_pro/wiki

DPark can run with Mesos 0.9 or higher.

If the $MESOS_MASTER environment variable is set, you can use a shortcut and run DPark on Mesos by typing:

$ python wc.py -m mesos

$MESOS_MASTER can use any Mesos master URL scheme, such as:

$ export MESOS_MASTER=zk://zk1:2181,zk2:2181,zk3:2181/mesos_master
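
As a rough illustration of how such a value is structured, the sketch below splits a master URL into its scheme, host list and path using only the Python standard library. The helper name describe_master is hypothetical, and this is not how DPark or Mesos parse the value internally; it only shows which parts the example above contains.

```python
from urllib.parse import urlparse


def describe_master(url):
    """Split a master URL into (scheme, hosts, path). Illustrative helper only."""
    parsed = urlparse(url)
    # netloc holds the comma-separated ZooKeeper ensemble as a single string
    hosts = parsed.netloc.split(",") if parsed.netloc else []
    return parsed.scheme, hosts, parsed.path


if __name__ == "__main__":
    print(describe_master("zk://zk1:2181,zk2:2181,zk3:2181/mesos_master"))
```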

To speed up shuffling, deploy Nginx on port 5055 to serve the data in DPARK_WORK_DIR (default: /tmp/dpark), for example:

server {
        listen 5055;
        server_name localhost;
        root /tmp/dpark/;
}
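
With the configuration above, Nginx maps a file under the root directory to a URL with the same relative path. The sketch below shows that mapping as a small helper. The name served_url and the sample path are hypothetical, and the layout DPark actually uses inside its work directory is not described here, so treat this as a way to reason about the config rather than as DPark's own logic.

```python
import posixpath


def served_url(path, root="/tmp/dpark", host="localhost", port=5055):
    """Return the URL under which a file below `root` would be served."""
    root = posixpath.normpath(root)
    path = posixpath.normpath(path)
    # Only files inside the configured root are reachable through Nginx
    if path != root and not path.startswith(root + "/"):
        raise ValueError("%s is not under %s" % (path, root))
    relative = posixpath.relpath(path, root)
    return "http://%s:%d/%s" % (host, port, "" if relative == "." else relative)


if __name__ == "__main__":
    # "shuffle/0/1" is a made-up path used only for illustration
    print(served_url("/tmp/dpark/shuffle/0/1"))
```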

Mailing list: dpark-users@googlegroups.com (http://groups.google.com/group/dpark-users)

Release history

0.5.0   Jul 27, 2018
0.4.2   Mar 19, 2018
0.4.1   Mar 9, 2017
0.4.0   Dec 6, 2016
0.3.5   Oct 27, 2016
0.3.3   Sep 20, 2016
0.3.2   Jun 7, 2016
0.3.1   May 25, 2016
0.2.9   Mar 15, 2016
0.2.8   Mar 4, 2016
0.2.7   Feb 24, 2016
0.2.6   Jan 25, 2016
0.2.5   Jan 25, 2016
0.2.4   Jan 19, 2016
0.2.3   Jan 18, 2016
0.2.2   Jan 18, 2016
0.2.1   Jan 4, 2016
0.2     Dec 24, 2015 (this version)
0.1     Dec 24, 2015

Download files

Source Distribution

DPark-0.2.tar.gz (97.9 kB)

Uploaded Dec 24, 2015 Source

File details

Details for the file DPark-0.2.tar.gz.

File metadata

Download URL: DPark-0.2.tar.gz
Upload date: Dec 24, 2015
Size: 97.9 kB
Tags: Source
Uploaded using Trusted Publishing? No

File hashes

Hashes for DPark-0.2.tar.gz
Algorithm	Hash digest
SHA256	`ca9881a639a8755273b738ce7768dff612e21f74e2e5ad34bfefb58d5af8ce2b`
MD5	`8cd3ff11d400754e100b6bc2b4384001`
BLAKE2b-256	`b5ea2d330c8eb188308417efd0aaa70dc8efc27204d3305b68f8155c43a8addf`
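
One way to check a downloaded archive against the SHA256 digest listed above is sketched below, using only the standard hashlib module. The function names and the local file name are assumptions; adjust the path to wherever the file was saved.

```python
import hashlib

# SHA256 digest published above for DPark-0.2.tar.gz
EXPECTED_SHA256 = "ca9881a639a8755273b738ce7768dff612e21f74e2e5ad34bfefb58d5af8ce2b"


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected=EXPECTED_SHA256):
    """Return True if the file at `path` matches the expected digest."""
    return sha256_of(path) == expected


if __name__ == "__main__":
    # Assumes the archive was saved in the current directory
    print(verify("DPark-0.2.tar.gz"))
```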
