python-parallel-collections

parallel implementations of collections with support for map/reduce style operations

These details have not been verified by PyPI

Project links

Homepage

Natural Language
- English
Operating System
- OS Independent
Programming Language
- Python

Project description

### Python Parallel Collections
#### Implementations of dict and list which support parallel map/reduce style operations

#### Who said Python was not set up for multicore computing?
In this package you'll find very simple parallel implementations of list and dict. The parallelism uses the [Python 2.7 backport](http://pythonhosted.org/futures/#processpoolexecutor-example) of the [concurrent.futures](http://docs.python.org/dev/library/concurrent.futures.html) package. If you can define your problem in terms of map/reduce/filter/flatten operations, it will run on several parallel Python processes on your machine, taking advantage of multiple cores.
Otherwise, these data structures are equivalent to the non-parallel ones found in the standard library.
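
To illustrate the idea, here is a minimal sketch of a process-based map built directly on `concurrent.futures`. It is not this package's implementation: `parallel_map` and its `executor_cls` parameter are hypothetical names chosen for the example, and it assumes Python 3, where `concurrent.futures` is part of the standard library.

```python
from concurrent.futures import ProcessPoolExecutor


def double(i):
    # Defined at module level so worker processes can look it up by name.
    return i * 2


def parallel_map(func, items, executor_cls=ProcessPoolExecutor):
    """Apply func to every item using a pool of workers.

    executor_cls is an assumption of this sketch: it lets you swap in
    ThreadPoolExecutor for quick experiments. Result order matches input order.
    """
    with executor_cls() as executor:
        return list(executor.map(func, items))
```

When using processes, call `parallel_map(double, [1, 2, 3])` from under an `if __name__ == "__main__":` guard so that worker processes can import the module safely.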

#### Examples

```python
>>> def double(i):
...     return i * 2
...
>>> list_of_list = ParallelList([[1,2,3],[4,5,6]])
>>> flat_list = list_of_list.flatten()
>>> flat_list
[1, 2, 3, 4, 5, 6]
>>> list_of_list
[[1, 2, 3], [4, 5, 6]]
>>> flat_list.map(double)
[2, 4, 6, 8, 10, 12]
>>> list_of_list.flatmap(double)
[2, 4, 6, 8, 10, 12]
```
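
For reference, the sequential behaviour shown above can be written in a few lines of plain Python. This is only a sketch of the semantics as they appear in the example; the helper names `flatten` and `flatmap` are illustrative and say nothing about how the package distributes the work.

```python
from itertools import chain


def double(i):
    return i * 2


def flatten(nested):
    """Concatenate the inner lists into one new list (one level deep)."""
    return list(chain.from_iterable(nested))


def flatmap(func, nested):
    """Flatten first, then apply func to every element."""
    return [func(item) for item in flatten(nested)]


list_of_list = [[1, 2, 3], [4, 5, 6]]
flat_list = flatten(list_of_list)         # the input list is not modified
doubled = flatmap(double, list_of_list)
```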

As you can see, every method call returns a new collection instead of changing the current one.
The exception is the `foreach` method, which is equivalent to `map` but operates directly on the current collection and returns `None` instead of a new collection.
```python
>>> flat_list
[1, 2, 3, 4, 5, 6]
>>> flat_list.foreach(double)  # returns None, so the interpreter prints nothing
>>> flat_list
[2, 4, 6, 8, 10, 12]
```
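
The difference between the two contracts can be sketched with a small list subclass. `SketchList` is a hypothetical, purely sequential stand-in written for this illustration; it mimics only the return-value behaviour described above, not the package's parallel execution.

```python
class SketchList(list):
    """Hypothetical stand-in that mimics only the map/foreach contract."""

    def map(self, func):
        # Build and return a new collection; self is left untouched.
        return SketchList(func(item) for item in self)

    def foreach(self, func):
        # Overwrite the contents in place; implicitly returns None.
        self[:] = [func(item) for item in self]


def double(i):
    return i * 2


flat_list = SketchList([1, 2, 3, 4, 5, 6])
mapped = flat_list.map(double)        # new list, flat_list unchanged so far
result = flat_list.foreach(double)    # None, flat_list is now doubled in place
```

Because `map` returns another `SketchList`, calls such as `mapped.map(str)` can be chained, while `foreach` ends a chain.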

Since every operation except `foreach` returns a collection, calls can be chained.
```python
>>> list_of_list = ParallelList([[1,2,3],[4,5,6]])
>>> list_of_list.flatmap(double).map(str)
['2', '4', '6', '8', '10', '12']
```

#### Regarding lambdas and closures
Sadly, lambdas, closures and partial functions cannot be passed between processes, so every function that you pass to the collection methods needs to be defined using the `def` statement. If you want the operation to carry extra state, use a class that defines a `__call__` method.
```python
>>> class multiply(object):
...     def __init__(self, factor):
...         self.factor = factor
...     def __call__(self, item):
...         return item * self.factor
...
>>> multiply(2)(3)
6
>>> list_of_list = ParallelList([[1,2,3],[4,5,6]])
>>> list_of_list.flatmap(multiply(2))
[2, 4, 6, 8, 10, 12]
```
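
The restriction follows from how callables reach worker processes: `ProcessPoolExecutor` can only send picklable objects, and lambdas are not picklable. The sketch below checks this with the `pickle` module directly. It assumes Python 3 and that the code runs as a normal script or module; `can_pickle` and the capitalised `Multiply` class are names made up for the illustration.

```python
import pickle


class Multiply(object):
    """Callable object that carries its own state (the factor)."""

    def __init__(self, factor):
        self.factor = factor

    def __call__(self, item):
        return item * self.factor


def can_pickle(obj):
    """Return True if obj survives pickle.dumps, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


# A lambda has no importable name, so pickling it fails.
lambda_ok = can_pickle(lambda item: item * 2)
# An instance of a module-level class pickles by reference to its class.
instance_ok = can_pickle(Multiply(2))
```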

### Quick examples of flatmap, reduce and filter

#### FlatMap

Functions passed to the `flatmap` method of a list are called with every element of the flattened list and should return a single element. For a dict, the function receives a `(key, values)` tuple for every key in the dict and should likewise return a two-element sequence.

```python
>>> def double(item):
...     return item * 2
...
>>> list_of_list = ParallelList([[1,2,3],[4,5,6]])
>>> list_of_list.flatmap(double).map(str)
['2', '4', '6', '8', '10', '12']
>>> def double_dict(item):
...     k, v = item
...     try:
...         return [k, [i * 2 for i in v]]
...     except TypeError:
...         return [k, v * 2]
...
>>> d = ParallelDict(zip(range(2), [[[1,2],[3,4]],[3,4]]))
>>> d
{0: [[1, 2], [3, 4]], 1: [3, 4]}
>>> flat_mapped = d.flatmap(double_dict)
>>> flat_mapped
{0: [2, 4, 6, 8], 1: [6, 8]}
```
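
The dict result above is consistent with each value being flattened one level before the function is called. The following plain-Python sketch reproduces that output under this assumption; `flatten_one_level`, `dict_flatmap` and `double_values` are hypothetical helpers, not part of the package.

```python
def flatten_one_level(values):
    """Flatten one level of nesting; non-list elements are kept as they are."""
    flat = []
    for value in values:
        if isinstance(value, (list, tuple)):
            flat.extend(value)
        else:
            flat.append(value)
    return flat


def dict_flatmap(func, mapping):
    """Flatten each value, then rebuild a dict from the pairs func returns."""
    return dict(
        func((key, flatten_one_level(values)))
        for key, values in mapping.items()
    )


def double_values(item):
    key, values = item
    return key, [value * 2 for value in values]


d = {0: [[1, 2], [3, 4]], 1: [3, 4]}
flat_mapped = dict_flatmap(double_values, d)
```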

#### Reduce
`reduce` accepts an optional initializer, which will be passed as the first argument to every call to the reducer function.
```python
>>> from collections import defaultdict
>>> def group_letters(all, letter):
...     all[letter].append(letter)
...     return all
...
>>> p = ParallelList(['a', 'a', 'b'])
>>> reduced = p.reduce(group_letters, defaultdict(list))
>>> reduced
{'a': ['a', 'a'], 'b': ['b']}
```
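
A sequential analogue of this example can be written with the standard library's `functools.reduce`, which takes the initializer as its third argument. This is a sketch for comparison only and assumes nothing about how the package splits the reduction across processes; the accumulator is named `groups` here to avoid shadowing the built-in `all`.

```python
from collections import defaultdict
from functools import reduce


def group_letters(groups, letter):
    # The accumulator arrives as the first argument and must be returned
    # so that it can be handed to the next call.
    groups[letter].append(letter)
    return groups


reduced = reduce(group_letters, ['a', 'a', 'b'], defaultdict(list))
as_plain_dict = dict(reduced)
```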

#### Filter
The `filter` method should be passed a predicate: a function that returns `True` or `False`. It is called once for every element of a list, or once for every `(key, values)` tuple of a dict.
```python
>>> def is_digit(item):
...     return item.isdigit()
...
>>> p = ParallelList(['a','2','3'])
>>> pred = is_digit
>>> filtered = p.filter(pred)
>>> filtered
['2', '3']

>>> def is_digit_dict(item):
...     return item[1].isdigit()
...
>>> p = ParallelDict(zip(range(3), ['a','2', '3',]))
>>> p
{0: 'a', 1: '2', 2: '3'}
>>> pred = is_digit_dict
>>> filtered = p.filter(pred)
>>> filtered
{1: '2', 2: '3'}
```
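
The predicate contract for both collections can be summarised with two plain-Python equivalents. These are sequential sketches with hypothetical names (`filter_list`, `filter_dict`) that only show what each predicate receives.

```python
def filter_list(pred, items):
    """Keep the elements for which pred(element) is true."""
    return [item for item in items if pred(item)]


def filter_dict(pred, mapping):
    """Keep the entries for which pred((key, value)) is true."""
    return {key: value for key, value in mapping.items() if pred((key, value))}


def is_digit(item):
    return item.isdigit()


def is_digit_dict(item):
    # item is a (key, value) tuple; test the value.
    return item[1].isdigit()


filtered_list = filter_list(is_digit, ['a', '2', '3'])
filtered_dict = filter_dict(is_digit_dict, {0: 'a', 1: '2', 2: '3'})
```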

Release history

- 2.0.0: Oct 22, 2016
- 1.2.1: Mar 17, 2015
- 1.2: Mar 14, 2015
- 1.1: Mar 11, 2015
- 1.0: Dec 8, 2014
- 0.2.3: Oct 26, 2014
- 0.2.2: Oct 26, 2014
- 0.2.1: Jun 11, 2014
- 0.2: Jun 9, 2014
- 0.1.9.3: Dec 4, 2013
- 0.1.9.2: Nov 29, 2013
- 0.1.9.1: Nov 29, 2013
- 0.1.9: Nov 29, 2013
- 0.1.8: Nov 27, 2013
- 0.1.7: Nov 27, 2013
- 0.1.6: Nov 27, 2013
- 0.1.5: Nov 26, 2013
- 0.1.4: Nov 26, 2013
- 0.1.3: Nov 26, 2013
- 0.1.2: Nov 26, 2013
- 0.1.1: Nov 25, 2013
- 0.1 (this version): Nov 25, 2013

Download files

Source Distribution

python-parallel-collections-0.1.tar.gz (4.2 kB)

Uploaded Nov 25, 2013 Source

File details

Details for the file python-parallel-collections-0.1.tar.gz.

File metadata

Download URL: python-parallel-collections-0.1.tar.gz
Upload date: Nov 25, 2013
Size: 4.2 kB
Tags: Source
Uploaded using Trusted Publishing? No

File hashes

Hashes for python-parallel-collections-0.1.tar.gz

| Algorithm | Hash digest |
| --- | --- |
| SHA256 | `c834b9d9006d60e7d060f11ee7e3b9c67834aaabd33a2b7cd1d5ed150c79a957` |
| MD5 | `cf68407c6693ff1a749526acf7f0b47a` |
| BLAKE2b-256 | `2b22c88c84eddd3cacaf6812a3b7490550ae807af266501ba9141094f79b33a8` |
