A library for parallel execution of Python code in the Ufora runtime

pyfora - Compiled, parallel python

pyfora is the client package for Ufora_ - a compiled, automatically parallel Python for data science
and numerical computing.

Ufora achieves speed and scale by reasoning about your Python code to compile
it to machine code (so it's fast) and to find parallelism in it (so that it scales). The Ufora
runtime is fully fault tolerant, and handles all the details of data
management and task scheduling transparently to the user.

The Ufora runtime is invoked by enclosing code in a "ufora.remote" block. Code
and objects are shipped to a Ufora cluster and executed in parallel across
its machines. Results are then injected back into the host Python
environment either as native Python objects or, when the objects are very
large, as handles. This allows you to pick the subset of your code that
will benefit from running in Ufora, while the remainder runs in your regular
Python environment.

For all of this to work properly, Ufora places one major restriction on
the code that it runs: it must be "pure", meaning that it cannot modify data
structures or have side effects. This restriction allows the Ufora runtime to
aggressively reorder calculations, which is crucial for
parallelism, and allows it to perform compile-time
optimizations that would not be possible otherwise. For more on the subset of Python
that Ufora supports, see `python restrictions`_.
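
As a rough illustration of what "pure" means here, the sketch below contrasts a function that mutates its argument with one that builds and returns a new list. It is plain Python and does not call pyfora at all. The function names are hypothetical, and which exact constructs Ufora accepts is defined by the restrictions document rather than by this example.

```python
def normalize_in_place(values):
    # Impure: overwrites the caller's list element by element.
    # Code written in this style is what the purity restriction rules out.
    total = float(sum(values))
    for i in range(len(values)):
        values[i] = values[i] / total
    return values


def normalize(values):
    # Pure: reads its input and returns a brand-new list.
    # Nothing outside the function is modified.
    total = float(sum(values))
    return [v / total for v in values]


data = [1, 2, 3, 4]
result = normalize(data)  # data is left untouched
```

Because `normalize` never changes `data`, each element of the result can be computed independently of the others, which is the kind of freedom a runtime needs in order to reorder or parallelize work.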

.. _python restrictions:


The pyfora client is a pure python package and can be installed by running:

.. code::

    pip install pyfora

Getting Started with Ufora

The Ufora backend is available as a Docker image that can be run locally on your machine, or on a
cluster of machines on a local network or in the cloud.

- `Getting started with local Ufora`_
- `Getting started with Ufora on AWS`_
- `Running Ufora on a local cluster`_

.. _Getting started with local Ufora:
.. _Getting started with Ufora on AWS:
.. _Running Ufora on a local cluster:


Pyfora is developed and maintained by the Ufora_ team. Find us on Github_.

- `Distribute`_

.. _Distribute:

.. _Ufora:
.. _Github:

Pyfora News


* Release date: Feb-24-2016

* [feature]: Supporting member initialization in base-class __init__ functions
* [feature]: Adding support for numpy.linalg.svd
* [bug #208]: Can't convert bound instance methods from base classes


* Release date: Feb-17-2016

* [feature #78]: Improved error reporting for untranslatable code
* [feature #133]: Initial support for object inheritance
* [enhancement]: New compiler implementation produces much more efficient code
* [enhancement]: Implementation of beta function better matches scipy


* speed up fora compiler
* speed up pyfora data upload time
* fix bug in hyp2f1
* hook up many more scipy/numpy special (math) functions


* Release date: Jan-27-2016

* Make scipy optional


* Release date: Jan-26-2016

* Add support for scipy.special.gamma and scipy.special.hyp2f1


* Release date: Jan-22-2016

* [bug #17]: Can’t call static methods on instances in fora, can in python
* [bug #83]: Possibly Uninitialized Variable Analysis cannot deal with complex data-flow
* [bug #107]: Bad error message when non-bound function gets too many call args
* [feature #124]: Implement `assert`
* [bug #134]: PyInt.fora doesn't have an implementation of __mod__
* [bug #138]: Dictionary comprehensions don't work
* [feature #153]: Read files from local file-system
* [feature #154]: Logistic regression in pyfora
* [feature #155]: Gradient-boosted trees in pyfora
* [feature #159]: Add 'add worker' command to pyfora_aws
* [bug #163]: pyfora_aws has problems if "ufora" security group is already created
* [feature #168]: No feedback in pyfora_aws when things go wrong on an instance
* [bug #170]: Confusing error message when client and server versions don't match
* [feature #172]: Operator Coalescing
* [bug #176]: `isinstance` bug
* [feature #179]: Inline fora in pyfora


* Release date: Dec-10-2015

* [feature] provide pyfora wrapper for scipy.special.beta
* [feature] provide pyfora wrapper for math.log
* [feature] perf improvements for mixin binding calculations.


* Release date: Dec-08-2015

* [bug #165]: Set good default value for EXTERNAL_DATASET_LOADER_SERVICE_THREADS.
* [bug #162]: pyfora_aws docs indicate that ec2 region is optional, but parameter is in fact required.
* [feature]: pyfora_aws should propagate AWS credentials.
* [bug #145]: Cannot access data in S3.
* [bug #144]: pyfora_aws raises exception when --num-instances is 1.
* [bug #140]: ufora-worker launched with pyfora_aws only uses 8GB of memory.
* [bug #136]: Collisions with pandas and numpy on case-insensitive file-systems.
* [bug #127]: Correctly propagating communication errors up to Executor.
* [feature]: Support @property decorator.
* [feature]: Improved download performance of large lists of small objects.
* [bug #122]: Wrong exception type from `list + non_list`.
* [bug #120]: Failure when trying to convert a list of mapped functions.
* [bug #119]: Can't convert bound instance methods.
* [bug #116]: Builtin "reduce" function is not parallelizable when applied over lists, xrange, etc.
* [bug #115]: Fixing __getitem__ for strings and tuples
* [bug #111]: Wrong exception when accessing unbound variables.
* [bug #110]: Incorrect conversion of class functions in user-defined classes.
* [bug #109]: list __getitem__ doesn't throw with step 0
* [feature]: Implement `map` builtin
* [feature]: Support `isinstance` on user-defined classes.
* [feature]: Add versioning scheme to protocol.
* [feature]: Add support for the python REPL.
* [bug #90]: Improved error message for unbound free variables.
* [bug #89]: Ctrl+C doesn't break out of `with` block.
* [bug #68]: Disallow `return` statements in pyfora `with` blocks.
* [bug #67]: tuple unpacking doesn't work
* [feature]: basic linear regression on data-frames
* [feature]: basic CSV parsing
* [feature]: basic data-frames
* [bug #59]: `sequence(0)` not iterable
* [bug #47]: int/float mismatch in `**` operator
* [bug #21]: certain python variables "survive" longer than fora values

* Known Issues:

* `def` order is important in non-module function definitions (closures). If functions
  `g()` and `h()` are defined inside of function `f` and `g()` calls `h()`, then `def h():` must
  appear BEFORE `def g():`.
  This also implies that mutually recursive functions are only possible at module or class level.

* Class static methods cannot be used as values. They can be invoked, but it's not possible
  to pass a class static method as an argument to another function.

* Named argument calls are not supported. If you have a function `def f(x):...` you can call it as
  `f(42)` but you can't use `f(x=42)`.

* Keyword arguments are not supported.

* Class members can only be initialized inside of `__init__`. If `__init__` calls another function
  that initializes members, those members will not be seen by pyfora.

* `return` statements are not allowed in `__init__()`.

* The `@classmethod` decorator is not supported.

* No support for `*args`.

* `assert` is not implemented.

* Bad error message when using `self` inside of `__init__` for things other than setting or getting
  members. For example, calling `str(self)` inside of `__init__` results in
  "PythonToForaConversionError: An internal error occurred: we didn't provide a definition for the following variables: ['self'].
  Most likely, there is a mismatch between our analysis of the python code and the generated FORA code underneath. Please file a bug report."

* No support for object inheritance.
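
Several of the known issues above come down to how code is laid out. The sketch below shows one way to write plain Python that stays within them: the nested helper `h` is defined before `g`, which calls it; calls use positional arguments only; and class members are assigned directly in `__init__`, which has no `return` statement. The names `f`, `g` and `h` follow the list above, while `Point` and `norm_squared` are hypothetical. The list applies to the release it appears under, and later release notes report that some of these limits (object inheritance, `assert`) were lifted.

```python
def f(x):
    # h is defined first because g calls it (closure def-order issue).
    def h(y):
        return y * 2

    def g(y):
        return h(y) + 1

    return g(x)


class Point(object):
    def __init__(self, x, y):
        # Members are set directly here, not in a helper called
        # from __init__, and __init__ contains no return statement.
        self.x = x
        self.y = y

    def norm_squared(self):
        return self.x * self.x + self.y * self.y


answer = f(20)    # positional call; f(x=20) is listed as unsupported
p = Point(3, 4)   # positional arguments only
```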


* Release date: Nov-06-2015

* Initial release of pyfora!
* Includes support for core language features and builtin types.
* Some support for builtin functions like all, any, sum, etc.
* module and pyfora_aws script help set up a Ufora cluster in EC2.

Download files

Files for pyfora, version 0.4.1:

- pyfora-0.4.1-py2-none-any.whl (194.1 kB), wheel, Python 2.7
- pyfora-0.4.1.tar.gz (104.3 kB), source distribution
