Enable Pandas on PySpark
![logo](img/logo.jpg)
[![buildstatus](https://travis-ci.org/sparklingpandas/sparklingpandas.svg?branch=master)](https://travis-ci.org/sparklingpandas/sparklingpandas)
==============
SparklingPandas
==============
SparklingPandas aims to make it easy to use the distributed computing power
of PySpark to scale your data analysis with Pandas. SparklingPandas builds on
Spark's DataFrame class to give you a polished, Pythonic, and Pandas-like API.
Documentation
=========
None (right now).
Videos
=========
An early version of SparklingPandas was discussed in the talk [Sparkling Pandas - using
Apache Spark to scale Pandas - Holden Karau and Juliet Hougland](https://www.youtube.com/watch?v=AcyI_V8FeIU).
Requirements
=========
SparklingPandas requires Python 2.7 and a recent version of Spark (currently
v1.4), available from <http://spark.apache.org>.
Using
=========
Make sure the SPARK_HOME environment variable is set correctly, as
SparklingPandas uses it to locate the PySpark libraries.
Other than that, you can install SparklingPandas with pip and just import it.
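
Since a missing or wrong SPARK_HOME is the most likely setup problem, a quick check before importing can help. The following is only a minimal sketch: the helper name `check_spark_home` is hypothetical and not part of SparklingPandas, and it only verifies that the variable is set and points to an existing directory.

```python
import os


def check_spark_home(environ=None):
    """Return the SPARK_HOME path, or raise if it is unset or not a directory.

    Hypothetical helper for illustration; not part of SparklingPandas.
    """
    # Allow a custom mapping so the check can be exercised without
    # touching the real process environment.
    environ = os.environ if environ is None else environ
    spark_home = environ.get("SPARK_HOME")
    if not spark_home:
        raise RuntimeError("SPARK_HOME is not set")
    if not os.path.isdir(spark_home):
        raise RuntimeError("SPARK_HOME is not a directory: %s" % spark_home)
    return spark_home
```

If the check passes, `pip install sparklingpandas` followed by `import sparklingpandas` should be all that is needed, assuming the import name matches the package name.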
State
=========
This project is in early development. Feedback is taken seriously and is seriously appreciated.
As you can tell, we SparklingPandas are a pretty serious bunch.
Support
=========
Check out our Google group at https://groups.google.com/forum/#!forum/sparklingpandas
Source distribution: sparklingpandas-0.0.4.tar.gz (18.1 MB)