Enable a Pandas-like API on PySpark
<img align="right" src="docs/img/logo.jpg">
[![buildstatus](https://travis-ci.org/sparklingpandas/sparklingpandas.svg?branch=master)](https://travis-ci.org/sparklingpandas/sparklingpandas)
SparklingPandas
==============
SparklingPandas aims to make it easy to use the distributed computing power
of PySpark to scale your data analysis with Pandas. SparklingPandas builds on
Spark's DataFrame class to give you a polished, pythonic, and Pandas-like API.
Documentation
=========
See [SparklingPandas.com](http://sparklingpandas.com/).
Videos
=========
An early version of SparklingPandas was discussed in [Sparkling Pandas - using
Apache Spark to scale Pandas - Holden Karau and Juliet Hougland](https://www.youtube.com/watch?v=AcyI_V8FeIU).
Requirements
=========
SparklingPandas requires a recent version of Spark (currently v1.4; see
<http://spark.apache.org>) and Python 2.7.
Using
=========
Make sure the `SPARK_HOME` environment variable is set correctly, as
SparklingPandas uses it to locate the PySpark libraries.
Other than that, you can install SparklingPandas with pip and just import it.
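
The `SPARK_HOME` check described above can be sketched in Python as follows. This is a minimal sketch, not SparklingPandas' own lookup code: the helper name `find_pyspark_dir` is hypothetical, and it assumes the usual Spark distribution layout, where the Python bindings live under `$SPARK_HOME/python/pyspark`.

```python
import os


def find_pyspark_dir(environ=None):
    """Return the PySpark directory under SPARK_HOME, or raise an error.

    Hypothetical helper for illustration only; it is not part of
    SparklingPandas or PySpark.
    """
    environ = os.environ if environ is None else environ
    spark_home = environ.get("SPARK_HOME")
    if not spark_home:
        raise EnvironmentError("SPARK_HOME is not set")
    # Assumption: the Spark distribution keeps its Python bindings
    # under $SPARK_HOME/python/pyspark.
    pyspark_dir = os.path.join(spark_home, "python", "pyspark")
    if not os.path.isdir(pyspark_dir):
        raise EnvironmentError(
            "SPARK_HOME=%r does not contain python/pyspark" % spark_home)
    return pyspark_dir
```

Calling `find_pyspark_dir()` with no arguments reads the real environment; passing a dict makes it easy to try out without touching your shell. If the check passes, `pip install sparklingpandas` (the name the package is published under) should be the only remaining step before importing it.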
State
=========
This is in early development. Feedback is taken seriously and is seriously appreciated.
As you can tell, we SparklingPandas are a pretty serious bunch.
Support
=========
Check out our Google group at https://groups.google.com/forum/#!forum/sparklingpandas
Download files
=========
Source distribution: sparklingpandas-0.0.6.tar.gz (8.5 MB)
File hashes for sparklingpandas-0.0.6.tar.gz:

Algorithm | Hash digest
---|---
SHA256 | 9afbad1b636275e83e680e2e4b8d5fd55723ce60850af41d009e3088df2369e0
MD5 | dfdf9778c8b51e3957aea0e0441406e7
BLAKE2b-256 | 1cbb3f8272eabcad8e2cd83f4618ffbba4985c0875a81ddced1b408fe1bace06