Helpers & syntax sugar for PySpark.

Sparkly

Helpers & syntax sugar for PySpark. There are several features to make your life easier:

  • Definition of spark packages, external jars, UDFs and spark options within your code;
  • Simplified reader/writer api for Cassandra, Elastic, MySQL, Kafka;
  • Testing framework for spark applications.
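To make the first bullet concrete: without such in-code definitions, packages, jars and options are usually handed to Spark through `spark-submit` flags or the `PYSPARK_SUBMIT_ARGS` environment variable, which PySpark reads when it starts the JVM. The sketch below shows what that string looks like. It is an illustration only, not Sparkly's implementation; `build_submit_args` and the package coordinates in the usage note are hypothetical names.

```python
def build_submit_args(packages=(), jars=(), options=None):
    """Render session settings as a PYSPARK_SUBMIT_ARGS-style string.

    Hypothetical helper for illustration; not part of Sparkly or PySpark.
    """
    args = []
    if packages:
        # Maven coordinates are passed as one comma-separated value.
        args.append('--packages ' + ','.join(packages))
    if jars:
        # Local jar paths are passed the same way.
        args.append('--jars ' + ','.join(jars))
    # Each Spark option becomes its own --conf flag (sorted for stable output).
    for key, value in sorted((options or {}).items()):
        args.append('--conf "{}={}"'.format(key, value))
    # PySpark expects the value to end with the `pyspark-shell` resource.
    args.append('pyspark-shell')
    return ' '.join(args)
```

For example, `build_submit_args(packages=['com.example:connector:1.0'], options={'spark.sql.shuffle.partitions': 10})` yields a string that could be assigned to `os.environ['PYSPARK_SUBMIT_ARGS']` before the first Spark session is created. Declaring the same values on a session class keeps them next to the code that depends on them.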

More details can be found in the official documentation.


Installation

Sparkly itself is easy to install:

pip install sparkly

The tricky part is pyspark, which has no official distribution on PyPI. As a workaround, we suggest one of the following:

  1. Use the PYTHONPATH environment variable to point to your Spark installation, for example:

    export PYTHONPATH="/usr/local/spark/python/lib/"
  2. Use our file for pyspark. Just add this to your requirements.txt:

    -e git+

Here at Tubular, we published pyspark to our internal PyPI repository.
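The PYTHONPATH workaround above can also be applied from inside Python, before `pyspark` is imported. The following is a minimal sketch, assuming the usual layout of a Spark distribution (a `python/` directory with zipped dependencies such as py4j under `python/lib/`). The function names are hypothetical and are not part of Sparkly.

```python
import glob
import os
import sys


def spark_python_paths(spark_home):
    """Return the paths that need to be importable for `import pyspark`."""
    python_dir = os.path.join(spark_home, 'python')
    # Assumption: the distribution ships py4j (and a zipped pyspark)
    # as .zip archives under python/lib.
    archives = sorted(glob.glob(os.path.join(python_dir, 'lib', '*.zip')))
    return [python_dir] + archives


def add_spark_to_sys_path(spark_home):
    """Prepend any missing Spark paths to sys.path and return what was added."""
    added = [p for p in spark_python_paths(spark_home) if p not in sys.path]
    sys.path[:0] = added
    return added
```

With the installation path from the export example, this would be called as `add_spark_to_sys_path('/usr/local/spark')` at the top of a script. Calling it a second time adds nothing, so it is safe to use from several entry points.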

Getting Started

Here is a small code snippet that reads a Cassandra table and writes its contents to an Elasticsearch index:

from sparkly import SparklySession

class MySession(SparklySession):
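    packages = [
        # The package coordinates are cut off in the source; list the
        # Cassandra and Elasticsearch connector packages here.
    ]


if __name__ == '__main__':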
    spark = MySession()
    df = spark.read_ext.cassandra('localhost', 'my_keyspace', 'my_table')
    df.write_ext.elastic('localhost', 'my_index', 'my_type')

See the online documentation for more details.
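For comparison, here is a rough sketch of the same copy written against the plain PySpark reader and writer API. It is not how Sparkly implements `read_ext` or `write_ext`; it only assumes the data source names and option keys documented by the Cassandra and Elasticsearch Spark connectors, and that both connectors are already on the classpath. The function name `cassandra_to_elastic` is hypothetical.

```python
def cassandra_to_elastic(spark, cassandra_host, keyspace, table,
                         es_host, es_index, es_type):
    """Copy a Cassandra table into an Elasticsearch index with plain PySpark.

    `spark` is expected to be a SparkSession (or anything with the same
    `read` / `write` builder interface).
    """
    df = (
        spark.read
        .format('org.apache.spark.sql.cassandra')
        .options(**{
            'spark.cassandra.connection.host': cassandra_host,
            'keyspace': keyspace,
            'table': table,
        })
        .load()
    )
    (
        df.write
        .format('org.elasticsearch.spark.sql')
        .options(**{
            'es.nodes': es_host,
            # Elasticsearch resources are addressed as "<index>/<type>".
            'es.resource': '{}/{}'.format(es_index, es_type),
        })
        .save()
    )
    return df
```

The call `cassandra_to_elastic(spark, 'localhost', 'my_keyspace', 'my_table', 'localhost', 'my_index', 'my_type')` mirrors the arguments used in the snippet above. The Sparkly version is shorter mainly because the format names and option keys are hidden behind the helper methods.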


Testing

To run the tests, you need docker and docker-compose installed on your system. If you are working on macOS, we highly recommend using docker-machine. Once these tools are installed, all you need to run is:

make test

Supported Spark Versions

At the moment we support:

Sparkly version | Spark version
sparkly >= 2.7  | Spark 2.4.x
sparkly 2.x     | Spark 2.0.x, 2.1.x, and 2.2.x
sparkly 1.x     | Spark 1.6.x

Files for sparkly, version 2.8.2

Filename: sparkly-2.8.2.tar.gz (33.7 kB) | File type: Source | Python version: None
