Optimus is the missing library for cleaning and preprocessing data in a distributed fashion with PySpark.

Project description

Optimus is the missing library for cleaning and pre-processing data in a distributed fashion. It runs on Apache Spark, so its operations benefit from the Catalyst optimizer. It provides a set of tools for data wrangling and munging. Its first advantage over other public data cleaning libraries is that the same code works on a laptop or on a large cluster; its second is that it is easy to install, use, and understand.

Requirements:

  • Apache Spark 1.6

  • Python 3.5
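
A quick way to confirm that an environment meets these requirements is sketched below. This is only an illustration, not part of Optimus: the helper name `check_environment` is hypothetical, and it assumes Spark is reached through the `pyspark` Python package. It checks the interpreter version and whether `pyspark` can be located; it does not verify the Spark version.

```python
import sys
from importlib.util import find_spec


def check_environment(min_python=(3, 5)):
    """Return a list of human-readable problems; an empty list means OK.

    Hypothetical helper for illustration; not provided by Optimus.
    """
    problems = []
    # Compare only (major, minor) against the required minimum.
    if sys.version_info[:2] < min_python:
        problems.append("Python %d.%d or newer is required" % min_python)
    # find_spec returns None when a top-level module cannot be located.
    if find_spec("pyspark") is None:
        problems.append("pyspark is not importable in this environment")
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("WARNING:", problem)
```

Returning the problems as a list rather than raising keeps the check usable both in scripts and in interactive sessions.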

## Installation:

In your terminal just type:

$ pip install optimuspyspark

Contributors:

  • Original Developers: Andrea Rosales, Hugo Reyes.

  • Principal developer and maintainer: Favio Vázquez.

License:

Apache 2.0 © Iron

Download files

Source Distribution

optimuspyspark-0.5.1.tar.gz (23.5 kB)

Uploaded Source

Built Distribution

optimuspyspark-0.5.1-py3-none-any.whl (27.4 kB)

Uploaded Python 3

File details

Details for the file optimuspyspark-0.5.1.tar.gz.

File hashes

Hashes for optimuspyspark-0.5.1.tar.gz

| Algorithm | Hash digest |
| --- | --- |
| SHA256 | 92e247ce23bed3625459ac278b0481d566476590252aa5681923c13b322e1252 |
| MD5 | f2962b2f6b65addebbb73fbfbec7e45b |
| BLAKE2b-256 | 79deb44577749c9b94354d05947cc0511035eff377b3b4581fff3a7a51d49f1d |

File details

Details for the file optimuspyspark-0.5.1-py3-none-any.whl.

File hashes

Hashes for optimuspyspark-0.5.1-py3-none-any.whl

| Algorithm | Hash digest |
| --- | --- |
| SHA256 | 40825c4e533bca49fc2fcbd8a643f42dfdf2b8c2dffa0273e43cd8550b60b904 |
| MD5 | d26edaf175e37aaf7f7a347418396eff |
| BLAKE2b-256 | ca93b145e904d727317640e59e61f8e8ce197fe2233118dd81e5b470e0988d55 |
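
The published digests can be used to check a downloaded file before installing it. The following is a minimal sketch using only Python's standard `hashlib` module; the function names `sha256_of_file` and `matches_published_hash` are hypothetical helpers introduced here for illustration, and the file path is assumed to point at a locally downloaded copy.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Compute the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Return True when the file's SHA256 equals the published digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, `matches_published_hash("optimuspyspark-0.5.1.tar.gz", "92e247ce23bed3625459ac278b0481d566476590252aa5681923c13b322e1252")` should return `True` for an unmodified copy of the source distribution listed above.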
