A PySpark companion for data science tasks.
Project description
Pyspark DS Toolbox
The package aims to provide a set of tools that help with the day-to-day work of data science on Spark. The documentation can be found here.
Installation
Directly from PyPI:
pip install pyspark-ds-toolbox
or from GitHub:
pip install git+https://github.com/viniciusmsousa/pyspark-ds-toolbox.git
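To confirm that either install command succeeded, the installed version can be read from the package metadata. This is a minimal sketch using only the standard library; the helper name `installed_version` is hypothetical and not part of the package, and the distribution name is taken from the pip command above.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str = "pyspark-ds-toolbox") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent.

    `installed_version` is an illustrative helper, not part of pyspark-ds-toolbox.
    """
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        # The distribution is not installed in the current environment.
        return None


if __name__ == "__main__":
    version = installed_version()
    if version is None:
        print("pyspark-ds-toolbox is not installed")
    else:
        print(f"pyspark-ds-toolbox {version} is installed")
```

Running the script in the environment used for `pip install` prints the version that pip resolved, which is useful when switching between the PyPI release and the GitHub install.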
Organization
The package is currently organized by the nature of the task, such as data wrangling, model/prediction evaluation, and so on.
pyspark_ds_toolbox # Main Package
├─ causal_inference # Sub-package dedicated to Causal Inference
│ ├─ diff_in_diff.py # Module Diff in Diff
│ └─ ps_matching.py # Module Propensity Score Matching
├─ ml # Sub-package dedicated to ML
│ ├─ data_prep.py # Module for Data Preparation
│ └─ eval.py # Module for model/prediction evaluation
└─ wrangling.py # Module for general Data Wrangling
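The tree above maps directly to dotted import paths. The following sketch checks which of those modules can be located in the current environment. It assumes only the module names shown in the tree and does not call any function from the package; `module_available` and `MODULES` are hypothetical names used for illustration.

```python
import importlib.util

# Dotted paths derived from the directory tree in this section.
MODULES = [
    "pyspark_ds_toolbox.causal_inference.diff_in_diff",
    "pyspark_ds_toolbox.causal_inference.ps_matching",
    "pyspark_ds_toolbox.ml.data_prep",
    "pyspark_ds_toolbox.ml.eval",
    "pyspark_ds_toolbox.wrangling",
]


def module_available(name: str) -> bool:
    """Return True if the module can be located, False otherwise."""
    try:
        return importlib.util.find_spec(name) is not None
    except ImportError:
        # Raised when a parent package is missing or fails to import.
        return False


if __name__ == "__main__":
    for name in MODULES:
        status = "found" if module_available(name) else "missing"
        print(f"{name}: {status}")
```

Because the layout may change between releases, a check like this is a quick way to see whether an installed version still matches the structure described here.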
Download files
Source Distribution
pyspark-ds-toolbox-0.0.3a1.tar.gz (24.2 kB)
Built Distribution
Hashes for pyspark-ds-toolbox-0.0.3a1.tar.gz
Algorithm | Hash digest
---|---
SHA256 | 5e4d125dd5f11217a102ae0c80ad4fe0256f72a839a809dc58006373d1d6daef
MD5 | 6e0af963a7cae3d3732df053cf465bcf
BLAKE2b-256 | 90fd79f9f88396e9e43239de4fbcda52768e359838fb68a636e0f03e19576826
Hashes for pyspark_ds_toolbox-0.0.3a1-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 0e50a54f08a773c87f89ae97d4a180a28022550223ca7714b5ca7a3940ece616
MD5 | 8bd0298311384c519bc14eab773f1c12
BLAKE2b-256 | a8c7e1a5e18d605866b95af39cdcf8ae14db0ca5fd35108ed3025926ea4a2e55
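The SHA256 digests listed above can be used to check a manually downloaded file before installing it. The sketch below uses the standard `hashlib` module; the helper names `sha256_of_file` and `matches_published_digest` are hypothetical, and the local file path in the usage comment is an assumption about where the archive was saved.

```python
import hashlib

# SHA256 digest published above for pyspark-ds-toolbox-0.0.3a1.tar.gz.
SDIST_SHA256 = "5e4d125dd5f11217a102ae0c80ad4fe0256f72a839a809dc58006373d1d6daef"


def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path: str, expected_hex: str) -> bool:
    """Return True if the file's SHA256 digest equals the expected hex string."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# Example usage (assumes the archive was saved in the working directory):
# matches_published_digest("pyspark-ds-toolbox-0.0.3a1.tar.gz", SDIST_SHA256)
```

A `False` result means the local file differs from the published one and should be downloaded again rather than installed.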