A PySpark companion for data science tasks.
Pyspark DS Toolbox
The goal of this package is to provide tools that help with day-to-day data science work in Spark.
Package Structure
pyspark-ds-toolbox
├─ .git/
├─ .github
│ └─ workflows
│   └─ package-tests.yml
├─ .gitignore
├─ LICENSE.md
├─ README.md
├─ examples
│ └─ ml_eval_estimate_shapley_values.ipynb
├─ poetry.lock
├─ pyproject.toml
├─ docs/
├─ pyspark_ds_toolbox
│ ├─ __init__.py
│ ├─ causal_inference
│ │ ├─ __init__.py
│ │ ├─ diff_in_diff.py
│ │ └─ ps_matching.py
│ ├─ ml
│ │ ├─ __init__.py
│ │ ├─ data_prep.py
│ │ └─ eval.py
│ └─ wrangling.py
├─ requirements.txt
└─ tests
  ├─ __init__.py
  ├─ conftest.py
  ├─ data
  ├─ test_causal_inference
  │ ├─ test_diff_in_diff.py
  │ └─ test_ps_matching.py
  ├─ test_ml
  │ ├─ test_data_prep.py
  │ └─ test_ml_eval.py
  ├─ test_pyspark_ds_toolbox.py
  └─ test_wrangling.py
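Going by the layout above, the modules should be reachable under the dotted paths below. This is a minimal sketch, assuming the package is installed and that the on-disk layout maps one-to-one to import paths. The helper `check_modules` is hypothetical and not part of the package, and no functions inside the modules are named because the tree does not list them.

```python
import importlib

# Dotted module paths read off the package tree (assumption: directory
# names map directly to import paths).
MODULES = [
    "pyspark_ds_toolbox.wrangling",
    "pyspark_ds_toolbox.ml.data_prep",
    "pyspark_ds_toolbox.ml.eval",
    "pyspark_ds_toolbox.causal_inference.diff_in_diff",
    "pyspark_ds_toolbox.causal_inference.ps_matching",
]


def check_modules(names):
    """Hypothetical helper: map each module name to True if it imports."""
    status = {}
    for name in names:
        try:
            importlib.import_module(name)
            status[name] = True
        except ImportError:
            # Covers a missing package as well as a missing dependency.
            status[name] = False
    return status


if __name__ == "__main__":
    for name, ok in check_modules(MODULES).items():
        print(f"{name}: {'ok' if ok else 'not importable'}")
```

Running the script in an environment without the package simply reports each path as not importable, which makes it a quick sanity check after installation.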
Download files
Source Distribution
pyspark-ds-toolbox-0.0.3a0.tar.gz (23.8 kB)
Built Distribution
Hashes for pyspark-ds-toolbox-0.0.3a0.tar.gz
Algorithm | Hash digest
---|---
SHA256 | 6fdcc47804372d45c980568e35281de91a1dbaae9a740531a52a3b4f3d702a5f
MD5 | d78c3bb3fe30f8404271f9726f9c43cf
BLAKE2b-256 | 46b95274e7b4a0858eed5b87b1f0f6d144a7f8fb98112bb5f91699a4e136f13f
Hashes for pyspark_ds_toolbox-0.0.3a0-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 969a57ba04ab9c109a57aedf69d0902d2b9d30044d79422342691bd008338c35
MD5 | 60cab6cce72093d06b8a37cb0d8ea015
BLAKE2b-256 | c8164d3ffd2d0da36da76e263f29358f592142422de9fc96d52f62dc1dd81ee2
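The digests listed above can be used to check a downloaded file before installing it. The sketch below uses only Python's standard `hashlib` and compares a local copy of the source distribution against its SHA256 value. The local file path and the helper names `sha256_of` and `matches` are assumptions made for illustration.

```python
import hashlib
from pathlib import Path

# SHA256 digest listed for pyspark-ds-toolbox-0.0.3a0.tar.gz.
EXPECTED_SHA256 = "6fdcc47804372d45c980568e35281de91a1dbaae9a740531a52a3b4f3d702a5f"


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected_hex):
    """True if the file's SHA256 equals the expected hex digest."""
    return sha256_of(path) == expected_hex.lower()


if __name__ == "__main__":
    # Assumed location: the archive sits in the current directory.
    archive = Path("pyspark-ds-toolbox-0.0.3a0.tar.gz")
    if archive.exists():
        print("digest matches" if matches(archive, EXPECTED_SHA256) else "digest MISMATCH")
    else:
        print(f"{archive} not found; download it first")
```

The same approach works for the wheel by swapping in its file name and SHA256 value.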