For easy implementation of Spark's machine learning library

Development Status
- 1 - Planning
Intended Audience
- Developers
Operating System
Programming Language
- Python :: 3.8

Project description

SparkAutoML

This is a TRIAL version

The main idea of the package is to make it easy to build and deploy models with PySpark's machine learning library.

If you want to contribute, please reach out to fahadakbar@gmail.com.

Thank you!

https://user-images.githubusercontent.com/19522276/163658399-17786b20-0208-44ff-b111-98b74ab34d25.mp4

How to install

PySpark version 3.2.1 or higher is required.

pip install SparkAutoML
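Since PySpark 3.2.1 or higher is required, it can help to check the installed version first. The following is a minimal sketch using only the Python standard library. The helper names `meets_minimum` and `installed_pyspark_version` are hypothetical and are not part of SparkAutoML, and the comparison ignores pre-release suffixes such as `.dev0`.

```python
from importlib import metadata
from typing import Optional, Tuple

MINIMUM_PYSPARK = "3.2.1"  # the requirement stated in this README


def _numeric_prefix(version: str) -> Tuple[int, ...]:
    """Turn '3.2.1' or '3.3.0.dev0' into a tuple of its leading integers."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break  # stop at the first non-numeric component (e.g. 'dev0')
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(version: str, minimum: str = MINIMUM_PYSPARK) -> bool:
    """Return True if `version` is at least `minimum` (numeric parts only)."""
    return _numeric_prefix(version) >= _numeric_prefix(minimum)


def installed_pyspark_version() -> Optional[str]:
    """Return the installed PySpark version string, or None if it is absent."""
    try:
        return metadata.version("pyspark")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_pyspark_version()
    if found is None:
        print("PySpark is not installed")
    elif meets_minimum(found):
        print(f"PySpark {found} satisfies the >= {MINIMUM_PYSPARK} requirement")
    else:
        print(f"PySpark {found} is older than {MINIMUM_PYSPARK}")
```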

How to use

Binary Classifier

1: Import the classifier

from SparkAutoML.ml_module.BinaryClassifier_module.classifier_file import BClassifier

2: Set up the experiment

bcf = BClassifier(
    training_data=train, # spark data frame
    hold_out_data=test,  # spark data frame
    target_feature="target",  # target feature
    numeric_features=["num_1","num_3"], # list of all numeric features
    categorical_features=["cat_4", "cat_2"], # list of all categorical features
)
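The setup step expects two Spark data frames and two lists of column names. One possible way to prepare them is sketched below, assuming a single Spark data frame as the starting point. It relies on PySpark's `DataFrame.randomSplit` and `DataFrame.dtypes`; the helper names `split_train_holdout` and `split_feature_columns` are hypothetical, and treating string and boolean columns as categorical is an assumption, not something SparkAutoML prescribes.

```python
from typing import Iterable, List, Tuple

# Spark SQL type names, as reported by DataFrame.dtypes, treated as numeric here.
NUMERIC_TYPES = {"tinyint", "smallint", "int", "bigint", "float", "double"}
# Assumption: these types are handled as categorical features.
CATEGORICAL_TYPES = {"string", "boolean"}


def split_train_holdout(df, holdout_fraction: float = 0.2, seed: int = 42):
    """Split a Spark data frame into (train, hold-out) using randomSplit."""
    train, test = df.randomSplit(
        [1.0 - holdout_fraction, holdout_fraction], seed=seed
    )
    return train, test


def split_feature_columns(
    dtypes: Iterable[Tuple[str, str]], target: str
) -> Tuple[List[str], List[str]]:
    """Group column names into numeric and categorical lists.

    `dtypes` is a list of (column name, type name) pairs, the shape that
    DataFrame.dtypes returns. The target column and any column of another
    type (timestamps, arrays, ...) are left out of both lists.
    """
    numeric: List[str] = []
    categorical: List[str] = []
    for name, dtype in dtypes:
        if name == target:
            continue
        if dtype in NUMERIC_TYPES or dtype.startswith("decimal"):
            numeric.append(name)
        elif dtype in CATEGORICAL_TYPES:
            categorical.append(name)
    return numeric, categorical


# Usage with a real Spark data frame `df` (not created here):
#   train, test = split_train_holdout(df)
#   numeric, categorical = split_feature_columns(df.dtypes, target="target")
```

The resulting lists can then be passed as `numeric_features` and `categorical_features`, after checking that the automatic grouping matches how each column should actually be treated.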

3: Create a model, e.g. random forest classifier

bcf.create_model("rfc")

Compare multiple models

bcf.compare_models(sort='auc')
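The `sort='auc'` argument presumably ranks the candidate models by area under the ROC curve (an assumption; the README does not spell the metric out). As an illustration of the metric itself, not of how SparkAutoML or PySpark computes it, here is a small pure-Python sketch. It uses the pairwise formulation: AUC is the fraction of (positive, negative) pairs in which the positive example gets the higher score, with ties counted as half. The model names and scores are made up for the example.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison of scores."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # a tie counts as half a win
    return wins / (len(positives) * len(negatives))


# Toy hold-out labels and the scores from two hypothetical models.
labels = [0, 0, 1, 1]
candidate_scores = {
    "model_a": [0.1, 0.4, 0.35, 0.8],
    "model_b": [0.2, 0.3, 0.6, 0.9],
}

# Rank the candidates from best to worst AUC.
ranking = sorted(
    ((name, roc_auc(labels, s)) for name, s in candidate_scores.items()),
    key=lambda item: item[1],
    reverse=True,
)
for name, auc in ranking:
    print(name, auc)
```

This quadratic-time version is only meant for reading; on real data a distributed evaluator would be used instead.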

4: Make predictions

bcf.predict_model(unseen_data)  # spark data frame

5: Access the entire pipeline

bcf.fitted_pipeline

Release history

- 0.0.15 (this version): May 7, 2022
- 0.0.11: Apr 17, 2022
- 0.0.10: Apr 17, 2022
- 0.0.9: Apr 17, 2022
- 0.0.8: Apr 17, 2022
- 0.0.7: Apr 16, 2022
- 0.0.6: Apr 16, 2022
- 0.0.3: Apr 12, 2022
- 0.0.2: Apr 12, 2022

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

SparkAutoML-0.0.15.tar.gz (11.0 kB)

Uploaded May 7, 2022 Source

Built Distribution

SparkAutoML-0.0.15-py3-none-any.whl (17.2 kB)

Uploaded May 7, 2022 Python 3

Hashes for SparkAutoML-0.0.15.tar.gz

Algorithm	Hash digest
SHA256	`8a356e0a35ee18360e5228167d00e3575c620df213a376e1803e43a52b06b9fd`
MD5	`58875d16dca2a370b7735e0571d197a8`
BLAKE2b-256	`ce174999f05ea6480817059d6f8fb7a640be64c43b1cf0958e55ffabdf14af3f`

Hashes for SparkAutoML-0.0.15-py3-none-any.whl

Algorithm	Hash digest
SHA256	`8056b131036101107f5cd6c2184fb3d27e924a35fc792cbe5bee2656c82f1e3d`
MD5	`2e8a2937bda7311792fdf13cfd621681`
BLAKE2b-256	`6b457249819c939d4bba013012c1cec917e8e1b5b340edaf32c2b0c89d83ffc4`