
An easy way to use Spark's machine learning library


SparkAutoML


This is a TRIAL version

The main idea of the package is to make it easy to build and deploy models with PySpark's machine learning library.

If you want to contribute, please reach out to fahadakbar@gmail.com

Thank you!

https://user-images.githubusercontent.com/19522276/163658399-17786b20-0208-44ff-b111-98b74ab34d25.mp4


How to install

PySpark version 3.2.1 or higher is required.

pip install SparkAutoML
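
To confirm the PySpark requirement before installing, a small check along these lines may help. This is only a sketch: `parse_version`, `meets_minimum`, and `check_pyspark` are hypothetical helper names, not part of SparkAutoML, and the parsing handles only simple dotted versions.

```python
from importlib.metadata import PackageNotFoundError, version

# Minimum PySpark version stated in this README.
MINIMUM_PYSPARK = (3, 2, 1)


def parse_version(text):
    """Turn '3.2.1' or '3.4.0.dev0' into a tuple of its leading integers."""
    parts = []
    for piece in text.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed, minimum=MINIMUM_PYSPARK):
    """Return True if the installed version string is at least the minimum."""
    return parse_version(installed) >= minimum


def check_pyspark():
    """Describe whether the locally installed PySpark satisfies the requirement."""
    try:
        installed = version("pyspark")
    except PackageNotFoundError:
        return "pyspark is not installed"
    status = "OK" if meets_minimum(installed) else "too old"
    return f"pyspark {installed}: {status}"


if __name__ == "__main__":
    print(check_pyspark())
```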

How to use

|Binary Classifier|

1: Import the classifier

from SparkAutoML.ml_module.BinaryClassifier_module.classifier_file import BClassifier

2: Set up the experiment

bcf = BClassifier(
    training_data=train, # spark data frame
    hold_out_data=test,  # spark data frame
    target_feature="target",  # target feature
    numeric_features=["num_1","num_3"], # list of all numeric features
    categorical_features=["cat_4", "cat_2"], # list of all categorical features
)
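
Before running the setup above, it can be useful to verify that the names passed to `target_feature`, `numeric_features`, and `categorical_features` actually exist in the data frame. The helper below is a hypothetical sketch and not part of SparkAutoML; it only assumes a list of column names, which a Spark data frame exposes as `train.columns`.

```python
def check_feature_columns(columns, target_feature, numeric_features, categorical_features):
    """Return a list of problems found; an empty list means the setup looks consistent."""
    problems = []
    available = set(columns)

    # Every requested column should be present in the data frame.
    requested = [target_feature, *numeric_features, *categorical_features]
    missing = [name for name in requested if name not in available]
    if missing:
        problems.append(f"missing columns: {missing}")

    # A feature should be declared as numeric or categorical, not both.
    overlap = sorted(set(numeric_features) & set(categorical_features))
    if overlap:
        problems.append(f"listed as both numeric and categorical: {overlap}")

    # The target should not also be used as an input feature.
    if target_feature in numeric_features or target_feature in categorical_features:
        problems.append("target_feature is also listed as an input feature")

    return problems


if __name__ == "__main__":
    # With a real Spark data frame, pass train.columns instead of this list.
    example_columns = ["target", "num_1", "num_3", "cat_4", "cat_2"]
    print(check_feature_columns(
        example_columns,
        target_feature="target",
        numeric_features=["num_1", "num_3"],
        categorical_features=["cat_4", "cat_2"],
    ))
```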

3: Create a model, e.g. random forest classifier

bcf.create_model("rfc") 

OR

Compare multiple models

bcf.compare_models(sort='auc')
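
For readers unfamiliar with the `auc` sort key, the sketch below shows what ranking by area under the ROC curve means in plain Python. It is an illustration only and not how SparkAutoML computes or sorts its results; `roc_auc`, `rank_by_auc`, the model names, and the toy scores are all made up for the example.

```python
def roc_auc(labels, scores):
    """AUC as the share of (positive, negative) pairs where the positive scores higher.

    Ties between a positive and a negative count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def rank_by_auc(labels, scores_by_model):
    """Return (model name, AUC) pairs, best AUC first."""
    results = {name: roc_auc(labels, scores) for name, scores in scores_by_model.items()}
    return sorted(results.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Toy hold-out labels and predicted scores, invented for illustration.
    toy_labels = [0, 0, 1, 1]
    toy_scores = {
        "model_a": [0.1, 0.4, 0.35, 0.8],
        "model_b": [0.2, 0.3, 0.6, 0.9],
    }
    for name, auc in rank_by_auc(toy_labels, toy_scores):
        print(name, auc)
```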

4: Make predictions

bcf.predict_model(unseen_data)  # spark data frame

5: Access the entire pipeline

bcf.fitted_pipeline


