For an easy implementation of Spark's machine learning library
Project description
SparkAutoML
This is a TRIAL version
The main idea of the package is to make it easy to build and deploy models with PySpark's machine learning library.
If you want to contribute, please reach out to fahadakbar@gmail.com
Thank you!
How to install
PySpark version 3.2.1 or higher is required
pip install SparkAutoML
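Before installing, it may help to confirm that the local PySpark meets the 3.2.1 minimum. The following is a minimal sketch using only the standard library; the helper names (parse_version, pyspark_is_supported, installed_pyspark_version) are hypothetical and are not part of SparkAutoML.

```python
from importlib import metadata

# Minimum PySpark version stated in this README.
MINIMUM_PYSPARK = (3, 2, 1)


def parse_version(text):
    """Turn '3.2.1' or '3.5.0.dev0' into a tuple of its leading integers."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if ch.isdigit():
                digits += ch
            else:
                break
        if not digits:
            # Stop at the first non-numeric component such as 'dev0'.
            break
        parts.append(int(digits))
    return tuple(parts)


def pyspark_is_supported(version_text):
    """Return True if the given version string is 3.2.1 or higher."""
    return parse_version(version_text) >= MINIMUM_PYSPARK


def installed_pyspark_version():
    """Return the installed PySpark version string, or None if it is missing."""
    try:
        return metadata.version("pyspark")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_pyspark_version()
    if found is None:
        print("PySpark is not installed")
    elif pyspark_is_supported(found):
        print("PySpark", found, "meets the minimum requirement")
    else:
        print("PySpark", found, "is older than 3.2.1")
```

Versions are compared as integer tuples, so "3.10.0" is correctly treated as newer than "3.2.1", which a plain string comparison would get wrong.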
How to use
|Binary Classifier|
1: Import the classifier
from SparkAutoML.ml_module.BinaryClassifier_module.classifier_file import BClassifier
2: Set up the experiment
bcf = BClassifier(
training_data=train, # spark data frame
hold_out_data=test, # spark data frame
target_feature="target", # target feature
numeric_features=["num_1","num_3"], # list of all numeric features
categorical_features=["cat_4", "cat_2"], # list of all categorical features
)
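The numeric_features and categorical_features lists above can be typed by hand, or derived from the column types. The sketch below is one possible way to do that, assuming the input is the list of (column name, type string) pairs that a Spark DataFrame exposes as train.dtypes. The helper split_features is hypothetical and not part of SparkAutoML, and treating every string column as categorical is an assumption that may not suit every data set.

```python
# Spark type strings treated as numeric in this sketch.
NUMERIC_TYPES = {"tinyint", "smallint", "int", "bigint", "float", "double"}


def split_features(dtypes, target):
    """Split (name, type) pairs into numeric and categorical column names.

    dtypes: list of (column_name, type_string) pairs, e.g. train.dtypes
    target: name of the target column, which is excluded from both lists
    """
    numeric, categorical = [], []
    for name, dtype in dtypes:
        if name == target:
            continue
        if dtype in NUMERIC_TYPES or dtype.startswith("decimal"):
            numeric.append(name)
        elif dtype == "string":
            categorical.append(name)
        # Other types (boolean, timestamp, arrays, ...) are left out on purpose.
    return numeric, categorical


# Example with a hand-written schema that mirrors the columns used above.
example_dtypes = [
    ("num_1", "double"),
    ("num_3", "int"),
    ("cat_4", "string"),
    ("cat_2", "string"),
    ("target", "int"),
]
numeric_features, categorical_features = split_features(example_dtypes, "target")
```

With a real DataFrame, the call would be split_features(train.dtypes, "target"), and the two resulting lists can be passed to BClassifier as shown above.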
3: Create a model, e.g. random forest classifier
bcf.create_model("rfc")
OR
Compare multiple models
bcf.compare_models(sort='auc')
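The README does not spell out what 'auc' stands for. Assuming it is the usual area under the ROC curve, the sketch below shows what that metric measures: the fraction of (positive, negative) pairs in which the positive example receives the higher score, with ties counted as half. The function roc_auc is an illustrative, pure-Python helper and is not how SparkAutoML or PySpark computes the metric.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    labels: iterable of 0/1 class labels
    scores: iterable of predicted scores for the positive class
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(positives) * len(negatives))


# Three of the four positive/negative pairs are ordered correctly.
example_auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

A value of 1.0 means every positive is ranked above every negative, and 0.5 is what random scores would give on average, so sorting by this metric puts the best-ranking model first.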
4: Make predictions
bcf.predict_model(unseen_data)  # spark data frame
5: Access the entire pipeline
bcf.fitted_pipeline