An advanced machine learning library for model training and selection

SmartPredict

SmartPredict is an advanced machine learning library designed to simplify model training, evaluation, and selection. It provides a comprehensive set of tools for classification and regression tasks, including automated hyperparameter tuning, feature engineering, ensemble methods, and model explainability.

Installation

You can install SmartPredict using pip:

pip install smartpredict
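
To confirm the install succeeded, one option is to query the installed distribution's version through the standard library. This is a minimal sketch; the helper name `installed_version` is illustrative and not part of SmartPredict.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Reports the version string if pip installed the package into this environment.
version = installed_version("smartpredict")
print(f"smartpredict version: {version or 'not installed'}")
```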

Features

  • Unified API for ML Models: Provides a consistent interface for both classification and regression tasks
  • Automated Feature Engineering: Handles missing values, scaling, encoding, feature interactions, and selection
  • Robust Ensemble Methods: Supports voting, averaging, weighted combining, and stacking approaches
  • Hyperparameter Tuning: Uses Optuna for efficient, reproducible hyperparameter optimization
  • Model Explainability: Provides SHAP-based explanations and feature importance analysis
  • Comprehensive Error Handling: Gracefully handles common errors during model training and evaluation

Quick Start

Here's a quick example to get you started:

from smartpredict import SmartClassifier
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

# Load and split data
data = load_breast_cancer()
X = data.data
y = data.target

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Create classifier and fit models
clf = SmartClassifier(
    models=['RandomForestClassifier', 'LogisticRegression'],
    verbose=1
)
results = clf.fit(X_train, X_test, y_train, y_test)

# Display model performance results
print(results)

# Make predictions with all trained models
predictions = clf.predict(X_test)
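
For intuition about what fitting several models and comparing them on a held-out split involves, the sketch below performs the same comparison by hand in plain scikit-learn. It does not use SmartPredict and makes no claim about its internals; the names `candidates`, `scores`, and `best_name` are illustrative, and accuracy is an assumed metric.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Same data and split as the Quick Start example
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# Candidate models; logistic regression is scaled so the solver converges
candidates = {
    "RandomForestClassifier": RandomForestClassifier(random_state=42),
    "LogisticRegression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
}

# Fit each candidate on the training split and score it on the test split
scores = {}
for name, model in candidates.items():
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))

# Pick the model with the highest held-out accuracy
best_name = max(scores, key=scores.get)
print(scores)
print("best:", best_name)
```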

Usage

Classification

from smartpredict import SmartClassifier

# Create classifier with custom models and parameters
clf = SmartClassifier(
    models=['RandomForestClassifier', 'LogisticRegression', 'SVC'],
    # Pass custom parameters for each model
    RandomForestClassifier={'n_estimators': 200, 'max_depth': 10},
    LogisticRegression={'C': 0.1, 'max_iter': 200},
    verbose=1
)

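# Fit and evaluate all models (X_train, X_test, y_train, y_test as in Quick Start)
results = clf.fit(X_train, X_test, y_train, y_test)

# The best model is automatically selected for predictions.
# new_data is a placeholder for unseen samples with the same features as X_train.
predictions = clf.predict(new_data)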

Regression

from smartpredict import SmartRegressor

# Create regressor with custom models
reg = SmartRegressor(
    models=['RandomForestRegressor', 'LinearRegression', 'SVR'],
    # Pass custom parameters for a specific model
    RandomForestRegressor={'n_estimators': 200, 'max_depth': 15},
    verbose=1
)

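# Fit and evaluate all models (X_train, X_test, y_train, y_test must come
# from a regression dataset, split the same way as in Quick Start)
results = reg.fit(X_train, X_test, y_train, y_test)

# The best model is automatically selected for predictions.
# new_data is a placeholder for unseen samples with the same features as X_train.
predictions = reg.predict(new_data)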

Advanced Features

Feature Engineering

from smartpredict.feature_engineering import FeatureEngineer

# Create feature engineer
fe = FeatureEngineer(
    scaler='standard',
    encoder='onehot',
    handle_missing='mean',
    create_interactions=True,
    feature_selection=5  # Keep top 5 features
)

# Fit and transform data
X_transformed = fe.fit_transform(X_train)
X_test_transformed = fe.transform(X_test)
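
As a rough point of comparison, the options above (mean imputation, standard scaling, one-hot encoding, interaction terms, top-5 selection) can be approximated with a plain scikit-learn pipeline. This is a sketch under assumptions, not FeatureEngineer's implementation: the toy DataFrame, its column names, and the choice of `f_classif` for selection are all illustrative.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, PolynomialFeatures, StandardScaler

# Toy data (hypothetical columns): two numeric features, one categorical
rng = np.random.default_rng(0)
n = 60
df = pd.DataFrame({
    "age": rng.normal(40, 10, n),
    "income": rng.normal(50_000, 8_000, n),
    "city": rng.choice(["A", "B", "C"], n),
})
df.loc[::7, "age"] = np.nan  # inject missing values
y = (df["income"] > 50_000).astype(int).to_numpy()

# Numeric columns: mean-impute, standardize, then add pairwise interactions
numeric = Pipeline([
    ("impute", SimpleImputer(strategy="mean")),
    ("scale", StandardScaler()),
    ("interact", PolynomialFeatures(degree=2, interaction_only=True, include_bias=False)),
])

# sparse_threshold=0.0 forces a dense array regardless of encoder output
preprocess = ColumnTransformer(
    [
        ("num", numeric, ["age", "income"]),
        ("cat", OneHotEncoder(handle_unknown="ignore"), ["city"]),
    ],
    sparse_threshold=0.0,
)

# Keep the 5 columns with the highest ANOVA F-score against the target
pipeline = Pipeline([
    ("prep", preprocess),
    ("select", SelectKBest(f_classif, k=5)),
])

train_df, test_df = df.iloc[:45], df.iloc[45:]
X_train_t = pipeline.fit_transform(train_df, y[:45])  # fit on training rows only
X_test_t = pipeline.transform(test_df)                # reuse the fitted statistics
print(X_train_t.shape, X_test_t.shape)
```

Fitting on the training rows and only transforming the test rows mirrors the fit_transform/transform split shown above and keeps test-set statistics out of the imputer and scaler.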

Ensemble Methods

from smartpredict.ensemble_methods import EnsembleModel
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression

# Create base models
models = [
    ('rf', RandomForestClassifier(n_estimators=100)),
    ('lr', LogisticRegression())
]

# Create ensemble with voting method
ensemble = EnsembleModel(
    models=models,
    method='voting'  # 'voting', 'averaging', 'weighted', or 'stacking'
)

# Fit ensemble
ensemble.fit(X_train, y_train)

# Make predictions
predictions = ensemble.predict(X_test)
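
To illustrate how the combining strategies differ, the sketch below builds comparable ensembles directly with scikit-learn's `VotingClassifier` and `StackingClassifier`. The mapping to EnsembleModel's `method` values is an assumption made for illustration; SmartPredict may implement these strategies differently.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import (
    RandomForestClassifier,
    StackingClassifier,
    VotingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# Base models; each ensemble clones them, so the list can be reused
base = [
    ("rf", RandomForestClassifier(n_estimators=100, random_state=42)),
    ("lr", make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))),
]

ensembles = {
    # Majority vote over predicted class labels
    "hard voting": VotingClassifier(estimators=base, voting="hard"),
    # Weighted average of predicted probabilities (rf counts twice as much)
    "weighted soft voting": VotingClassifier(
        estimators=base, voting="soft", weights=[2, 1]
    ),
    # A meta-model learns from cross-validated base-model predictions
    "stacking": StackingClassifier(
        estimators=base, final_estimator=LogisticRegression(max_iter=1000), cv=5
    ),
}

accuracies = {}
for name, model in ensembles.items():
    model.fit(X_train, y_train)
    accuracies[name] = accuracy_score(y_test, model.predict(X_test))

print(accuracies)
```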

Hyperparameter Tuning

from smartpredict.hyperparameter_tuning import tune_hyperparameters
from sklearn.ensemble import RandomForestClassifier

# Create base model
model = RandomForestClassifier()

# Define parameter distributions to search
param_dist = {
    'n_estimators': (50, 300),
    'max_depth': (3, 15),
    'min_samples_split': (2, 10)
}

# Tune hyperparameters
best_model = tune_hyperparameters(
    model=model,
    param_distributions=param_dist,
    X=X_train,
    y=y_train,
    n_trials=100,
    scoring='f1',
    random_state=42
)

# Use the optimized model
predictions = best_model.predict(X_test)

Explainability

from smartpredict.explainability import ModelExplainer

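# Create explainer
# trained_model and feature_names are placeholders: an already-fitted
# estimator and the list of its input column names
explainer = ModelExplainer(
    model=trained_model,
    feature_names=feature_names
)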

# Set training data (needed for some explanation methods)
explainer.set_training_data(X_train, y_train)

# Get feature importance
importance_df = explainer.get_feature_importance()
print(importance_df)

# Explain a prediction
explanation = explainer.explain_prediction(X_test[0])
print(explanation)
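
As a model-agnostic cross-check on a feature importance table, scikit-learn's permutation importance can be computed independently. Note that this is a different technique from the SHAP-based explanations ModelExplainer provides; the sketch below uses only scikit-learn and pandas, and the variable names are illustrative.

```python
import pandas as pd
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

data = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.3, random_state=42
)

model = RandomForestClassifier(random_state=42).fit(X_train, y_train)

# Shuffle one feature at a time and measure the drop in test accuracy
result = permutation_importance(
    model, X_test, y_test, n_repeats=5, random_state=42
)

# One row per feature, most important first
importance_df = pd.DataFrame({
    "feature": data.feature_names,
    "importance": result.importances_mean,
}).sort_values("importance", ascending=False)

print(importance_df.head())
```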

Contributing

We welcome contributions! Please feel free to submit a Pull Request.

License

SmartPredict is licensed under the MIT License. See the LICENSE file for details.
