SageML is a library for automatic one-shot algorithm selection on tabular data, using TDA to extract information about the dataset.

Project description

SageML

SageML is an out-of-the-box AutoML solution designed to simplify the machine learning workflow. With minimal user input, SageML automates model selection and hyperparameter optimization, then returns a trained machine learning model ready for deployment.

Features

  • Automatic Model Selection: Chooses the best algorithm based on data characteristics, using a pre-trained neural network.
  • Hyperparameter Optimization: Utilizes Optuna for efficient hyperparameter tuning.
  • Data Preprocessing: Handles missing values, categorical encoding, and feature scaling automatically.
  • Interactive Interface: User-friendly terminal interface with tutorials and step-by-step guidance.
  • Extensibility: Modular architecture allows for easy customization and extension.
  • Compatibility: Supports a wide range of algorithms from scikit-learn, CatBoost, XGBoost, and more.

Installation

SageML is available on PyPI. You can install it using pip:

pip install sageml

Note: For the latest features and updates, you might want to install from the GitHub repository.

Quick Start

Here's how you can get started with SageML in just a few lines of code:

import pandas as pd
from turbo_ml import SageML  # import path as published in this README

# Initialize SageML with a labeled dataset; `target` is the name of the label column
sageML = SageML(pd.read_csv('classified/data.csv'), target='target')

# Make predictions on unlabeled data
predictions = sageML.predict(pd.read_csv('not/classified/data.csv'))

Usage

Data Preprocessing

SageML automatically preprocesses your data to make it suitable for machine learning algorithms.

  • Handles missing values with appropriate imputation methods.
  • Encodes categorical variables using techniques like One-Hot Encoding.
  • Scales numerical features for algorithms sensitive to feature scales.
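
The three steps above can be illustrated with plain scikit-learn. This is a minimal sketch of the general technique, not SageML's internal pipeline: the helper `build_preprocessor`, the median and most-frequent imputation strategies, and the toy columns are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler


def build_preprocessor(df: pd.DataFrame) -> ColumnTransformer:
    """Build a transformer that imputes, encodes, and scales by column type."""
    numeric_cols = df.select_dtypes(include="number").columns.tolist()
    categorical_cols = df.select_dtypes(exclude="number").columns.tolist()

    # Numeric columns: fill gaps with the median, then standardize
    numeric_steps = Pipeline([
        ("impute", SimpleImputer(strategy="median")),
        ("scale", StandardScaler()),
    ])
    # Categorical columns: fill gaps with the mode, then one-hot encode
    categorical_steps = Pipeline([
        ("impute", SimpleImputer(strategy="most_frequent")),
        ("encode", OneHotEncoder(handle_unknown="ignore")),
    ])
    return ColumnTransformer([
        ("num", numeric_steps, numeric_cols),
        ("cat", categorical_steps, categorical_cols),
    ])


# Hypothetical toy data with one missing value per column
toy = pd.DataFrame({
    "age": [25.0, np.nan, 40.0, 31.0],
    "city": ["Oslo", "Rome", np.nan, "Oslo"],
})
features = build_preprocessor(toy).fit_transform(toy)
```

Here `features` has one scaled numeric column plus one indicator column per city. With SageML itself none of this needs to be written by hand, since preprocessing is applied automatically.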

Model Selection

  • Analyzes data characteristics (e.g., number of features, class balance).
  • Selects suitable algorithms from a pool that includes scikit-learn classifiers/regressors, CatBoost, XGBoost, etc.
  • Supports both classification and regression tasks.
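
To make "data characteristics" concrete, the sketch below computes a few simple dataset descriptors of the kind named above (number of features, class balance). It is only an illustration: the function `describe_dataset` and the chosen descriptors are hypothetical, and the features SageML actually feeds to its pre-trained network are not documented here.

```python
import pandas as pd


def describe_dataset(df: pd.DataFrame, target: str) -> dict:
    """Summarize a labeled table with a few simple descriptors."""
    features = df.drop(columns=[target])
    class_shares = df[target].value_counts(normalize=True)
    return {
        "n_rows": len(df),
        "n_features": features.shape[1],
        "n_numeric": features.select_dtypes(include="number").shape[1],
        # Average fraction of missing cells across feature columns
        "missing_ratio": float(features.isna().mean().mean()),
        "n_classes": int(class_shares.size),
        # Share of the rarest class, a simple class-balance indicator
        "minority_share": float(class_shares.min()),
    }


# Hypothetical toy classification table
toy = pd.DataFrame({
    "x1": [0.1, 0.4, 0.3, 0.9],
    "x2": ["a", "b", "a", "b"],
    "target": [0, 0, 0, 1],
})
summary = describe_dataset(toy, target="target")
```

A selector could map such a summary to a ranking of candidate algorithms. Descriptors like `minority_share` only make sense for classification targets.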

Hyperparameter Optimization

  • Utilizes Optuna for efficient hyperparameter optimization.
  • Employs advanced features like pruning to reduce computation time.

Model Evaluation

  • Allows selection of evaluation metrics from scikit-learn or custom weighted sums.
  • Supports cross-validation and hold-out validation strategies.
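
As an illustration of a custom weighted sum of metrics, the sketch below combines accuracy and F1 into one score and evaluates it with both cross-validation and a hold-out split, using scikit-learn only. The function `weighted_metric`, its weights, and the model are hypothetical; how SageML exposes metric selection is not shown in this README.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score, make_scorer
from sklearn.model_selection import cross_val_score, train_test_split


def weighted_metric(y_true, y_pred, w_accuracy=0.5, w_f1=0.5):
    """Weighted sum of accuracy and F1 (weights are illustrative)."""
    return (
        w_accuracy * accuracy_score(y_true, y_pred)
        + w_f1 * f1_score(y_true, y_pred)
    )


# Synthetic binary classification data
X, y = make_classification(n_samples=300, n_features=8, random_state=0)
model = LogisticRegression(max_iter=1000)

# Cross-validation: one score per fold
cv_scores = cross_val_score(
    model, X, y, cv=5, scoring=make_scorer(weighted_metric)
)

# Hold-out validation: a single train/test split
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)
holdout_score = weighted_metric(
    y_test, model.fit(X_train, y_train).predict(X_test)
)
```

Cross-validation gives a more stable estimate at higher cost, while a hold-out split is cheaper and often sufficient on larger datasets.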

Documentation

Detailed documentation should be available soon.

Contributing

We welcome contributions from the community!

  • Bug Reports & Feature Requests: Use GitHub Issues to report bugs or suggest features.

License

SageML is licensed under the GNU General Public License v3.0.


Disclaimer: This project is under active development. Features and interfaces are subject to change.
