A lightweight Python package for Machine Learning utilities

Project description

Package: batabyal


batabyal is a lightweight Python package for Machine Learning utilities that provides:

  • cleaning_module - A CSV data cleaning module
  • trainer_kit - ML module for classification problems

Installation


Run the following command in a terminal:

pip install batabyal
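To confirm that the install succeeded, the installed version can be queried from Python. This is a minimal sketch using only the standard library; the helper name `installed_version` is hypothetical and not part of batabyal.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


# Prints the version string if batabyal is installed, otherwise None.
print(installed_version("batabyal"))
```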

Importing


Import either a whole module or only the specific names you need:

from batabyal import cleaning_module as cm
from batabyal.trainer_kit import TransformedTargetClassifier, autofit_classification_model

Usage


1. cleaning_module: Provides a single function, clean_csv, for cleaning .csv datasets.

# numericData and charData are assumed to be defined beforehand (they are not shown here).
cm.clean_csv('filename.csv', numericData, charData, True, True)
# Signature: clean_csv(file, numericData, charData, fill, case_sensitivity=False, dummies=None) -> pd.DataFrame
# If `fill` is True, NaN values in each numeric column are filled with that column's mean.
# If `case_sensitivity` is True, all labelled values are lowercased.
# `dummies` is a list of values to replace with NaN before cleaning.
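The cleaning steps described in the comments above can be sketched with plain pandas. This is an illustration, not batabyal's implementation: the function name `clean_frame_sketch`, its parameter names, and the sample data are all assumptions, and it works on an in-memory DataFrame rather than reading a .csv file.

```python
import pandas as pd


def clean_frame_sketch(df, numeric_cols, text_cols, fill=True, lowercase=False, dummies=None):
    """Illustrative stand-in for the described steps; not batabyal's code."""
    out = df.copy()
    # Treat placeholder values such as "?" as missing.
    if dummies:
        out = out.mask(out.isin(list(dummies)))
    # Coerce numeric columns; unparseable entries become NaN.
    for col in numeric_cols:
        out[col] = pd.to_numeric(out[col], errors="coerce")
        if fill:
            # Fill missing values with the column mean.
            out[col] = out[col].fillna(out[col].mean())
    # Strip whitespace from text columns and optionally lowercase them.
    for col in text_cols:
        stripped = out[col].str.strip()
        out[col] = stripped.str.lower() if lowercase else stripped
    return out


# Hypothetical sample data with "?" used as a placeholder for missing values.
raw = pd.DataFrame({"age": ["20", "?", "40"], "city": [" Pune", "DELHI ", "?"]})
cleaned = clean_frame_sketch(raw, ["age"], ["city"], fill=True, lowercase=True, dummies=["?"])
print(cleaned)
```

The placeholder replacement runs first so that the mean is computed only from real values.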

2. trainer_kit: Provides a wrapper class, TransformedTargetClassifier, which encodes the target labels for training and inversely transforms predictions back to the original labels, and a function, autofit_classification_model, which fits classification models and selects the best algorithm and hyperparameters based on the roc_auc_ovr_weighted score.

model = TransformedTargetClassifier(classifier=svc, transformer=labelEncoder)
# Here svc and labelEncoder are assumed to come from sklearn.
# You can now call model.fit() and model.predict() with raw labelled data; encoding is handled internally for training and prediction.
# model.predict() returns the original labels by inversely transforming the encoded numbers.
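The encode-then-decode idea behind such a wrapper can be sketched with scikit-learn alone. The class `LabelRoundTripClassifier` below is a hypothetical stand-in written for illustration; it is not batabyal's TransformedTargetClassifier, and the toy data is made up.

```python
from sklearn.preprocessing import LabelEncoder
from sklearn.svm import SVC


class LabelRoundTripClassifier:
    """Hypothetical stand-in illustrating the idea; not batabyal's class."""

    def __init__(self, classifier, transformer):
        self.classifier = classifier
        self.transformer = transformer

    def fit(self, X, y):
        # Encode raw labels (for example strings) to integers before training.
        encoded = self.transformer.fit_transform(y)
        self.classifier.fit(X, encoded)
        return self

    def predict(self, X):
        # Decode the integer predictions back to the original labels.
        encoded = self.classifier.predict(X)
        return self.transformer.inverse_transform(encoded)


# Two well-separated toy clusters with string labels.
X = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2], [5.0, 5.1], [5.2, 4.9], [4.9, 5.0]]
y = ["cat", "cat", "cat", "dog", "dog", "dog"]

model = LabelRoundTripClassifier(classifier=SVC(), transformer=LabelEncoder())
model.fit(X, y)
predictions = model.predict([[0.1, 0.1], [5.0, 5.0]])
print(list(predictions))
```

Keeping the encoder inside the wrapper means callers never handle the integer codes themselves.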

result = autofit_classification_model(x, y, "numeric", 3)
# Signature:
# autofit_classification_model(
#     x: pd.DataFrame,
#     y: pd.DataFrame,
#     x_type: Literal["numeric", "categorical", "mixed"],
#     n_splits: int,
#     cat_features: list[str] = [],
#     whitelisted_algorithms: list[Literal["LogisticRegression", "DecisionTree", "RandomForest", "GaussianNB", "BernoulliNB", "CategoricalNB", "CatBoost", "XGBoost", "Ripper", "SVC", "KNN"]] | Literal["auto"] = "auto",
#     enable_votingClassifier: bool = True,
#     random_state: int | None = 42,
#     verbosity: bool = True,
# ) -> object

model = result.model  # the fitted model; use model.predict()
score = result.score  # the model's score (print it to inspect)
classifier = result.classifier  # name of the best algorithm that was used
convertible_model = result.convertible_model  # the model only (no preprocessing)
preprocessedX = result.preprocessedX  # the x features after preprocessing
n_features = result.n_features  # total number of preprocessed x features
initial_type = result.initial_type  # initial type needed to convert the model to .onnx
result.export_to_onnx()  # writes the model to 'model.onnx' in the current working directory (input name: 'input')
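Selecting an algorithm by cross-validated roc_auc_ovr_weighted score, as autofit_classification_model is described as doing, can be sketched with scikit-learn directly. This is only an illustration of the metric and the fold count: the candidate list, the iris dataset, and the variable names are assumptions, and the package's actual search (hyperparameters, voting classifier) is not reproduced here.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# A small, hypothetical candidate pool; each estimator supports predict_proba.
candidates = {
    "LogisticRegression": LogisticRegression(max_iter=1000),
    "DecisionTree": DecisionTreeClassifier(random_state=42),
    "KNN": KNeighborsClassifier(),
}

# Three stratified folds, mirroring n_splits=3 in the call above.
cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=42)

scores = {}
for name, estimator in candidates.items():
    fold_scores = cross_val_score(estimator, X, y, cv=cv, scoring="roc_auc_ovr_weighted")
    scores[name] = fold_scores.mean()

# Keep the candidate with the highest mean score across folds.
best_name = max(scores, key=scores.get)
best_score = scores[best_name]
print(best_name, round(best_score, 3))
```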

Download files

Source Distribution

batabyal-1.2.1.tar.gz (9.2 kB)

Uploaded Source

Built Distribution

batabyal-1.2.1-py3-none-any.whl (8.7 kB)

Uploaded Python 3

File details

Details for the file batabyal-1.2.1.tar.gz.

File metadata

  • Download URL: batabyal-1.2.1.tar.gz
  • Upload date:
  • Size: 9.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.10

File hashes

Hashes for batabyal-1.2.1.tar.gz
Algorithm     Hash digest
SHA256        dc1e4bba15e47217487438eb576fa0436fc3c6479ac7dd8f066aced1894f84cd
MD5           f163c2d09a02a4a4e6afe1e5857c8ee0
BLAKE2b-256   ab110e3c3c1aac513afa73e8f53c01906a4f2426909f775f2277645a6fa79c45
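A downloaded file can be checked against a published SHA256 digest with the standard library. This is a minimal sketch; the helper names are hypothetical, and the file path in the usage note assumes the archive was saved to the current working directory.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected_hex):
    """Compare a file's SHA256 digest to a published value, ignoring case."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, matches_published_digest("batabyal-1.2.1.tar.gz", "dc1e4bba15e47217487438eb576fa0436fc3c6479ac7dd8f066aced1894f84cd") should return True for an unmodified copy of the source distribution.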

File details

Details for the file batabyal-1.2.1-py3-none-any.whl.

File metadata

  • Download URL: batabyal-1.2.1-py3-none-any.whl
  • Upload date:
  • Size: 8.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.10

File hashes

Hashes for batabyal-1.2.1-py3-none-any.whl
Algorithm     Hash digest
SHA256        1eb2087f0504664bad8f743c7b8e34c87855f2b444b4ace18811e77d5d5e2ea8
MD5           5c3988ab13c2f574d10c23a4694fa318
BLAKE2b-256   93f22cf0fc7d38621e2a5d206d91c47f02b0692d12c2e2bef017b8ea13440ce4
