A modular Python library for automated data preprocessing, feature selection, model evaluation, and report generation.
OCTOPY
A Modular Python Library for Machine Learning Automation
Author: Sahil Kewat
Version: 1.0.0
License: MIT
Language: Python 3.12+
- OVERVIEW
Octopy is a modular machine learning support library that automates data preprocessing, feature selection, model evaluation, and report generation. It is designed to simplify the ML workflow by providing a collection of plug-and-play Python modules.
The library is built for developers, data scientists, and researchers who want a fast, reproducible, and well-structured way to prepare, train, and analyze ML models.
- MODULES INCLUDED
Octopy contains the following main Python modules:
- pipeline.py - Handles the creation and execution of ML pipelines.
- prep.py - Cleans and preprocesses raw datasets.
- selector.py - Performs feature selection and ranking.
- smart_eda.py - Generates visual and statistical exploratory data analysis.
- report.py - Loads a trained model, evaluates it, and creates JSON reports.
- MODULE DETAILS AND FUNCTIONS
A. pipeline.py
Purpose: Automates machine learning pipeline creation from preprocessing to model training and saving.
Key Functions:
• build_pipeline(model, preprocess_steps) - Combines preprocessing and model into a single pipeline.
• train_pipeline(pipeline, X_train, y_train) - Fits the pipeline to training data.
• save_pipeline(pipeline, filename) - Saves the pipeline to disk using pickle/joblib.
• load_pipeline(filename) - Loads an existing pipeline for inference or retraining.
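The chaining idea behind build_pipeline and train_pipeline can be illustrated with a small dependency-free sketch. This is not Octopy's implementation: the function bodies, the MeanCenterer and SlopeModel classes, and the predict helper are hypothetical and exist only to show how preprocessing steps and a model might be fitted in sequence. Only the function names and argument order are borrowed from the list above.

```python
class MeanCenterer:
    """Hypothetical preprocessing step: subtracts the training mean."""

    def fit(self, X, y=None):
        self.mean_ = sum(X) / len(X)
        return self

    def transform(self, X):
        return [x - self.mean_ for x in X]


class SlopeModel:
    """Hypothetical model: least-squares slope through the origin."""

    def fit(self, X, y):
        self.slope_ = sum(x * t for x, t in zip(X, y)) / sum(x * x for x in X)
        return self

    def predict(self, X):
        return [self.slope_ * x for x in X]


def build_pipeline(model, preprocess_steps):
    # Assumed representation: an ordered list of steps plus a final model.
    return {"steps": list(preprocess_steps), "model": model}


def train_pipeline(pipeline, X_train, y_train):
    # Fit each step on the output of the previous one, then fit the model.
    X = X_train
    for step in pipeline["steps"]:
        X = step.fit(X, y_train).transform(X)
    pipeline["model"].fit(X, y_train)
    return pipeline


def predict(pipeline, X):
    # Hypothetical helper: replay the fitted steps, then call the model.
    for step in pipeline["steps"]:
        X = step.transform(X)
    return pipeline["model"].predict(X)


pipe = build_pipeline(SlopeModel(), [MeanCenterer()])
train_pipeline(pipe, [1.0, 2.0, 3.0], [-2.0, 0.0, 2.0])
prediction = predict(pipe, [4.0])
```

Fitting each step only on training data and replaying it unchanged at prediction time is what keeps preprocessing and inference consistent; a saved pipeline would need to persist both the steps and the model.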
B. prep.py
Purpose: Cleans, encodes, and scales data for model training.
Key Functions:
• handle_missing_values(df) - Fills or removes missing values automatically.
• encode_categorical(df) - Converts categorical variables into numeric form.
• scale_features(df) - Applies standard or min-max scaling to numeric features.
• preprocess_data(df) - Combines all preprocessing operations into one function.
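The arithmetic behind mean imputation and the two scaling options mentioned above can be sketched on a single numeric column. The helper names below (fill_missing_with_mean, standard_scale, min_max_scale) are hypothetical and work on plain lists rather than DataFrames; how prep.py actually chooses a fill strategy or scaler is not documented here, so treat this as an illustration only.

```python
import statistics


def fill_missing_with_mean(values):
    # Replace None entries with the mean of the observed values.
    present = [v for v in values if v is not None]
    mean = statistics.fmean(present)
    return [mean if v is None else v for v in values]


def standard_scale(values):
    # z = (x - mean) / std, using the population standard deviation.
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return [0.0 for _ in values]  # constant column: nothing to scale
    return [(v - mean) / std for v in values]


def min_max_scale(values):
    # x' = (x - min) / (max - min), mapping the column onto [0, 1].
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant column: avoid division by zero
    return [(v - lo) / (hi - lo) for v in values]


column = fill_missing_with_mean([2.0, None, 6.0])
scaled = min_max_scale(column)
```

Standard scaling suits features that are roughly bell-shaped, while min-max scaling preserves the original shape but is sensitive to extreme values.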
C. selector.py
Purpose: Selects the most important features for model training.
Key Functions:
• select_k_best_features(X, y, k) - Selects top k features based on statistical tests.
• feature_importance(model, X, y) - Displays or returns feature importance scores.
• recursive_feature_elimination(model, X, y) - Uses RFE to iteratively eliminate less important features.
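As a rough illustration of score-based top-k selection, the sketch below ranks features by the absolute Pearson correlation with the target. This scoring rule is a stand-in chosen for simplicity; the statistical test that select_k_best_features actually applies is not specified above. The pearson and select_k_best helpers and the dict-of-columns input format are assumptions made for this example.

```python
import math


def pearson(xs, ys):
    # Pearson correlation coefficient; returns 0.0 for constant inputs.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def select_k_best(columns, y, k):
    # columns: hypothetical mapping of feature name -> list of values.
    scores = {name: abs(pearson(vals, y)) for name, vals in columns.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k], scores


features = {
    "a": [1.0, 2.0, 3.0, 4.0],
    "b": [1.0, 0.0, 1.0, 0.0],
    "c": [4.0, 3.0, 2.0, 1.5],
}
target = [2.0, 4.0, 6.0, 8.0]
selected, scores = select_k_best(features, target, k=2)
```

Taking the absolute value keeps strongly negative relationships (feature "c" here) in the ranking, since they are as informative as positive ones for a linear model.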
D. smart_eda.py
Purpose: Automates exploratory data analysis (EDA) and visualization.
Key Functions:
• describe_data(df) - Provides summary statistics of the dataset.
• plot_distributions(df) - Plots histograms and distribution graphs for numeric features.
• correlation_heatmap(df) - Displays correlation between numeric variables.
• detect_outliers(df) - Identifies outliers using z-score or IQR method.
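The IQR rule mentioned for detect_outliers can be sketched for one numeric column as follows. The function name iqr_outliers, the 1.5 multiplier default, and the list-based input are assumptions for this example; the thresholds and return format used by smart_eda.py may differ.

```python
import statistics


def iqr_outliers(values, factor=1.5):
    # Quartiles via the standard library (default "exclusive" method).
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - factor * iqr, q3 + factor * iqr
    # Return (index, value) pairs that fall outside the fences.
    return [(i, v) for i, v in enumerate(values) if v < lower or v > upper]


readings = [10, 12, 11, 13, 12, 11, 95]
flagged = iqr_outliers(readings)
```

For small samples the IQR rule tends to be more reliable than a z-score threshold, because a single extreme value inflates the standard deviation and can mask itself.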
E. report.py
Purpose: Evaluates trained ML models and generates automated reports.
Key Functions:
• load_model(model_path) - Loads a trained model from .pkl, .sav, or joblib files.
• load_test_data(x_path, y_path) - Loads X_test and y_test from CSV files.
• evaluate_model(model, X_test, y_test) - Computes MAE, MSE, RMSE, and R² metrics.
• extract_hyperparameters(model) - Extracts key hyperparameters from scikit-learn compatible models.
• generate_report(model_path, x_test_path=None, y_test_path=None) - Generates a structured JSON report containing model name, hyperparameters, and evaluation metrics.
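The four regression metrics and the JSON report structure described above can be sketched without any third-party dependency. The helper names (regression_metrics, build_report) and the JSON keys are hypothetical; the exact schema produced by generate_report is not documented here, so this only shows one plausible shape.

```python
import json
import math


def regression_metrics(y_true, y_pred):
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    # R² is undefined when the target is constant; report None in that case.
    r2 = 1 - ss_res / ss_tot if ss_tot else None
    return {"MAE": mae, "MSE": mse, "RMSE": math.sqrt(mse), "R2": r2}


def build_report(model_name, hyperparameters, metrics):
    # Assumed report layout: model name, hyperparameters, metrics.
    report = {
        "model_name": model_name,
        "hyperparameters": hyperparameters,
        "metrics": metrics,
    }
    return json.dumps(report, indent=2)


metrics = regression_metrics([3.0, 5.0, 7.0], [2.0, 5.0, 8.0])
report_json = build_report("LinearRegression", {"fit_intercept": True}, metrics)
```

Keeping the report as plain JSON makes it easy to diff between runs or to load into other tooling.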
- INSTALLATION
- Clone or download the repository:
  git clone https://github.com/Sahilkewat80085/OctoPy.git
  cd OctoPy
- Install the dependencies and the package in editable mode:
  pip install -e .
File details
Details for the file octopyx-1.0.0.tar.gz.
File metadata
- Download URL: octopyx-1.0.0.tar.gz
- Upload date:
- Size: 11.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.4
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | df26e0636e34bb5148b74c4291456f06ff0619c26758554b494cc33b2102797a |
| MD5 | 9a41c65f067f89265ea77d5d660bc31c |
| BLAKE2b-256 | 972c370f86bd08f27803315fb3cff4de185842ea7f880781274fcda509e11c6a |
File details
Details for the file octopyx-1.0.0-py3-none-any.whl.
File metadata
- Download URL: octopyx-1.0.0-py3-none-any.whl
- Upload date:
- Size: 10.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.4
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 958666ebbeef4f06abcf9f51be03d21b82caf405dd652baef883dd1950717e66 |
| MD5 | 5826e66aba691662cfca73b87816c9bb |
| BLAKE2b-256 | e2b493b6915afc15cfbc36eef3bfdf4a12c5fbc353de376130aa0256bd853d82 |