myml-regression

A custom machine learning regression package built from scratch without Scikit-learn.

Features

Module             Functions / Classes
Regression         OLSRegression, RidgeRegression, LassoRegression
Preprocessing      fill_missing_mean, fill_missing_mean_df, normalize_features, add_intercept, detect_outliers_zscore
Feature Selection  forward_selection, backward_elimination
Diagnostics        normality_test (Shapiro-Wilk), vif (Variance Inflation Factor), heteroscedasticity
Visualization      plot_actual_vs_predicted, residual_plot
Prediction         predict_values
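
To give a sense of what "from scratch" regression involves, here is a minimal NumPy-only sketch of closed-form OLS and ridge fitting. The function names fit_ols and fit_ridge are hypothetical and are not part of this package; the package's own OLSRegression and RidgeRegression classes may be implemented differently (for example, they may leave the intercept unpenalized).

```python
import numpy as np


def fit_ols(X, y):
    """Least-squares coefficients for y ~ X @ beta.

    X is assumed to already contain an intercept column of ones.
    lstsq is used instead of an explicit inverse so that
    rank-deficient inputs do not raise an error.
    """
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta


def fit_ridge(X, y, alpha):
    """Ridge coefficients from (X^T X + alpha * I) beta = X^T y.

    For simplicity this sketch penalizes every column, including
    the intercept column.
    """
    n_features = X.shape[1]
    A = X.T @ X + alpha * np.eye(n_features)
    return np.linalg.solve(A, X.T @ y)


# Synthetic, noise-free data so OLS recovers the true coefficients.
rng = np.random.default_rng(0)
X_demo = np.column_stack([np.ones(50), rng.normal(size=(50, 2))])
true_beta = np.array([1.0, 2.0, -3.0])
y_demo = X_demo @ true_beta

beta_ols = fit_ols(X_demo, y_demo)
beta_ridge = fit_ridge(X_demo, y_demo, alpha=1.0)

print("OLS coefficients:  ", beta_ols)
print("Ridge coefficients:", beta_ridge)
```

The ridge coefficients have a smaller norm than the OLS ones, which is the shrinkage that the alpha parameter controls.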

Installation

From PyPI

pip install myml-regression

From source

git clone https://github.com/yourusername/myml-regression.git
cd myml-regression
pip install .

From local wheel

pip install dist/myml_regression-0.1.0-py3-none-any.whl
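
After any of the install routes above, one way to confirm that the distribution is visible to the current interpreter is to query its metadata with the standard library. This is only a sketch: installed_version is a hypothetical helper, not something the package provides.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string, or None if the distribution is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Prints the version string if installed, otherwise None.
print(installed_version("myml-regression"))
```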

Quick Start

import numpy as np
from myml.preprocessing import add_intercept, normalize_features, fill_missing_mean
from myml.regression import OLSRegression, RidgeRegression, LassoRegression
from myml.feature_selection import forward_selection
from myml.diagnostics import normality_test, vif

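# --- Sample data ---
# The two feature columns are not collinear and y is not an exact linear
# function of X, so X^T X is invertible, the VIF is finite, and the
# residuals are non-zero for the Shapiro-Wilk test.
X = np.array([[1, 2], [3, 1], [5, 6], [7, 5], [9, 9]], dtype=float)
y = np.array([1.6, 3.4, 5.7, 7.3, 9.6])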

# Preprocessing
X_norm = normalize_features(X)
X_int  = add_intercept(X_norm)

# OLS Regression
ols = OLSRegression()
ols.fit(X_int, y)
print("OLS predictions:", ols.predict(X_int))

# Ridge Regression
ridge = RidgeRegression(alpha=0.5)
ridge.fit(X_int, y)
print("Ridge predictions:", ridge.predict(X_int))

# Lasso Regression
lasso = LassoRegression(alpha=0.01, iterations=1000)
lasso.fit(X_int, y)
print("Lasso predictions:", lasso.predict(X_int))

# Diagnostics
residuals = y - ols.predict(X_int)
stat, p = normality_test(residuals)
print(f"Shapiro-Wilk p-value: {p:.4f}")
vif_scores = vif(X_int)
print("VIF scores:", vif_scores)
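
For readers unfamiliar with the Variance Inflation Factor reported above, the following NumPy-only sketch shows one common way to compute it: regress each feature on the remaining features (plus an intercept) and take 1 / (1 - R^2). The name vif_sketch is hypothetical, and the package's own vif function may differ in its input conventions and in how it treats an intercept column.

```python
import numpy as np


def vif_sketch(X):
    """VIF for each column of X, where X holds features only (no intercept column)."""
    X = np.asarray(X, dtype=float)
    n_samples, n_features = X.shape
    scores = np.empty(n_features)
    for j in range(n_features):
        target = X[:, j]
        # Regress column j on an intercept plus all other columns.
        others = np.column_stack([np.ones(n_samples), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(others, target, rcond=None)
        resid = target - others @ coef
        ss_res = resid @ resid
        ss_tot = np.sum((target - target.mean()) ** 2)
        r2 = 1.0 - ss_res / ss_tot
        # Perfect collinearity gives R^2 = 1, reported here as infinity.
        scores[j] = np.inf if np.isclose(r2, 1.0) else 1.0 / (1.0 - r2)
    return scores


# Demo: x3 is nearly a copy of x1, while x2 is independent of both.
rng = np.random.default_rng(42)
x1 = rng.normal(size=200)
x2 = rng.normal(size=200)
x3 = x1 + 0.1 * rng.normal(size=200)
demo_scores = vif_sketch(np.column_stack([x1, x2, x3]))
print("VIF per feature:", demo_scores)
```

A VIF near 1 indicates little linear dependence on the other features; a common rule of thumb treats values above roughly 5 to 10 as a sign of problematic multicollinearity.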

Dependencies

  • numpy >= 1.21
  • pandas >= 1.3
  • matplotlib >= 3.4
  • scipy >= 1.7

Publishing to PyPI

# Install build tools
pip install build twine

# Build distributions
python -m build

# Upload to TestPyPI first
twine upload --repository testpypi dist/*

# Upload to PyPI
twine upload dist/*

Project Structure

myml_package/
├── myml/
│   ├── __init__.py
│   ├── regression.py        # OLS, Ridge, Lasso
│   ├── preprocessing.py     # Missing values, normalization, outliers
│   ├── feature_selection.py # Forward selection, backward elimination
│   ├── diagnostics.py       # Shapiro-Wilk, VIF, heteroscedasticity
│   ├── visualization.py     # Actual vs Predicted, Residual plot
│   └── prediction.py        # predict_values helper
├── tests/
│   └── test_myml.py
├── pyproject.toml
├── setup.py
├── setup.cfg
├── MANIFEST.in
├── LICENSE
└── README.md

License

MIT
