Linear Regression from scratch — no ML libraries required.

linreg

Pure Python · Batch Gradient Descent · Z-score normalisation · Early stopping



Installation

pip install bizhani-linreg

linreg has a single runtime dependency: matplotlib (for plotting).
The model, scaler, metrics, and data utilities use only the Python standard library.


Quick start

from linreg import (
    LinearRegression,
    ZScoreScaler,
    train_test_split,
    r2_score, rmse, mae,
    plot_all,
)

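# 1. Prepare data  (list of lists, no NumPy needed)
#    Illustrative synthetic data; replace with your own.
import random
rng = random.Random(0)
X = [[rng.uniform(0, 10), rng.uniform(0, 10)] for _ in range(200)]   # shape (n_samples, n_features)
y = [3 * x1 + 5 * x2 + 10 + rng.gauss(0, 1) for x1, x2 in X]          # shape (n_samples,)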

# 2. Split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

# 3. Z-score normalise
x_scaler  = ZScoreScaler()
X_train_s = x_scaler.fit_transform(X_train)   # fit + transform on train
X_test_s  = x_scaler.transform(X_test)        # transform only on test

y_scaler  = ZScoreScaler()
y_train_s = y_scaler.fit_transform_1d(y_train)

# 4. Train
model = LinearRegression(lr=0.01, max_iter=10_000, tol=1e-6)
model.fit(X_train_s, y_train_s)

# 5. Predict  (invert target scaling to recover original units)
y_pred_s = model.predict(X_test_s)
y_pred   = y_scaler.inverse_transform_1d(y_pred_s)

# 6. Evaluate
print(f"R²   = {r2_score(y_test, y_pred):.4f}")
print(f"RMSE = {rmse(y_test, y_pred):.4f}")
print(f"MAE  = {mae(y_test, y_pred):.4f}")

# 7. Plot
fig = plot_all(model.loss_history, model.n_iter_, y_test, y_pred)
fig.savefig("results.png", dpi=150)

Clone the repo and run the bundled demo:

git clone https://github.com/MohsenBizhani/linreg.git
cd linreg
python main.py

API reference

LinearRegression

LinearRegression(lr=0.01, max_iter=10_000, tol=1e-6, verbose=True)

Method       Description
fit(X, y)    Train the model; returns self for chaining
predict(X)   Return predicted values
score(X, y)  Return R² on the given data
summary()    Return a string of fitted parameters and training stats

Key attributes after fit()

Attribute     Description
weights       Fitted weight vector [w₁, …, wₙ]
bias          Fitted intercept b
loss_history  MSE recorded at every iteration
n_iter_       Number of iterations actually run (fewer than max_iter if early stopping triggers)

ZScoreScaler

ZScoreScaler()

Method                          Description
fit_transform(X)                Fit on X, return scaled matrix
transform(X)                    Apply existing fit to new data
fit_transform_1d(y)             Fit on target vector, return scaled list
transform_1d(y)                 Apply existing fit to new target list
inverse_transform_1d(y_scaled)  Recover original units
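
For readers curious what z-score scaling and its inverse do to a target vector, here is a minimal pure-Python sketch. The helper names zscore_fit, zscore_apply, and zscore_invert are hypothetical and are not part of linreg. The use of the population standard deviation and the fallback for constant inputs are assumptions; ZScoreScaler may handle both differently.

```python
import math

def zscore_fit(values):
    """Return (mean, std) of a list (population standard deviation, assumed)."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    if std == 0:
        std = 1.0  # assumed policy: avoid division by zero for constant input
    return mean, std

def zscore_apply(values, mean, std):
    """Scale values to zero mean and unit variance using a previous fit."""
    return [(v - mean) / std for v in values]

def zscore_invert(scaled, mean, std):
    """Map scaled values back to the original units."""
    return [s * std + mean for s in scaled]

y_train = [2.0, 4.0, 6.0, 8.0]
mean, std = zscore_fit(y_train)               # statistics come from training data only
y_scaled = zscore_apply(y_train, mean, std)
y_back = zscore_invert(y_scaled, mean, std)   # round trip recovers y_train
```

The fit statistics are computed once on the training data and reused for any later data, which is why the quick start calls transform (not fit_transform) on the test set.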

Metrics

from linreg import mse, rmse, mae, r2_score

All functions accept two plain Python lists: (y_true, y_pred).
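
As a reference for what these four metrics measure, the sketch below writes out the standard textbook definitions over plain lists. The function names carry a _sketch suffix to make clear they are illustrative and not the library's code; linreg's own implementations may validate inputs or treat edge cases (such as a constant y_true) differently.

```python
import math

def mse_sketch(y_true, y_pred):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse_sketch(y_true, y_pred):
    """Root mean squared error, in the same units as y."""
    return math.sqrt(mse_sketch(y_true, y_pred))

def mae_sketch(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def r2_sketch(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

y_true = [3.0, 5.0, 7.0]
y_pred = [2.0, 5.0, 9.0]
print(mse_sketch(y_true, y_pred), mae_sketch(y_true, y_pred), r2_sketch(y_true, y_pred))
```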


Data utilities

from linreg import train_test_split, generate_dataset

Function                                  Description
train_test_split(X, y, test_size, seed)   Reproducible random split
generate_dataset(n_samples, noise, seed)  Synthetic y = 3x₁ + 5x₂ + 10 + ε
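
To illustrate how a seed makes both steps reproducible, here is a rough standard-library sketch. make_dataset and split are hypothetical stand-ins, not the library functions: the feature range (0 to 10), the Gaussian noise model, the rounding of the test-set size, and the return order are all assumptions chosen to mirror the quick start.

```python
import random

def make_dataset(n_samples, noise, seed):
    """Synthetic y = 3*x1 + 5*x2 + 10 + eps, with eps ~ N(0, noise) (assumed)."""
    rng = random.Random(seed)  # private generator, so global state is untouched
    X = [[rng.uniform(0.0, 10.0), rng.uniform(0.0, 10.0)] for _ in range(n_samples)]
    y = [3 * x1 + 5 * x2 + 10 + rng.gauss(0.0, noise) for x1, x2 in X]
    return X, y

def split(X, y, test_size, seed):
    """Shuffle row indices with a seeded generator, then slice off the test set."""
    rng = random.Random(seed)
    indices = list(range(len(X)))
    rng.shuffle(indices)
    n_test = int(round(len(X) * test_size))
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    return ([X[i] for i in train_idx], [X[i] for i in test_idx],
            [y[i] for i in train_idx], [y[i] for i in test_idx])

X, y = make_dataset(n_samples=50, noise=0.5, seed=42)
X_train, X_test, y_train, y_test = split(X, y, test_size=0.2, seed=42)
```

Shuffling indices rather than the rows themselves keeps each X row paired with its y value.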

Plotting

from linreg import plot_loss, plot_predictions, plot_residuals, plot_all

All functions return a matplotlib.figure.Figure and never call plt.show(), so you control display, saving, or notebook embedding.

Function                                        Description
plot_loss(loss_history, n_iter)                 MSE vs iteration with early-stop marker
plot_predictions(y_true, y_pred)                Predicted vs actual scatter
plot_residuals(y_true, y_pred)                  Residuals vs predicted scatter
plot_all(loss_history, n_iter, y_true, y_pred)  All three side-by-side

Algorithm

Model:       ŷ = b + w₁x₁ + w₂x₂ + … + wₙxₙ

Loss (MSE):  L = (1/n) Σ (ŷᵢ − yᵢ)²

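Gradients:   ∂L/∂wⱼ = (1/n) Σ (ŷᵢ − yᵢ) · xᵢⱼ
             ∂L/∂b  = (1/n) Σ (ŷᵢ − yᵢ)
             (the exact derivatives of L carry a factor 2/n; the constant 2 is
             omitted here, which only rescales the effective learning rate)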

Update:      wⱼ ← wⱼ − lr · ∂L/∂wⱼ
             b  ← b  − lr · ∂L/∂b

Stop when:   |MSE(t) − MSE(t−1)| < tol, or after max_iter iterations
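
The update rules above can be sketched in a few lines of pure Python. This is an illustrative reading of the formulas, not the code in _model.py: the function name fit_gd, the zero initialisation, and the point in the loop where the loss is recorded are assumptions.

```python
def fit_gd(X, y, lr=0.01, max_iter=10_000, tol=1e-6):
    """Batch gradient descent for linear regression with early stopping."""
    n, d = len(X), len(X[0])
    weights, bias = [0.0] * d, 0.0        # assumed starting point
    loss_history = []
    for _ in range(max_iter):
        # Model: y_hat = b + w1*x1 + ... + wn*xn for every sample
        preds = [bias + sum(w * x for w, x in zip(weights, row)) for row in X]
        errors = [p - t for p, t in zip(preds, y)]
        loss = sum(e * e for e in errors) / n          # MSE
        loss_history.append(loss)
        # Early stopping: the MSE changed by less than tol since the last step
        if len(loss_history) > 1 and abs(loss_history[-2] - loss) < tol:
            break
        # Gradients over the full batch, using the 1/n convention shown above
        grad_w = [sum(e * row[j] for e, row in zip(errors, X)) / n for j in range(d)]
        grad_b = sum(errors) / n
        weights = [w - lr * g for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b
    return weights, bias, loss_history

# Noise-free data from y = 2x + 1, so the fit should approach w = 2, b = 1
X = [[-1.0], [0.0], [1.0], [2.0]]
y = [-1.0, 1.0, 3.0, 5.0]
weights, bias, loss_history = fit_gd(X, y, lr=0.1, tol=1e-12)
```

Because every step uses all n samples, the loss curve is smooth, which is what makes a simple threshold on the change in MSE a workable stopping rule.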

Project layout

linreg/
├── __init__.py     public API surface
├── _math.py        dot, list_mean, list_std, transpose
├── _scaler.py      ZScoreScaler
├── _model.py       LinearRegression
├── _metrics.py     mse, rmse, mae, r2_score
├── _data.py        train_test_split, generate_dataset
└── plot.py         plot_loss, plot_predictions, plot_residuals, plot_all
tests/
├── test_math.py
├── test_scaler.py
├── test_metrics.py
├── test_data.py
├── test_model.py
└── test_integration.py
main.py             runnable demo
pyproject.toml      packaging
README.md           this file

Development

git clone https://github.com/MohsenBizhani/linreg.git
cd linreg
pip install -e ".[dev]"
pytest

License

MIT © Mohsen Bizhani
