linreg

Linear Regression from scratch — no ML libraries required.
Pure Python · Batch Gradient Descent · Z-score normalisation · Early stopping


Installation

# Editable (development) install — changes to the source are reflected immediately
pip install -e .

# Or a regular install
pip install .

linreg has a single runtime dependency: matplotlib (for plotting).
The model, scaler, metrics, and data utilities use only the Python standard library.


Quick start

from linreg import (
    LinearRegression,
    ZScoreScaler,
    train_test_split,
    r2_score, rmse, mae,
    plot_all,
)

# 1. Prepare data (plain lists of lists; no NumPy needed).
#    x1, x2, y1, ... are placeholders: substitute your own numeric values.
X = [[x1, x2], ...]   # shape (n_samples, n_features)
y = [y1, y2, ...]     # shape (n_samples,)

# 2. Split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

# 3. Z-score normalise
x_scaler  = ZScoreScaler()
X_train_s = x_scaler.fit_transform(X_train)   # fit + transform on train
X_test_s  = x_scaler.transform(X_test)        # transform only on test

y_scaler  = ZScoreScaler()
y_train_s = y_scaler.fit_transform_1d(y_train)
y_test_s  = y_scaler.transform_1d(y_test)

# 4. Train
model = LinearRegression(lr=0.01, max_iter=10_000, tol=1e-6)
model.fit(X_train_s, y_train_s)

# 5. Predict  (invert target scaling to recover original units)
y_pred_s = model.predict(X_test_s)
y_pred   = y_scaler.inverse_transform_1d(y_pred_s)

# 6. Evaluate
print(f"R²   = {r2_score(y_test, y_pred):.4f}")
print(f"RMSE = {rmse(y_test, y_pred):.4f}")
print(f"MAE  = {mae(y_test, y_pred):.4f}")

# 7. Plot
fig = plot_all(model.loss_history, model.n_iter_, y_test, y_pred)
fig.savefig("results.png", dpi=150)

Run the bundled demo:

python main.py

API reference

LinearRegression

LinearRegression(lr=0.01, max_iter=10_000, tol=1e-6, verbose=True)

Method        Description
fit(X, y)     Train the model; returns self for chaining
predict(X)    Return predicted values
score(X, y)   Return R² on the given data
summary()     Print fitted parameters and training stats

Key attributes after fit()

Attribute      Description
weights        Fitted weight vector [w₁, …, wₙ]
bias           Fitted intercept b
loss_history   MSE at every iteration
n_iter_        Number of iterations actually run (fewer than max_iter if early stopping triggers)

ZScoreScaler

ZScoreScaler()

Method                           Description
fit_transform(X)                 Fit on X, return scaled matrix
transform(X)                     Apply existing fit to new data
fit_transform_1d(y)              Fit on target vector, return scaled list
transform_1d(y)                  Apply existing fit to new target list
inverse_transform_1d(y_scaled)   Recover original units
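
To make the fit / transform / inverse round trip concrete, here is a minimal standalone sketch of z-score scaling on a 1-D target. It is not the package's source: the function names are hypothetical, and the use of the population standard deviation and the fallback for constant inputs are assumptions that ZScoreScaler may handle differently.

```python
import math

def zscore_fit(values):
    """Return (mean, std) of a list of numbers (population std, an assumption)."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    std = math.sqrt(var)
    # Assumed guard: a constant input has std 0, so fall back to 1.0
    # to avoid dividing by zero.
    return mean, (std if std > 0 else 1.0)

def zscore_transform(values, mean, std):
    """Scale values with statistics computed elsewhere (e.g. on the training set)."""
    return [(v - mean) / std for v in values]

def zscore_inverse(scaled, mean, std):
    """Map scaled values back to the original units."""
    return [s * std + mean for s in scaled]

y_train = [10.0, 20.0, 30.0]
mean, std = zscore_fit(y_train)                   # statistics come from train only
y_scaled = zscore_transform(y_train, mean, std)   # centred on 0
y_back = zscore_inverse(y_scaled, mean, std)      # recovers y_train up to rounding
```

The same mean and std are reused for the test set, which is why the quick start calls fit_transform on the training data and only transform on the test data.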

Metrics

from linreg import mse, rmse, mae, r2_score

All functions accept two plain Python lists: (y_true, y_pred).
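
For reference, the four metrics follow the standard textbook definitions. The sketch below re-implements them on plain lists; the sketch_ prefix marks the names as hypothetical, and the package's own functions may differ in edge-case handling (for example when y_true is constant, where R² is undefined).

```python
import math

def sketch_mse(y_true, y_pred):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def sketch_rmse(y_true, y_pred):
    """Root mean squared error, in the same units as y."""
    return math.sqrt(sketch_mse(y_true, y_pred))

def sketch_mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def sketch_r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot  # raises ZeroDivisionError if y_true is constant

y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.0, 2.0, 3.0, 6.0]   # only the last prediction is off, by 2
metrics = {
    "mse": sketch_mse(y_true, y_pred),
    "rmse": sketch_rmse(y_true, y_pred),
    "mae": sketch_mae(y_true, y_pred),
    "r2": sketch_r2(y_true, y_pred),
}
```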


Data utilities

from linreg import train_test_split, generate_dataset

Function                                   Description
train_test_split(X, y, test_size, seed)    Reproducible random split
generate_dataset(n_samples, noise, seed)   Synthetic y = 3x₁ + 5x₂ + 10 + ε
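
As an illustration of what "reproducible" means here, the following sketch builds a synthetic dataset and a seeded split using only the standard library. The function names are hypothetical, and the feature range (0 to 10), the Gaussian noise model, and the rounding of the test-set size are assumptions; only the formula y = 3x₁ + 5x₂ + 10 + ε and the return order used in the quick start come from this README.

```python
import random

def make_dataset(n_samples, noise, seed):
    """Generate y = 3*x1 + 5*x2 + 10 + eps with a private, seeded RNG."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n_samples):
        x1, x2 = rng.uniform(0, 10), rng.uniform(0, 10)   # assumed feature range
        X.append([x1, x2])
        y.append(3 * x1 + 5 * x2 + 10 + rng.gauss(0, noise))
    return X, y

def split(X, y, test_size=0.2, seed=42):
    """Shuffle row indices with a seeded RNG, then slice into test and train."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    n_test = int(round(len(X) * test_size))
    test_idx, train_idx = idx[:n_test], idx[n_test:]
    return ([X[i] for i in train_idx], [X[i] for i in test_idx],
            [y[i] for i in train_idx], [y[i] for i in test_idx])

X, y = make_dataset(n_samples=50, noise=1.0, seed=0)
X_train, X_test, y_train, y_test = split(X, y, test_size=0.2, seed=42)
```

Using a dedicated random.Random(seed) instance rather than the module-level functions keeps the result independent of any other code that touches the global random state.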

Plotting

from linreg import plot_loss, plot_predictions, plot_residuals, plot_all

All functions return a matplotlib.figure.Figure and never call plt.show(), so you control display, saving, or embedding.

Function                                         Description
plot_loss(loss_history, n_iter)                  MSE vs iteration with early-stop marker
plot_predictions(y_true, y_pred)                 Predicted vs actual scatter
plot_residuals(y_true, y_pred)                   Residuals vs predicted scatter
plot_all(loss_history, n_iter, y_true, y_pred)   All three side-by-side

Algorithm

Model:       ŷ = b + w₁x₁ + w₂x₂ + … + wₙxₙ

Loss (MSE):  L = (1/n) Σ (ŷᵢ − yᵢ)²

Gradients:   ∂L/∂wⱼ = (2/n) Σ (ŷᵢ − yᵢ) · xᵢⱼ
             ∂L/∂b  = (2/n) Σ (ŷᵢ − yᵢ)

Update:      wⱼ ← wⱼ − lr · ∂L/∂wⱼ
             b  ← b  − lr · ∂L/∂b

Stop when:   |MSE(t) − MSE(t−1)| < tol
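
The update rule and stopping criterion can be sketched in a few lines of pure Python. This is a standalone illustration, not the code in _model.py: fit_gd is a hypothetical name, parameters start at zero by assumption, and the gradient uses the exact derivative of the MSE (including the factor 2), which differs from a factor-free version only by a rescaling of lr.

```python
def fit_gd(X, y, lr=0.01, max_iter=10_000, tol=1e-6):
    """Batch gradient descent on MSE with early stopping on the loss change."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0          # assumed zero initialisation
    history = []
    for _ in range(max_iter):
        # Forward pass over the full batch.
        preds = [b + sum(wj * xj for wj, xj in zip(w, row)) for row in X]
        errors = [p - t for p, t in zip(preds, y)]
        loss = sum(e * e for e in errors) / n
        history.append(loss)
        # Early stopping: |MSE(t) - MSE(t-1)| < tol.
        if len(history) > 1 and abs(history[-2] - loss) < tol:
            break
        # Exact MSE gradients, then one simultaneous update of all parameters.
        grad_w = [2.0 / n * sum(e * row[j] for e, row in zip(errors, X))
                  for j in range(d)]
        grad_b = 2.0 / n * sum(errors)
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b, history

# Tiny noiseless example: y = 2x, already centred so no scaling is needed.
X = [[-1.0], [0.0], [1.0]]
y = [-2.0, 0.0, 2.0]
w, b, history = fit_gd(X, y, lr=0.1, tol=1e-12)
```

On this example the loop stops well before max_iter because the loss change drops below tol, which is the behaviour n_iter_ reports in the fitted model.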

Project layout

linreg/
├── __init__.py     public API surface
├── _math.py        dot, list_mean, list_std, transpose
├── _scaler.py      ZScoreScaler
├── _model.py       LinearRegression
├── _metrics.py     mse, rmse, mae, r2_score
├── _data.py        train_test_split, generate_dataset
└── plot.py         plot_loss, plot_predictions, plot_residuals, plot_all
main.py             runnable demo
setup.py            packaging
README.md           this file

License

MIT
