An Ordinary Least Squares Regressor

OLS_Regressor

License: MIT | Python: 3.9.0 | Project status: Active (the project has reached a stable, usable state and is being actively developed)

📌 About

The OLS Regression Package is a Python library designed to streamline Ordinary Least Squares (OLS) regression analysis. Whether you're a data scientist, researcher, or analyst, this package aims to provide a simple and efficient tool for fitting linear models to your data. It fits a linear model with coefficients w = (w1, w2, ..., wn) that minimizes the Residual Sum of Squares (RSS) between the observed target values in the dataset and the targets predicted by the linear approximation.
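As an illustration of what minimizing the RSS means, here is a minimal NumPy sketch. It is not the package's implementation: the helper names fit_ols and rss are hypothetical, and fitting an intercept term is an assumption made for this example.

```python
import numpy as np

def fit_ols(X, y):
    """Return (intercept, coefficients) minimizing the residual sum of squares."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Prepend a column of ones so the first solved parameter is the intercept
    # (assumption for this sketch; the package may handle this differently).
    design = np.column_stack([np.ones(X.shape[0]), X])
    # lstsq solves min ||design @ params - y||^2, which is exactly the RSS.
    params, *_ = np.linalg.lstsq(design, y, rcond=None)
    return params[0], params[1:]

def rss(X, y, intercept, coef):
    """Residual sum of squares of a linear model on (X, y)."""
    pred = intercept + np.asarray(X, dtype=float) @ coef
    residuals = np.asarray(y, dtype=float) - pred
    return float(residuals @ residuals)

# Noise-free data generated from y = 1 + 2*x1 + 3*x2
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]])
y = 1.0 + 2.0 * X[:, 0] + 3.0 * X[:, 1]
intercept, coef = fit_ols(X, y)
```

On this noise-free data the fit recovers the generating intercept and coefficients up to floating-point error, so the RSS is effectively zero.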

💻 Installation

Install the package from PyPI

Run this command to install the ols_regressor package from PyPI:

pip install ols_regressor

Install the package from GitHub

If installation from PyPI is unsuccessful, run the following commands to install from GitHub.

Clone the repository

Open your terminal, navigate to where you would like the repository to be cloned, and run the following command:

git clone git@github.com:UBC-MDS/OLS_regressor.git

Create the conda environment and activate it

Run the following command to create the conda environment, which includes the required Python and Poetry versions:

conda create --name ols_regressor python=3.9 poetry==1.7.1 -y

Next, run the following command to activate the conda environment we created:

conda activate ols_regressor

Install the package using Poetry

From the root of the cloned repository, run the following command to install the package ols_regressor:

poetry install

💡 Functions

  • fit: Fits the linear model according to the OLS mechanism.
  • predict: Predicts target values using the fitted linear model.
  • score: Calculates the coefficient of determination (R-squared) for the predictions.
  • cross_validate: Performs cross-validated Ordinary Least Squares (OLS) regression.

⭐ Usage

This guide provides a quick start to using the OLS_Regressor package, specifically the LinearRegressor class, to perform linear regression analysis. The package offers simple-to-use methods for fitting the model, making predictions, and evaluating performance. For more detailed usage, please see the vignette.

Importing the LinearRegressor

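# Python imports are case-sensitive; the module name is assumed here to match
# the installed package name, ols_regressor.
from ols_regressor.regressor import LinearRegressor
from ols_regressor.cross_validate import cross_validate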
from sklearn.model_selection import train_test_split
from sklearn.datasets import make_regression

Fitting the Model

To fit the linear regression model, you need to have your dataset ready, typically split into features (X) and the target variable (y). Here's how you can fit the model:

# Assuming X and y are your features and target variable respectively
regressor = LinearRegressor()
regressor.fit(X, y)
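The snippets in this guide assume that X, y, X_test, and y_test already exist. One possible way to produce them, using the two scikit-learn helpers imported earlier, is sketched below. The sample count, feature count, noise level, and random seeds are arbitrary choices for illustration.

```python
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split

# Synthetic regression problem: 200 samples, 3 features, mild Gaussian noise.
X, y = make_regression(n_samples=200, n_features=3, noise=5.0, random_state=42)

# Hold out 20% of the rows for evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
```

With this split, X_train and y_train would be passed to fit, and X_test and y_test would be kept for score.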

Making Predictions

Once the model is trained, you can make predictions on new data:

# Assuming X_new represents new data
predictions = regressor.predict(X_new)

Evaluating the Model

To evaluate the performance of your model, you can use the score method, which by default provides the R-squared value of the predictions:

# Evaluating the model on test data
r_squared = regressor.score(X_test, y_test)
print(f"R-squared: {r_squared}")
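For reference, R-squared is conventionally defined as 1 - RSS / TSS, where TSS is the total sum of squares around the mean of the observed targets. The sketch below computes it in plain Python. The function name r_squared is hypothetical, and this is not necessarily how the score method is implemented internally.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - RSS / TSS."""
    mean_y = sum(y_true) / len(y_true)
    # Residual sum of squares: squared error of the predictions.
    rss = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: squared error of always predicting the mean.
    # Note: tss is zero when y_true is constant, which would divide by zero.
    tss = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - rss / tss

y_true = [1.0, 2.0, 3.0, 4.0]
perfect = r_squared(y_true, y_true)                    # predictions match exactly
mean_only = r_squared(y_true, [2.5, 2.5, 2.5, 2.5])    # always predict the mean
```

A perfect prediction scores 1, while always predicting the mean of the targets scores 0. A model that does worse than the mean gets a negative value.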

Cross-Validation

The OLS_Regressor package provides a cross_validate function to help evaluate the model's performance across different partitions of the dataset, ensuring a more robust assessment than using a single train-test split.

To use cross_validate, you must first import it from the package, then provide it with your dataset and the model you wish to evaluate. Here's an example:

# Creating an instance of LinearRegressor
model = LinearRegressor()

# Performing cross-validation
results = cross_validate(model, X, y, cv=5)  # cv is the number of folds

# Printing the results
print("Cross-validation results:", results)
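To make the idea of evaluating across partitions concrete, here is a minimal k-fold sketch in NumPy. It is an assumption-laden illustration rather than the package's cross_validate: the function name k_fold_r2, the shuffling, the seed parameter, and the list-of-scores return format are all hypothetical.

```python
import numpy as np

def k_fold_r2(X, y, cv=5, seed=0):
    """Return one held-out R-squared score per fold for a plain OLS fit."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    rng = np.random.default_rng(seed)
    # Shuffle row indices once, then cut them into cv nearly equal folds.
    folds = np.array_split(rng.permutation(len(y)), cv)
    scores = []
    for i, test_idx in enumerate(folds):
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
        # Fit OLS with an intercept on the training rows only.
        design = np.column_stack([np.ones(len(train_idx)), X[train_idx]])
        params, *_ = np.linalg.lstsq(design, y[train_idx], rcond=None)
        # Score the fitted model on the held-out rows.
        pred = params[0] + X[test_idx] @ params[1:]
        rss = np.sum((y[test_idx] - pred) ** 2)
        tss = np.sum((y[test_idx] - y[test_idx].mean()) ** 2)
        scores.append(1.0 - rss / tss)
    return scores

rng = np.random.default_rng(1)
X = rng.normal(size=(50, 2))
y = 4.0 + X @ np.array([1.5, -2.0])  # noise-free linear target
scores = k_fold_r2(X, y, cv=5)
```

Each row is held out exactly once, so every score reflects data the model did not see during fitting.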

🧪 Auto-test

To run the automated tests with pytest, run the following command in a terminal or other command-line tool:

pytest tests/

OLS_Regressor in the Python ecosystem

The OLS Regression Package fits into the Python ecosystem as a specialized solution for Ordinary Least Squares (OLS) regression analysis. While various Python libraries provide general-purpose machine learning and statistical functionality, our package focuses specifically on the simplicity and efficiency of linear regression. scikit-learn is a widely used machine learning library that includes regression among its many capabilities. Our package distinguishes itself by providing a lightweight and user-friendly interface for users who want a straightforward OLS regression tool without the overhead of a full machine learning or statistics library. If your needs align more closely with a broader set of machine learning tools or comprehensive statistical modeling, scikit-learn or statsmodels may be suitable alternatives. As of 2024-01-12, we are not aware of an existing package that caters specifically to OLS regression with our package's emphasis on simplicity and ease of use.

🤝 Contributors

  • Xia Yimeng (@YimengXia)
  • Sifan Zhang (@Sifanzzz)
  • Charles Xu (@charlesxch)
  • Waleed Mahmood (@WaleedMahmood1)

🌏 Contributing

Interested in contributing? Check out the contributing guidelines. Please note that this project is released with a Code of Conduct. By contributing to this project, you agree to abide by its terms.

📗 License

OLS_Regressor is licensed under the terms of the MIT license.

👏 Credits

OLS_Regressor was created with cookiecutter and the py-pkgs-cookiecutter template.

🖇️ References

Giriyewithana, N. (2023). Australian Vehicle Prices [Data set]. Kaggle. https://www.kaggle.com/datasets/nelgiriyewithana/australian-vehicle-prices

Michael, B. (2023, February 23). How Does Linear Regression Really Work. Towards Data Science. https://towardsdatascience.com/how-does-linear-regression-really-work-2387d0f11e8

scikit-learn: Machine Learning in Python. https://scikit-learn.org/stable/

pandas: A Foundational Python Library for Data Analysis and Statistics. https://pandas.pydata.org/

pytest: helps you write better programs. https://pytest.org/
