An Ordinary Least Squares Regressor
OLS_Regressor
📌 About
The OLS Regression Package is a Python library designed to streamline Ordinary Least Squares (OLS) regression analysis. Whether you're a data scientist, researcher, or analyst, this package aims to provide a simple and efficient tool for fitting linear models to your data. It fits a linear model with coefficients w = (w1, w2, ..., wn) that minimizes the Residual Sum of Squares (RSS) between the observed target values in the dataset and the targets predicted by the linear approximation.
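To make the RSS objective concrete, here is a minimal NumPy sketch of an OLS fit. It is not the package's implementation: the helper names `fit_ols` and `rss` are hypothetical, and including an intercept term is an assumption made for illustration.

```python
import numpy as np


def fit_ols(X, y):
    """Return (intercept, coefficients) that minimize the residual sum of squares.

    Hypothetical helper for illustration; not part of ols_regressor.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Prepend a column of ones so the intercept is estimated with the weights.
    X_design = np.column_stack([np.ones(X.shape[0]), X])
    # lstsq solves min ||X_design @ beta - y||^2, which is the RSS objective.
    beta, *_ = np.linalg.lstsq(X_design, y, rcond=None)
    return beta[0], beta[1:]


def rss(X, y, intercept, coef):
    """Residual sum of squares of the linear predictions against y."""
    residuals = np.asarray(y, dtype=float) - (intercept + np.asarray(X, dtype=float) @ coef)
    return float(residuals @ residuals)


# Noise-free data generated from y = 1 + 2*x1 + 3*x2
X_demo = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]])
y_demo = 1.0 + 2.0 * X_demo[:, 0] + 3.0 * X_demo[:, 1]

intercept, coef = fit_ols(X_demo, y_demo)
print(intercept, coef, rss(X_demo, y_demo, intercept, coef))
```

Because the demo data is exactly linear, the recovered intercept and coefficients match the generating values up to floating-point error, and the RSS is effectively zero.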
💻 Installation
Install the package from PyPI
Run this command to install the ols_regressor package from PyPI:
pip install ols_regressor
Install the package from GitHub
If installation from PyPI is unsuccessful, run the following commands to install from GitHub.
Clone the repository
Open your terminal, navigate to where you would like the repository to be cloned, and run the following command:
$ git clone git@github.com:UBC-MDS/OLS_regressor.git
Create the conda environment and activate it
Run the following command to create the conda environment, which will include the necessary Python and Poetry versions:
conda create --name ols_regressor python=3.9 poetry==1.7.1 -y
Next, run the following command to activate the conda environment we created:
conda activate ols_regressor
Install the package using Poetry
Run the following command to install the package ols_regressor:
poetry install
💡 Functions
- fit: Fits the linear model according to the OLS mechanism.
- predict: Predicts target values using the fitted linear model.
- score: Calculates the coefficient of determination (R-squared) for the prediction.
- cross_validate: Performs cross-validated Ordinary Least Squares (OLS) regression.
⭐ Usage
This guide provides a quick start to using the OLS_Regressor package, specifically the LinearRegressor class, to perform linear regression analysis. The package offers simple-to-use methods for fitting the model, making predictions, and evaluating performance. For detailed usage, please see the vignette.
Importing the LinearRegressor
# Import path assumed to match the installed package name (ols_regressor)
from ols_regressor.regressor import LinearRegressor
from ols_regressor.cross_validate import cross_validate
from sklearn.model_selection import train_test_split
from sklearn.datasets import make_regression
Fitting the Model
To fit the linear regression model, you need to have your dataset ready, typically split into features (X) and the target variable (y). Here's how you can fit the model:
# Assuming X and y are your features and target variable respectively
regressor = LinearRegressor()
regressor.fit(X, y)
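The snippets in this guide assume that X, y, X_test, and y_test already exist. One possible way to produce them is with the scikit-learn helpers imported earlier. The sample size, feature count, noise level, and random seed below are arbitrary values chosen for illustration.

```python
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split

# Synthetic regression data: 100 samples, 3 features, Gaussian noise on the target.
X, y = make_regression(n_samples=100, n_features=3, noise=10.0, random_state=42)

# Hold out 20% of the rows for evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

print(X_train.shape, X_test.shape)
```

With a split like this, you would typically fit on X_train and y_train and keep X_test and y_test for scoring.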
Making Predictions
Once the model is trained, you can make predictions on new data:
# Assuming X_new represents new data
predictions = regressor.predict(X_new)
Evaluating the Model
To evaluate the performance of your model, you can use the score method, which by default provides the R-squared value of the predictions:
# Evaluating the model on test data
r_squared = regressor.score(X_test, y_test)
print(f"R-squared: {r_squared}")
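For reference, the coefficient of determination is conventionally defined as 1 - RSS / TSS, where TSS is the total sum of squares around the mean of the observed targets. The sketch below computes it in plain Python. The function name `r_squared_manual` is hypothetical, and whether the package's score method handles edge cases (such as a constant target) the same way is not confirmed here.

```python
def r_squared_manual(y_true, y_pred):
    """Coefficient of determination: 1 - RSS / TSS.

    Hypothetical helper for illustration; undefined when y_true is constant.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    mean_y = sum(y_true) / len(y_true)
    # Residual sum of squares: squared errors of the predictions.
    rss = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: squared deviations from the mean target.
    tss = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - rss / tss


y_true_demo = [1.0, 2.0, 3.0, 4.0]
y_pred_demo = [1.0, 2.0, 3.0, 5.0]
r2_demo = r_squared_manual(y_true_demo, y_pred_demo)
print(f"R-squared: {r2_demo}")
```

Here RSS is 1.0 and TSS is 5.0, so the result is 0.8. A perfect prediction gives 1.0, and a model that does worse than always predicting the mean gives a negative value.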
Cross-Validation
The OLS_Regressor package provides a cross_validate function to help evaluate the model's performance across different partitions of the dataset, ensuring a more robust assessment than using a single train-test split.
To use cross_validate, you must first import it from the package, then provide it with your dataset and the model you wish to evaluate. Here's an example:
# Creating an instance of LinearRegressor
model = LinearRegressor()
# Performing cross-validation
results = cross_validate(model, X, y, cv=5) # cv is the number of folds
# Printing the results
print("Cross-validation results:", results)
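To illustrate what k-fold cross-validation does conceptually, here is a self-contained NumPy sketch that shuffles the rows, splits them into cv folds, fits OLS on the remaining folds, and scores R-squared on the held-out fold. It is not the package's cross_validate: the function name `k_fold_r2`, the shuffling, the intercept term, and the returned list of scores are all assumptions made for illustration.

```python
import numpy as np


def k_fold_r2(X, y, cv=5, seed=0):
    """Return one held-out R-squared score per fold (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    rng = np.random.default_rng(seed)
    # Shuffle row indices, then split them into cv roughly equal folds.
    folds = np.array_split(rng.permutation(len(y)), cv)
    scores = []
    for i in range(cv):
        test_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(cv) if j != i])
        # Fit OLS (with intercept) on the training folds only.
        A_train = np.column_stack([np.ones(len(train_idx)), X[train_idx]])
        beta, *_ = np.linalg.lstsq(A_train, y[train_idx], rcond=None)
        # Score on the held-out fold: R-squared = 1 - RSS / TSS.
        A_test = np.column_stack([np.ones(len(test_idx)), X[test_idx]])
        residuals = y[test_idx] - A_test @ beta
        tss = np.sum((y[test_idx] - y[test_idx].mean()) ** 2)
        scores.append(1.0 - np.sum(residuals ** 2) / tss)
    return scores


# Noise-free linear data, so every fold should score very close to 1.
rng_demo = np.random.default_rng(1)
X_cv = rng_demo.normal(size=(50, 2))
y_cv = 0.5 + X_cv @ np.array([1.5, -2.0])

cv_scores = k_fold_r2(X_cv, y_cv, cv=5)
print(cv_scores)
```

Each row is used for evaluation exactly once, which is why averaging the per-fold scores tends to give a steadier estimate than a single train-test split.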
🧪 Auto-test
To run the automated tests with pytest, run the following command in a terminal or other command-line tool:
pytest tests/
✅ OLS_Regressor use in Python ecosystem
The OLS Regression Package fits into the Python ecosystem as a specialized solution for Ordinary Least Squares (OLS) regression analysis. While various Python libraries provide general-purpose machine learning and statistical functionality, our package focuses specifically on the simplicity and efficiency of linear regression. scikit-learn is a widely used machine learning library that includes regression among its many capabilities. Our package distinguishes itself by providing a lightweight and user-friendly interface for users seeking a straightforward solution for OLS regression without the overhead of extensive machine learning or statistical functionality. If your needs align more closely with a broader set of machine learning tools or comprehensive statistical modeling, scikit-learn or statsmodels may be suitable alternatives. As of 2024-01-12, to our knowledge, no existing package caters specifically to OLS regression with our package's emphasis on simplicity and ease of use.
🤝 Contributors
- Xia Yimeng (@YimengXia)
- Sifan Zhang (@Sifanzzz)
- Charles Xu (@charlesxch)
- Waleed Mahmood (@WaleedMahmood1)
🌏 Contributing
Interested in contributing? Check out the contributing guidelines. Please note that this project is released with a Code of Conduct. By contributing to this project, you agree to abide by its terms.
📗 License
OLS_Regressor is licensed under the terms of the MIT license.
👏 Credits
OLS_Regressor was created with cookiecutter and the py-pkgs-cookiecutter template.
🖇️ References
Giriyewithana, N. (2023). Australian Vehicle Prices [Data set]. Kaggle. https://www.kaggle.com/datasets/nelgiriyewithana/australian-vehicle-prices
Michael, B. (2023, February 23). How Does Linear Regression Really Work. Towards Data Science. https://towardsdatascience.com/how-does-linear-regression-really-work-2387d0f11e8
scikit-learn: Machine Learning in Python. https://scikit-learn.org/stable/
pandas: A Foundational Python Library for Data Analysis and Statistics. https://pandas.pydata.org/
pytest: Helps You Write Better Programs. https://pytest.org/