A simple package for automatic logistic regression
Auto Logistic Regression
Overview
This project implements an Auto Logistic Regression framework for easy model training, evaluation, and prediction. The framework includes two main classes: AutoLogisticRegression and AutoPreprocessor.
AutoLogisticRegression Class
The AutoLogisticRegression class automates logistic regression model training and evaluation. Key functionalities include:
- Initialization: Accepts the path to the training dataset (data_path), the target column name (target_column), the output path for the model and predictions (output_path), and an optional parameter for the number of folds in cross-validation (num_folds).
- Training: Uses logistic regression with cross-validation to train a model on the provided dataset. The best model is saved as a pickle file for future use.
- Prediction: Given a new dataset, the trained model can be used to make predictions. The predictions are saved to a CSV file at the specified output path.
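The package's internals are not shown here, so the following is only a minimal standard-library sketch of the two ideas the Training step relies on: splitting the rows into num_folds cross-validation folds, and saving a fitted model as a pickle file. The helper names (k_fold_indices, save_model, load_model), the file name best_model.pkl, and the dictionary standing in for a fitted model are all assumptions made for illustration, not part of autolr.

```python
import pickle
import random
import tempfile
from pathlib import Path


def k_fold_indices(n_samples, num_folds, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into num_folds roughly equal folds.
    folds = [indices[i::num_folds] for i in range(num_folds)]
    for i, test_idx in enumerate(folds):
        # Every fold except the i-th one is used for training.
        train_idx = [j for k, fold in enumerate(folds) if k != i for j in fold]
        yield train_idx, test_idx


def save_model(model, path):
    """Serialize any picklable model object to disk."""
    with open(path, "wb") as fh:
        pickle.dump(model, fh)


def load_model(path):
    """Load a model previously written by save_model."""
    with open(path, "rb") as fh:
        return pickle.load(fh)


# Example: 5 folds over 10 samples, then a pickle round trip.
folds = list(k_fold_indices(10, num_folds=5))

with tempfile.TemporaryDirectory() as tmp:
    model_path = Path(tmp) / "best_model.pkl"  # hypothetical file name
    # A plain dict stands in for a fitted model (assumption).
    save_model({"coef": [0.5, -1.2], "intercept": 0.1}, model_path)
    restored = load_model(model_path)
```

Each sample appears in exactly one test fold, so every row is used for validation once. Only load pickle files from sources you trust, since unpickling can execute arbitrary code.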
AutoPreprocessor Class
The AutoPreprocessor class handles the preprocessing steps required before training or making predictions with the logistic regression model. Key functionalities include:
- Initialization: Accepts optional parameters for specifying categorical columns (categorical_columns), numeric columns (numeric_columns), and whether to perform oversampling (oversample).
- Fit and Transform: Fits an imputer, encoder (for categorical columns), and scaler (for numeric columns) on the provided data. The transformations are then applied to the data.
- Oversampling: Optionally applies oversampling to balance the class distribution in the training data.
- Split Data: Splits the data into training and testing sets.
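To make the preprocessing steps above concrete, here is a small standard-library sketch of mean imputation, standard scaling, one-hot encoding, and random oversampling. It is an illustration of the general techniques under stated assumptions, not the AutoPreprocessor implementation: the function names are hypothetical, missing values are assumed to be represented as None, and the scaling statistics are computed from observed values only.

```python
import random
from collections import Counter
from statistics import mean, pstdev


def fit_numeric(values):
    """Learn the imputation mean and scaling std from the observed values."""
    observed = [v for v in values if v is not None]
    mu = mean(observed)
    sigma = pstdev(observed) or 1.0  # avoid dividing by zero for constant columns
    return mu, sigma


def transform_numeric(values, mu, sigma):
    """Replace missing values with the mean, then standardize."""
    return [((mu if v is None else v) - mu) / sigma for v in values]


def fit_categories(values):
    """Collect the sorted set of categories seen during fitting."""
    return sorted({v for v in values if v is not None})


def one_hot(values, categories):
    """Encode each value as a 0/1 vector; unseen values become all zeros."""
    return [[1 if v == c else 0 for c in categories] for v in values]


def random_oversample(rows, labels, seed=0):
    """Duplicate random minority-class rows until all classes are equally sized."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for label, count in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == label]
        for _ in range(target - count):
            out_rows.append(rng.choice(pool))
            out_labels.append(label)
    return out_rows, out_labels


# Example: fit on training data, then apply the same parameters to it.
mu, sigma = fit_numeric([1.0, None, 3.0])
scaled = transform_numeric([1.0, None, 3.0], mu, sigma)
encoded = one_hot(["red", "blue", "red"], fit_categories(["red", "blue", "red"]))
balanced_rows, balanced_labels = random_oversample([[1], [2], [3], [4]], [0, 0, 0, 1])
```

The parameters are learned once in the fit step and reused in the transform step, which is what lets the same transformation be applied to new data at prediction time. Oversampling should be applied to training data only, as the list above notes.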
Usage
- Importing:
from autolr import AutoLogisticRegression
- Initialization: Create an instance of the AutoLogisticRegression class by providing the path to the training dataset, the target column name, and the output path for the model and predictions. Optionally, you can specify the number of folds for cross-validation.
auto_lr = AutoLogisticRegression(data_path='path/to/training_data.csv', target_column='target', output_path='output', num_folds=5)
- Training Model: Call train() to fit the model. Once training finishes, the model is automatically evaluated with accuracy, a classification report, and a confusion matrix, and the feature importance is displayed.
# Training the model and evaluating it
auto_lr.train()
- Prediction: After training, you can use the trained model to make predictions on new data by providing the path to the new dataset.
predictions = auto_lr.predict(data_path='path/to/test_data.csv')
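For readers unfamiliar with the evaluation metrics mentioned in the training step, the sketch below computes accuracy, a confusion matrix, and per-class precision and recall (the core numbers in a classification report) from scratch. The function names and the toy labels are assumptions for illustration; the package's own report format is not shown here and may differ.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def confusion_matrix(y_true, y_pred, labels):
    """Rows are true labels, columns are predicted labels, in `labels` order."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]


def precision_recall(y_true, y_pred, positive):
    """Precision and recall for one class treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy example with made-up labels.
y_true = [1, 0, 1, 1, 0]
y_pred = [1, 0, 0, 1, 1]

acc = accuracy(y_true, y_pred)                        # 3 of 5 correct
cm = confusion_matrix(y_true, y_pred, labels=[0, 1])  # [[1, 1], [1, 2]]
precision, recall = precision_recall(y_true, y_pred, positive=1)
```

Reading the confusion matrix row by row shows where the model goes wrong: here one true 0 was predicted as 1 and one true 1 was predicted as 0.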
Dependencies
Notes
- Ensure that the required dependencies are installed before running the code.
- The AutoPreprocessor class is used internally for data preprocessing and is not intended for standalone use.
- Customize and extend this framework based on your specific needs.
- If you encounter any issues or have suggestions for improvements, please let us know.
File details
Details for the file autolr-0.0.2.tar.gz.
File metadata
- Download URL: autolr-0.0.2.tar.gz
- Upload date:
- Size: 6.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.9.12
File hashes
Algorithm | Hash digest
---|---
SHA256 | b646ec4e9d2b4095dd8c718744c43f10ca815caa14742306b8f2f85a4afa1eba
MD5 | 0f583c93f4d2db94b63f75f738d48e4b
BLAKE2b-256 | f5852f2c1b6cd45afe5aea510f4bc846c1ea520eacd77030ff5ef1498d23b09c
File details
Details for the file autolr-0.0.2-py3-none-any.whl.
File metadata
- Download URL: autolr-0.0.2-py3-none-any.whl
- Upload date:
- Size: 6.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.9.12
File hashes
Algorithm | Hash digest
---|---
SHA256 | e4f90a33fd19d461d31c021c94e4506203d0936c7df8c662319fdeb2140f420e
MD5 | e00655697210a8929f8b5a8b3e7a2900
BLAKE2b-256 | 39479830c425f265b9bbb1a18968bcb3d84e027f508756e74882f62bc322b19a