Automatic report generation for scikit-learn
Sklearn Report Generator
This Python library helps generate detailed reports for scikit-learn models. It can create reports in various formats such as HTML, PDF, and DOCX. The library supports model training, metric evaluation, and report generation for both classification and regression tasks.
Installation
To install the library, use pip:
pip install sklearn-report-generator
Usage
After installation, you can use the SklearnReportGenerator to train a model, generate predictions, and create reports.
Example
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

from reportGeneration.reportGeneration import SklearnReportGenerator

if __name__ == "__main__":
    # Load the iris dataset and hold out 30% of it for testing.
    data = load_iris()
    X = data.data
    y = data.target
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

    # Point config_file at your own config.yaml; output_format selects the report type.
    report_generator = SklearnReportGenerator(config_file='/your-file-path/config.yaml', output_format="PDF")
    # fit() receives both the training split and the test split.
    report_generator.fit(X_train, y_train, X_test, y_test)
    report_generator.predict(X_test)
Configuration Example
The configuration file (config.yaml) defines the transformers, model, selection parameters (optional), and metrics to be used in the report generation.
transformers:
  - name: StandardScaler
    params: {}
  - name: OneHotEncoder
    params:
      handle_unknown: ignore
model:
  name: KNeighborsClassifier
  params:
    weights: 'distance'
    algorithm: 'ball_tree'
    leaf_size: 1
selectionParams:
  enable: True
  name: GridSearchCV
  params:
    cv: 5
    verbose: 3
    n_jobs: 20
    param_grid:
      KNeighborsClassifier__leaf_size: [1, 2, 20]
metrics:
  - name: accuracy_score
    params: {}
  - name: precision_score
    params:
      average: weighted
      zero_division: 0
Configuration Breakdown
- transformers: List of data preprocessing steps. Examples include StandardScaler and OneHotEncoder. Parameters can be customized for each transformer.
- model: Defines the model to be used (e.g., KNeighborsClassifier) and its hyperparameters.
- selectionParams: Optional grid search (or other selection models) for hyperparameter tuning. GridSearchCV is enabled in this example with cross-validation (cv), verbosity, and parallelization (n_jobs).
- metrics: List of metrics used to evaluate the model's performance. For example, accuracy_score and precision_score.
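To make the breakdown above more concrete, here is a minimal sketch of how such a configuration could map onto scikit-learn objects. It is not the library's actual implementation: the `REGISTRY` dictionary and the `build_estimator` helper are hypothetical names introduced for illustration, and the pipeline step names are assumed to equal the class names because the example `param_grid` key is `KNeighborsClassifier__leaf_size`. The sketch leaves out `OneHotEncoder` (the iris features are numeric) and sets `n_jobs` to 1 to keep the run small.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical registry that resolves names from the config to classes.
REGISTRY = {
    "StandardScaler": StandardScaler,
    "KNeighborsClassifier": KNeighborsClassifier,
}

# The same structure as config.yaml, written as a plain dict.
config = {
    "transformers": [{"name": "StandardScaler", "params": {}}],
    "model": {
        "name": "KNeighborsClassifier",
        "params": {"weights": "distance", "algorithm": "ball_tree", "leaf_size": 1},
    },
    "selectionParams": {
        "enable": True,
        "name": "GridSearchCV",
        "params": {
            "cv": 5,
            "n_jobs": 1,
            "param_grid": {"KNeighborsClassifier__leaf_size": [1, 2, 20]},
        },
    },
}


def build_estimator(cfg):
    """Assemble a Pipeline and optionally wrap it in GridSearchCV."""
    # Each step is named after its class so that param_grid keys such as
    # "KNeighborsClassifier__leaf_size" address the right step.
    steps = [
        (t["name"], REGISTRY[t["name"]](**t["params"]))
        for t in cfg["transformers"]
    ]
    model_cfg = cfg["model"]
    steps.append((model_cfg["name"], REGISTRY[model_cfg["name"]](**model_cfg["params"])))
    pipeline = Pipeline(steps)

    selection = cfg.get("selectionParams", {})
    if selection.get("enable"):
        return GridSearchCV(pipeline, **selection["params"])
    return pipeline


X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

estimator = build_estimator(config)
estimator.fit(X_train, y_train)
test_accuracy = estimator.score(X_test, y_test)
print(estimator.best_params_, test_accuracy)
```

Naming each step after its class is one possible convention; any step name works as long as the `param_grid` keys use the same prefix before the double underscore.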
Report Generation
Once the model is trained, a report will be generated in the specified format. The available formats are:
- HTML: Generates an HTML report with metrics and graphs.
- PDF: Generates a PDF report with metrics and graphs.
- DOCX: Generates a DOCX report with metrics and graphs.
The report will be saved in a directory named reports, with a timestamp in the file name to ensure uniqueness.
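As an illustration of the timestamped naming described above, the following sketch builds a unique path inside a `reports` directory using only the standard library. The `build_report_path` helper and the `report_<timestamp>.<ext>` pattern are assumptions made for this example; the exact file name the library produces may differ.

```python
from datetime import datetime
from pathlib import Path


def build_report_path(output_format, base_dir="reports", now=None):
    """Return a path such as reports/report_20250102_030405.pdf.

    Hypothetical helper: the naming pattern is an assumption, not the
    library's documented behavior.
    """
    now = now or datetime.now()
    stamp = now.strftime("%Y%m%d_%H%M%S")
    return Path(base_dir) / f"report_{stamp}.{output_format.lower()}"


path = build_report_path("PDF", now=datetime(2025, 1, 2, 3, 4, 5))
print(path.name)
```

A second-level timestamp keeps names unique as long as two reports are not written within the same second; a finer resolution or a counter would be needed otherwise.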
Example Report Output
The report includes:
- Training time
- Model metrics (e.g., accuracy, precision)
- Confusion matrix visualization (for classification tasks); still in development and may contain bugs
- ROC curve (for binary classification tasks); still in development and may contain bugs
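The quantities listed above can all be computed with plain scikit-learn calls. The sketch below shows one way to obtain them for a binary classification task; it does not use the library itself, and the dataset, the model, and the lookup of metric functions by name via `getattr` are assumptions chosen for illustration. The `metric_cfg` list mirrors the shape of the `metrics` section in config.yaml.

```python
import time

import sklearn.metrics as skm
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Metric entries in the same shape as the `metrics` section of config.yaml.
metric_cfg = [
    {"name": "accuracy_score", "params": {}},
    {"name": "precision_score", "params": {"average": "weighted", "zero_division": 0}},
]

# A binary dataset, so that a ROC curve is meaningful.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)

model = make_pipeline(StandardScaler(), KNeighborsClassifier())

# Training time: wall-clock duration of fit().
start = time.perf_counter()
model.fit(X_train, y_train)
training_time = time.perf_counter() - start

y_pred = model.predict(X_test)
y_score = model.predict_proba(X_test)[:, 1]  # probability of the positive class

# Model metrics: resolve each function by name and pass its params.
scores = {
    m["name"]: getattr(skm, m["name"])(y_test, y_pred, **m["params"])
    for m in metric_cfg
}

# Data behind the two visualizations.
cm = skm.confusion_matrix(y_test, y_pred)
fpr, tpr, _ = skm.roc_curve(y_test, y_score)
roc_auc = skm.auc(fpr, tpr)

print(f"training time: {training_time:.4f} s")
print(scores)
print(cm)
print(f"ROC AUC: {roc_auc:.3f}")
```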
This version of the library is still in testing and will continue to be developed. Future versions are planned to add new functions, improve data processing, and expand reporting capabilities.
File details
Details for the file sklearn_report_generator-0.2.0.tar.gz.
File metadata
- Download URL: sklearn_report_generator-0.2.0.tar.gz
- Upload date:
- Size: 204.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.8
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 225abdda588303d8f6d4d44e7280748089e15cd7a24119f649db23653c36e8cf |
| MD5 | 5dff94ebfad68ff95fcd97e958400d41 |
| BLAKE2b-256 | 602555713359fee4fb003e0fb8b64375b654f9b2b0a4525eb42d9af0301ce4f9 |
File details
Details for the file sklearn_report_generator-0.2.0-py3-none-any.whl.
File metadata
- Download URL: sklearn_report_generator-0.2.0-py3-none-any.whl
- Upload date:
- Size: 204.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.8
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 41c4c117b5a185ee7f949a8cde4d72ca470fe96e8fcd8b1c40d6c8d3db4e7bc6 |
| MD5 | a444473a4260ff3dbd8c9840dc442aa1 |
| BLAKE2b-256 | ec070b59dc3ab36cba4d930765f54fd0438443a309913516040f0b3819df3b40 |