Automated report generation for sklearn

Sklearn Report Generator

This Python library helps generate detailed reports for scikit-learn models. It can create reports in various formats such as HTML, PDF, and DOCX. The library supports model training, metric evaluation, and report generation for both classification and regression tasks.

Installation

To install the library, use pip:

  pip install sklearn-report-generator

Usage

After installation, you can use the SklearnReportGenerator to train a model, generate predictions, and create reports.

Example

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

from reportGeneration.reportGeneration import SklearnReportGenerator

if __name__ == "__main__":
    # Load the Iris dataset and hold out 30% of the samples for testing.
    data = load_iris()
    X = data.data
    y = data.target
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

    # Point config_file at your own config.yaml; output_format selects the report format.
    report_generator = SklearnReportGenerator(config_file='/your-file-path/config.yaml', output_format="PDF")
    # fit takes both the training split and the test split.
    report_generator.fit(X_train, y_train, X_test, y_test)
    # Generate predictions for the test split.
    report_generator.predict(X_test)

Configuration Example

The configuration file (config.yaml) defines the transformers, model, selection parameters (optional), and metrics to be used in the report generation.

transformers:
  - name: StandardScaler
    params: {}
  - name: OneHotEncoder
    params:
      handle_unknown: ignore
model:
  name: KNeighborsClassifier
  params:
    weights: 'distance'
    algorithm: 'ball_tree'
    leaf_size: 1
selectionParams:
  enable: True
  name: GridSearchCV
  params:
    cv: 5
    verbose: 3
    n_jobs: 20
  param_grid:
    KNeighborsClassifier__leaf_size: [1, 2, 20]
metrics:
  - name: accuracy_score
    params: {}
  - name: precision_score
    params:
      average: weighted
      zero_division: 0

Configuration Breakdown

  1. transformers: List of data preprocessing steps. Examples include StandardScaler and OneHotEncoder. Parameters can be customized for each transformer.
  2. model: Defines the model to be used (e.g., KNeighborsClassifier) and its hyperparameters.
  3. selectionParams: Optional hyperparameter search. This example enables GridSearchCV with cross-validation (cv), verbosity (verbose), and parallelization (n_jobs); param_grid lists the candidate values, keyed as ModelName__parameter.
  4. metrics: List of metrics used to evaluate the model's performance. For example, accuracy_score and precision_score.
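
The param_grid key KNeighborsClassifier__leaf_size follows scikit-learn's step__parameter naming, which suggests a Pipeline whose steps are named after their classes. The sketch below shows one way such a configuration could map onto scikit-learn objects. It is not the library's actual implementation: REGISTRY and build_search are hypothetical names, the configuration is written as a plain dict rather than loaded from YAML, and OneHotEncoder, verbose, and n_jobs are left out to keep it short.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical lookup table from config names to scikit-learn classes.
REGISTRY = {
    "StandardScaler": StandardScaler,
    "KNeighborsClassifier": KNeighborsClassifier,
}

# A trimmed copy of the example configuration, written as a plain dict.
config = {
    "transformers": [{"name": "StandardScaler", "params": {}}],
    "model": {
        "name": "KNeighborsClassifier",
        "params": {"weights": "distance", "algorithm": "ball_tree", "leaf_size": 1},
    },
    "selectionParams": {
        "params": {"cv": 5},
        "param_grid": {"KNeighborsClassifier__leaf_size": [1, 2, 20]},
    },
}


def build_search(cfg):
    """Build a GridSearchCV over a Pipeline whose steps are named after their classes."""
    steps = [(t["name"], REGISTRY[t["name"]](**t["params"])) for t in cfg["transformers"]]
    model_cfg = cfg["model"]
    steps.append((model_cfg["name"], REGISTRY[model_cfg["name"]](**model_cfg["params"])))
    selection = cfg["selectionParams"]
    return GridSearchCV(Pipeline(steps), selection["param_grid"], **selection["params"])


X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

search = build_search(config)
search.fit(X_train, y_train)
print(search.best_params_)
```

Naming each step after its class is what makes a key such as KNeighborsClassifier__leaf_size resolve to the model's leaf_size parameter during the search.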

Report Generation

Once the model is trained, a report will be generated in the specified format. The available formats are:

  • HTML: Generates an HTML report with metrics and graphs.
  • PDF: Generates a PDF report with metrics and graphs.
  • DOCX: Generates a DOCX report with metrics and graphs.

The report will be saved in a directory named reports, with a timestamp in the file name to ensure uniqueness.
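
A timestamped file name of this kind could be built as in the minimal sketch below. The report_ prefix, the timestamp layout, and the report_path helper are assumptions for illustration; the library's actual naming scheme may differ.

```python
from datetime import datetime
from pathlib import Path


def report_path(output_format, base_dir="reports", now=None):
    """Return a timestamped report path inside base_dir (hypothetical naming scheme)."""
    now = now or datetime.now()
    stamp = now.strftime("%Y%m%d_%H%M%S")
    return Path(base_dir) / f"report_{stamp}.{output_format.lower()}"


# A fixed datetime is passed here so the result is reproducible.
path = report_path("PDF", now=datetime(2025, 1, 1, 12, 0, 0))
print(path.name)  # report_20250101_120000.pdf
```

A timestamp with one-second resolution only keeps names distinct when reports are written at least a second apart, so a scheme like this one would need a counter or a finer timestamp for faster runs.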

Example Report Output

The report includes:

  • Training time
  • Model metrics (e.g., accuracy, precision)
  • Visualizations such as a confusion matrix (for classification tasks); this feature is still in development and may contain bugs
  • ROC curve (for binary classification tasks); this feature is still in development and may contain bugs
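
For reference, the metrics named in the example configuration and the confusion matrix can be computed directly with scikit-learn, as in the short sketch below. The labels are made up for illustration and are not output from the library; how the library itself computes or renders these values is not shown here.

```python
from sklearn.metrics import accuracy_score, confusion_matrix, precision_score

# Illustrative labels only; these are not results produced by the library.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 1, 2, 2, 2]

# Same metric parameters as in the example config.yaml.
accuracy = accuracy_score(y_true, y_pred)
precision = precision_score(y_true, y_pred, average="weighted", zero_division=0)

# Rows are true classes, columns are predicted classes.
matrix = confusion_matrix(y_true, y_pred)

print(f"accuracy:  {accuracy:.3f}")
print(f"precision: {precision:.3f}")
print(matrix)
```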

This version of the library is a test release and is still under development. Future versions are planned to add new functions, improve data processing, and expand reporting capabilities.
