A Python package to facilitate Scikit-learn decision tree export to Excel.

sklearn2excel

Bringing Scikit-learn decision trees to Excel

With this Python package, you can make a trained machine learning model accessible to others without deploying it as a service. More specifically, you can export a Scikit-learn decision tree or random forest model to an Excel workbook. All decision chains in the model are represented in a single table, and feature values can be entered to test the resulting average prediction.
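
To illustrate what "decision chains in a single table" can mean, here is a minimal sketch that flattens one fitted tree into rows, one per root-to-leaf path. It uses only Scikit-learn's public `tree_` arrays; the helper name `decision_chains` is hypothetical and this is not necessarily how sklearn2excel builds its table.

```python
from sklearn.datasets import load_wine
from sklearn.tree import DecisionTreeClassifier


def decision_chains(estimator, feature_names):
    """Hypothetical helper: return one (tests, leaf_value) pair per root-to-leaf path."""
    tree = estimator.tree_
    rows = []

    def walk(node, tests):
        left = tree.children_left[node]
        right = tree.children_right[node]
        if left == right:  # both children are -1 at a leaf
            # Leaf values are class counts or proportions,
            # depending on the scikit-learn version.
            rows.append((tests, tree.value[node][0].tolist()))
            return
        name = feature_names[tree.feature[node]]
        threshold = float(tree.threshold[node])
        walk(left, tests + [(name, "<=", threshold)])
        walk(right, tests + [(name, ">", threshold)])

    walk(0, [])
    return rows


wine = load_wine()
features = list(wine.feature_names[:4])
model = DecisionTreeClassifier(max_depth=3, random_state=0)
model.fit(wine.data[:, :4], wine.target)

chains = decision_chains(model, features)
for tests, value in chains:
    condition = " AND ".join(f"{n} {op} {t:.2f}" for n, op, t in tests)
    print(condition, "->", value)
```

Each printed row is a chain of threshold tests followed by the leaf's value, which is the kind of row a spreadsheet formula could evaluate against a set of input feature values.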

Project overview

Version: 0.1.1

  • package level
    • export_to_xlsx() (main access point)
    • export_to_textfile() (alternative use)
      • detects maximum tree depth and applies this parameter
  • helpers module
    • create_xlfile (project internal)
      • writes a DecisionTreeTable object to an Excel sheet
      • writes features and an initial value of 1 to front sheet
      • writes decision trees to 2nd sheet
  • core module
    • class DecisionTreeTable (project internal)
      • a class that can be instantiated with a parsed text file
      • transforms and represents decision trees in a data structure
      • exposes properties to access info about the structure
      • exposes methods to get tests and results as indexed rows
      • handles classifier- and regressor-type decision trees
  • TODO:
    • thorough testing (75% done)
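
The note that export_to_textfile() detects the maximum tree depth matters because Scikit-learn's own `export_text` truncates branches deeper than its default `max_depth` of 10. A minimal sketch of the idea, assuming the package relies on `export_text` in roughly this way (the actual internals may differ):

```python
from sklearn.datasets import load_wine
from sklearn.tree import DecisionTreeClassifier, export_text

wine = load_wine()
features = list(wine.feature_names[:4])
model = DecisionTreeClassifier(min_samples_leaf=2, random_state=0)
model.fit(wine.data[:, :4], wine.target)

# Ask the fitted tree for its real depth and pass it on,
# so that no branch is cut off in the text report.
depth = model.get_depth()
text = export_text(model, feature_names=features, max_depth=depth)
print(text)
```

For an ensemble, the same call would be repeated for each tree in `estimators_`, each with its own depth.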

Installation

pip install sklearn2excel

Installing the package also installs scikit-learn and XlsxWriter as dependencies.

Usage example

from pathlib import Path
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import LabelEncoder
import sklearn2excel as s2e


# fetch the Scikit-learn wine example data as a
# sklearn.utils.Bunch object and prepare an example
# model with sklearn.ensemble.RandomForestClassifier;
# RandomForestRegressor or any classifier/regressor
# subtype of BaseDecisionTree could be used instead
bunch = s2e.get_data_target_and_features()
wine_data = bunch.data
wine_target = bunch.target
wine_features = bunch.feature_names[:4]
X = wine_data[wine_features]
y = LabelEncoder().fit_transform(wine_target)
clf_model = RandomForestClassifier(
  n_estimators=10,
  min_samples_leaf=2
).fit(X, y)

path_xlsx = Path.cwd() / "excel_output.xlsx"
path_txt = Path.cwd() / "text_output.txt"

# export the model as a text file using the
# sklearn export function; the first parameter is
# a single decision tree or an ensemble of trees
s2e.export_to_textfile(
  clf_model.estimators_,  # ensemble of decision trees
  path_txt,
  wine_features
)

# export the model as an Excel file;
# features are written to the front sheet with initial value 1.0,
# decision trees are written to the 2nd sheet
s2e.export_to_xlsx(
  clf_model.estimators_,
  wine_features,
  path_xlsx
)
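
To check the "average prediction" for a given set of feature values against the original model, the per-tree predictions can be averaged in Python. The sketch below uses only Scikit-learn and NumPy; it loads the wine data with `load_wine` rather than the package helper, and it assumes that averaging per-tree class probabilities is what the workbook approximates, which the package does not document.

```python
import numpy as np
from sklearn.datasets import load_wine
from sklearn.ensemble import RandomForestClassifier

wine = load_wine()
X, y = wine.data[:, :4], wine.target
forest = RandomForestClassifier(
    n_estimators=10,
    min_samples_leaf=2,
    random_state=0,
).fit(X, y)

# One sample with every feature set to 1.0, mirroring the
# initial values written to the front sheet.
sample = np.ones((1, X.shape[1]))

# Class probabilities from each tree, then their mean across trees.
per_tree = np.array(
    [tree.predict_proba(sample)[0] for tree in forest.estimators_]
)
average = per_tree.mean(axis=0)
print(average)
```

For a random forest classifier this mean matches `forest.predict_proba(sample)`, so it gives a reference value to compare against what the spreadsheet shows for the same inputs.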

Development setup

  • Flit ~3.4

Release History

  • 0.1.1
    • FIX: XlsxWriter dependency corrected
  • 0.1.0
    • First proper release
    • NEW: direct function export_to_xlsx()
    • CHANGE: functions and class available at package-level
  • 0.0.1
    • Work in progress

Meta

Torbjørn Wikestad – @TWikestad – torbjorn.wikestad@gmail.com

Distributed under the MIT license. See LICENSE for more information.

https://github.com/tobisan5/github-link

Contributing

  1. Fork it (https://github.com/tobisan5/sklearn2excel/fork)
  2. Create your feature branch (git checkout -b feature/fooBar)
  3. Commit your changes (git commit -am 'Add some fooBar')
  4. Push to the branch (git push origin feature/fooBar)
  5. Create a new Pull Request
