Toolkit to forge scikit-learn compatible estimators.
Scikit-learn Smithy
Scikit-learn Smithy is a tool that helps you forge scikit-learn compatible estimators with ease.
WebUI | Documentation | Repository | Issue Tracker
How can you use it?
✅ Directly from the browser via a Web UI.
- Available at sklearn-smithy.streamlit.app
- It requires no installation.
- Powered by streamlit
✅ As a CLI (command line interface) in the terminal.
- Available via the smith forge command.
- It requires installation: python -m pip install sklearn-smithy
- Powered by typer.
✅ As a TUI (terminal user interface) in the terminal.
- Available via the smith forge-tui command.
- It requires installing extra dependencies: python -m pip install "sklearn-smithy[textual]"
- Powered by textual.
All these tools ask a series of questions about the estimator you want to create, and then generate the boilerplate code for you.
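To make the "answers in, boilerplate out" idea concrete, here is a toy sketch using only the standard library. It is not how sklearn-smithy is implemented: the template text, the render_estimator helper, and the three "questions" (name, mixin, parameters) are all hypothetical and chosen for illustration only.

```python
from string import Template

# Hypothetical template; the real tool's templates and questions may differ.
ESTIMATOR_TEMPLATE = Template('''\
from sklearn.base import BaseEstimator, ${mixin}


class ${name}(${mixin}, BaseEstimator):
    def __init__(${init_signature}):
${init_body}

    def fit(self, X, y=None):
        ...  # core logic goes here
        return self
''')


def render_estimator(name, mixin, params):
    """Fill the template from the answers to three illustrative questions."""
    # Build "self, alpha=None, ..." so the signature is valid even with no params.
    init_signature = ", ".join(["self", *[f"{p}=None" for p in params]])
    # Each hyperparameter is stored as-is; fall back to `pass` when there are none.
    init_body = "\n".join(f"        self.{p} = {p}" for p in params) or "        pass"
    return ESTIMATOR_TEMPLATE.substitute(
        name=name,
        mixin=mixin,
        init_signature=init_signature,
        init_body=init_body,
    )


code = render_estimator("MyRegressor", "RegressorMixin", ["alpha", "max_iter"])
print(code)
```

The generated text is plain Python source, so it can be written to a file and then filled in with the actual fitting logic.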
Why ❓
Writing scikit-learn compatible estimators might be harder than expected.
While everyone knows about fit and predict, there are other behaviours, methods and attributes that scikit-learn might be expecting from your estimator depending on:
- The type of estimator you're writing.
- The signature of the estimator.
- The signature of the .fit(...) method.
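As an illustration of the kind of details involved, here is a minimal hand-written sketch of a regressor. MeanRegressor and its shrinkage parameter are hypothetical names invented for this example; this is neither the code Scikit-learn Smithy generates nor a guarantee of passing every scikit-learn check.

```python
import numpy as np
from sklearn.base import BaseEstimator, RegressorMixin
from sklearn.utils.validation import check_array, check_is_fitted, check_X_y


class MeanRegressor(RegressorMixin, BaseEstimator):
    """Hypothetical toy regressor: predicts a shrunk mean of the training target."""

    def __init__(self, shrinkage=1.0):
        # __init__ only stores hyperparameters, without validation or logic,
        # so that get_params / set_params / clone behave as expected.
        self.shrinkage = shrinkage

    def fit(self, X, y):
        # Validate the inputs and convert them to numpy arrays.
        X, y = check_X_y(X, y)
        # Attributes learned during fit end with a trailing underscore.
        self.n_features_in_ = X.shape[1]
        self.mean_ = self.shrinkage * float(np.mean(y))
        # fit returns self so that calls can be chained.
        return self

    def predict(self, X):
        # Fail early if fit has not been called yet.
        check_is_fitted(self)
        X = check_array(X)
        if X.shape[1] != self.n_features_in_:
            raise ValueError(
                f"Expected {self.n_features_in_} features, got {X.shape[1]}."
            )
        return np.full(X.shape[0], self.mean_)


reg = MeanRegressor(shrinkage=0.5).fit([[0.0], [1.0], [2.0]], [2.0, 4.0, 6.0])
print(reg.predict([[5.0]]))
```

Even in this tiny case, the mixin order, the untouched constructor arguments, the trailing-underscore attributes and the input validation are all conventions that are easy to forget when writing an estimator by hand.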
Scikit-learn Smithy to the rescue: this tool aims to help you craft your own estimator by asking a few questions about it, and then generating the boilerplate code.
In this way you will be able to fully focus on the core implementation logic, and not on nitty-gritty details of the scikit-learn API.
Sanity check
Once the core logic is implemented, the estimator should be ready to be tested with scikit-learn's parametrize_with_checks, a pytest-compatible decorator:
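from sklearn.utils.estimator_checks import parametrize_with_checks

# parametrize_with_checks expects estimator instances, not classes.
@parametrize_with_checks([
    YourAwesomeRegressor(),
    MoreAwesomeClassifier(),
    EvenMoreAwesomeTransformer(),
])
def test_sklearn_compatible_estimator(estimator, check):
    # Each (estimator, check) pair is collected as a separate pytest test case.
    check(estimator)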
and it should be compatible with scikit-learn Pipeline, GridSearchCV, etc.
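A rough sketch of what that compatibility looks like in practice follows. ConstantScaler, its factor parameter and the synthetic data are hypothetical and only serve the example; the point is that a custom step can be tuned through GridSearchCV like any built-in one, using the usual step__parameter naming.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.utils.validation import check_array, check_is_fitted


class ConstantScaler(TransformerMixin, BaseEstimator):
    """Hypothetical toy transformer that multiplies every feature by `factor`."""

    def __init__(self, factor=1.0):
        self.factor = factor

    def fit(self, X, y=None):
        X = check_array(X)
        self.n_features_in_ = X.shape[1]
        return self

    def transform(self, X):
        check_is_fitted(self)
        return check_array(X) * self.factor


# Small synthetic regression problem, used only for illustration.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=40)

pipe = Pipeline([("scaler", ConstantScaler()), ("model", Ridge())])
# GridSearchCV clones the pipeline and sets `factor` via set_params,
# which works because __init__ stores the argument unchanged.
search = GridSearchCV(pipe, param_grid={"scaler__factor": [0.1, 1.0, 10.0]}, cv=3)
search.fit(X, y)
print(search.best_params_)
```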
Official guide
Scikit-learn documentation on how to develop estimators.
Supported estimators
The following types of scikit-learn estimator are supported:
- ✅ Classifier
- ✅ Regressor
- ✅ Outlier Detector
- ✅ Clusterer
- ✅ Transformer
- ✅ Feature Selector
- 🚧 Meta Estimator
Installation
sklearn-smithy is available on PyPI, so you can install it directly from there:
python -m pip install sklearn-smithy
Remark: The minimum Python version required is 3.10.
This will make the smith command available in your terminal, and you should be able to run the following:
smith version
sklearn-smithy=...
Extra dependencies
To run the TUI, you need to install the textual dependency as well:
python -m pip install "sklearn-smithy[textual]"
User guide 📚
Please refer to the dedicated user guide documentation section.
Origin story
The idea for this tool originated from scikit-lego #660, which I cannot better explain than quoting the PR description itself:
So the story goes as the following:
- The CI/CD fails for scikit-learn==1.5rc1 because of a change in the check_estimator internals
- In the scikit-learn issue I got a better picture of how to run test for compatible components
- In particular, rolling your own estimator suggests to use parametrize_with_checks, and of course I thought "that is a great idea to avoid dealing manually with each test"
- Say no more, I enter a rabbit hole to refactor all our tests - which would be fine
- Except that these tests failures helped me figure out a few missing parts in the codebase