scikit-ext

Various scikit-learn extensions

Development Status
- 4 - Beta
Intended Audience
- Developers
License
- OSI Approved :: MIT License
Programming Language
- Python
Topic
- Scientific/Engineering :: Artificial Intelligence

Project description

About

The scikit_ext package contains various scikit-learn extensions, built entirely on top of sklearn base classes. The package is separated into two modules: estimators and scorers. Full documentation can be found here.

Installation

The package is available from the Python Package Index (PyPI). To install:

pip install scikit-ext

Estimators

MultiGridSearchCV: Extension to native sklearn GridSearchCV for multiple estimators and param_grids. Accepts a list of estimators and a list of param_grids, fits one GridSearchCV model per estimator/param_grid pair, and selects the best fitted GridSearchCV model. Inherits from sklearn’s BaseSearchCV class, so its attributes and methods are similar to those of GridSearchCV.
PrunedPipeline: Extension to native sklearn Pipeline intended for text learning pipelines with a vectorization step and a feature selection step. Instead of remembering all vectorizer vocabulary elements and selecting the appropriate features at prediction time, the extension prunes the vocabulary after fitting so that it only includes elements that will survive the feature selection filter applied later in the pipeline. This reduces memory usage and improves prediction latency. Predictions are identical to those made with a trained Pipeline model. Inherits from sklearn’s Pipeline class, so its attributes and methods are similar to those of Pipeline.
ZoomGridSearchCV: Extension to native sklearn GridSearchCV. Fits multiple GridSearchCV models, updating the param_grid after each iteration. The update looks at the successful parameter values for each grid key and creates a new list of values that increases the resolution of the search around the best performing value of the previous fit. This allows the standard grid search process to start with a small number of widely spaced values for each parameter, and to zoom in as the better performing corner of the hyperparameter search space becomes clear.
IterRandomEstimator: Meta-estimator intended primarily for unsupervised estimators whose fitted model can depend heavily on an arbitrary random initialization state. It is best used for problems where a fit_predict method is intended, so that the only data used for prediction is the same data on which the model was fitted.
OptimizedEnsemble: An optimized ensemble class. Finds the optimal n_estimators parameter for the given ensemble estimator, according to the specified input parameters.
OneVsRestAdjClassifier: One-Vs-Rest multiclass strategy. The adjusted version is a custom extension which overrides the inherited predict_proba method with a more flexible method allowing custom normalization of the predicted probabilities. Any norm argument that can be passed directly to sklearn.preprocessing.normalize is allowed. Additionally, norm=None skips the normalization step altogether. To mimic the inherited OneVsRestClassifier behavior, set norm='l2'. All other methods are inherited from OneVsRestClassifier.
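
As an illustration of the zooming idea described for ZoomGridSearchCV, the following minimal sketch refines one parameter's value list around the best value from a previous fit. It is not the package's implementation: the helper name zoom_values, the n_points argument, and the even spacing between the best value's neighbours are assumptions made for this example.

```python
def zoom_values(values, best, n_points=5):
    """Return a finer, evenly spaced list of values around `best`.

    The new list spans from the neighbour just below `best` to the
    neighbour just above it. When `best` sits on the edge of the grid,
    the span stops at `best` on that side. `n_points` must be >= 2.
    """
    ordered = sorted(values)
    i = ordered.index(best)
    low = ordered[max(i - 1, 0)]
    high = ordered[min(i + 1, len(ordered) - 1)]
    if low == high:
        # Single-value grid: nothing to zoom into.
        return [best]
    step = (high - low) / (n_points - 1)
    return [low + k * step for k in range(n_points)]


coarse = [0.0, 10.0, 20.0, 30.0]
fine = zoom_values(coarse, best=10.0)
print(fine)  # [0.0, 5.0, 10.0, 15.0, 20.0]
```

In a full search loop, each refined list would replace the corresponding param_grid entry before the next GridSearchCV fit.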

Scorers

TimeScorer: Score using estimated prediction latency of estimator.
MemoryScorer: Score using estimated memory of pickled estimator object.
CombinedScorer: Score combining multiple scorers by averaging their scores.
cluster_distribution_score: Scoring function which scores the resulting cluster distribution across classes. A more even distribution yields a higher score.
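
Assuming scorers follow the usual scikit-learn convention of a callable that takes (estimator, X, y) and returns a number where higher is better, a latency-based scorer and an averaging combiner could look roughly like the sketch below. The names latency_scorer, combined_scorer, fixed_scorer, and ConstantModel are hypothetical and are not the scikit_ext API. Negating the latency so that faster models score higher is also an assumption.

```python
import time


def latency_scorer(estimator, X, y=None):
    """Return the negative average prediction time per sample, in seconds."""
    start = time.perf_counter()
    estimator.predict(X)
    elapsed = time.perf_counter() - start
    return -elapsed / len(X)


def combined_scorer(scorers):
    """Build a scorer that averages the scores returned by `scorers`."""
    def score(estimator, X, y=None):
        return sum(s(estimator, X, y) for s in scorers) / len(scorers)
    return score


class ConstantModel:
    """Stand-in estimator (hypothetical) used only to exercise the scorers."""

    def predict(self, X):
        return [0 for _ in X]


def fixed_scorer(value):
    """Hypothetical scorer that ignores its inputs and returns `value`."""
    return lambda estimator, X, y=None: value


model = ConstantModel()
X = [[0.0], [1.0], [2.0]]
average = combined_scorer([fixed_scorer(1.0), fixed_scorer(3.0)])
print(average(model, X))  # 2.0
print(latency_scorer(model, X) <= 0)  # True
```

Averaging only makes sense when the combined scores are on comparable scales, so in practice each scorer would likely need rescaling before being combined.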

Authors

Evan Harris

License

This project is licensed under the MIT License; see the LICENSE file for details.

Release history

- 0.1.16: Dec 4, 2019 (this version)
- 0.1.15: Aug 30, 2019
- 0.1.14: Aug 30, 2019
- 0.1.13: Aug 30, 2019
- 0.1.12: Aug 14, 2019
- 0.1.11: Aug 14, 2019
- 0.1.9: Nov 1, 2017
- 0.1.8: Oct 31, 2017
- 0.1.6: Oct 25, 2017
- 0.1.5: Oct 25, 2017
- 0.1.4: Oct 7, 2017
- 0.1.3: Oct 7, 2017
- 0.1.2: Oct 7, 2017
- 0.1.1: Oct 7, 2017

Download files

Source Distribution

scikit-ext-0.1.16.tar.gz (14.5 kB)

Uploaded Dec 4, 2019 Source

File details

Details for the file scikit-ext-0.1.16.tar.gz.

File metadata

Download URL: scikit-ext-0.1.16.tar.gz
Upload date: Dec 4, 2019
Size: 14.5 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.6.0.post20191101 requests-toolbelt/0.9.1 tqdm/4.37.0 CPython/3.7.3

File hashes

Hashes for scikit-ext-0.1.16.tar.gz
Algorithm	Hash digest
SHA256	`01b75c82bc6a59fcb78cb3bfedbc76d021bf4f1427ecdda091766a3e6101a383`
MD5	`6407c5c040859e43f934c181e37fd1e0`
BLAKE2b-256	`a55580ff2a062a2fae2df5975dd2bbe1bfc71cc997dc210043c30ceb54469525`
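
A downloaded archive can be checked against the SHA256 digest listed above with Python's standard hashlib module. This is a minimal sketch: the function names and the local file path passed to them are assumptions for illustration.

```python
import hashlib

# SHA256 digest published for scikit-ext-0.1.16.tar.gz in the table above.
EXPECTED_SHA256 = "01b75c82bc6a59fcb78cb3bfedbc76d021bf4f1427ecdda091766a3e6101a383"


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected=EXPECTED_SHA256):
    """Return True when the file's SHA256 digest equals `expected`."""
    return sha256_of(path) == expected.lower()


# Example call, assuming the archive was saved in the current directory:
# matches_published_hash("scikit-ext-0.1.16.tar.gz")
```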
