A small package for all useful ML things
Project description
Kowalsky, analysis!
A simple package for handy ML things and more.
What's new? [v0.0.38]
- add `feature` package with two types of analysis + support for the other functions:
  - Recursive Feature Elimination
  - Sequential Feature Selection
- improve `optimize`:
  - `EarlyStopping` mechanism
  - optimization graph
  - multitasking with `n_jobs=-1`
- add `logs` package
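The `EarlyStopping` mechanism is only named above, not described. As a rough illustration of the general idea, here is a minimal patience-based sketch in plain Python. The class name, the `patience` and `direction` parameters, and the `run_trials` helper are assumptions made for this example and are not taken from the kowalsky source.

```python
class EarlyStopping:
    """Signal a stop after `patience` consecutive trials without improvement.

    Hypothetical helper for illustration; not the kowalsky implementation.
    """

    def __init__(self, patience, direction="maximize"):
        if direction not in ("maximize", "minimize"):
            raise ValueError("direction must be 'maximize' or 'minimize'")
        self.patience = patience
        self.direction = direction
        self.best = None   # best score seen so far
        self.stale = 0     # consecutive trials without improvement

    def update(self, score):
        """Record one trial score; return True when the search should stop."""
        improved = (
            self.best is None
            or (self.direction == "maximize" and score > self.best)
            or (self.direction == "minimize" and score < self.best)
        )
        if improved:
            self.best = score
            self.stale = 0
        else:
            self.stale += 1
        return self.stale >= self.patience


def run_trials(scores, patience):
    """Feed scores in order; return (trials run, best score)."""
    stopper = EarlyStopping(patience)
    ran = 0
    for score in scores:
        ran += 1
        if stopper.update(score):
            break
    return ran, stopper.best


# The search stops at the fifth trial: three trials in a row fail to beat 0.72.
print(run_trials([0.70, 0.72, 0.71, 0.72, 0.69, 0.80], patience=3))
```

When the search is driven by Optuna, a check like this is usually placed in a study callback that calls `study.stop()` once the patience runs out.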
What's inside?
- `analysis` - method for evaluating a specified model on a given dataframe. With `export_test_set=True` it exports predictions ready for submission.
- `df` - module for working with dataframes:
  - `corr` - sort all correlated features.
  - `handle_outliers` - fill or drop columns with outliers.
  - `log_transform` - transform columns with the log function.
  - `group_by_mean` - make additional columns with aggregated mean.
  - `group_by_max` - make additional columns with aggregated max.
  - `group_by_min` - make additional columns with aggregated min.
  - `apply_with_progress` - apply a heavy function to each row of the dataset.
  - `scale` - scale columns with Standard or MinMax scalers.
- `kaggle`:
  - `submit` - make a submission file for Kaggle based on a sample.
- `logs`:
  - `profile_memory` - logs all heavy variables.
  - `make_pretty_pyplot` - makes pyplot look better :)
- `optuna` - handy methods for working with optuna:
  - `optimize` - optimize a model with a given dataframe.
  - `optimize_super_learner` - optimize a super learner configuration with a given set of models and a set of heads (`meta_model`).
- `colab`:
  - `csv` - read a csv file located on Google Drive with the specified id.
  - `path` - get the path to a Google Drive file.
- `feature`:
  - `rfe_analysis` - Recursive Feature Elimination analysis.
  - `sfs_analysis` - Sequential Feature Selection analysis.
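The two techniques behind the `feature` package can be tried directly with scikit-learn. The sketch below is not the `rfe_analysis` / `sfs_analysis` API, whose signatures are not documented here. It only shows the underlying selectors on synthetic data, and it assumes scikit-learn 0.24 or newer, which is where `SequentialFeatureSelector` was introduced.

```python
from sklearn.datasets import make_regression
from sklearn.feature_selection import RFE, SequentialFeatureSelector
from sklearn.linear_model import LinearRegression

# Synthetic data: 8 features, only 3 of which carry signal.
X, y = make_regression(n_samples=200, n_features=8, n_informative=3,
                       noise=0.1, random_state=0)

# Recursive Feature Elimination: fit, drop the weakest feature, repeat.
rfe = RFE(LinearRegression(), n_features_to_select=3).fit(X, y)

# Sequential Feature Selection: greedily add the feature that most
# improves the cross-validated score.
sfs = SequentialFeatureSelector(LinearRegression(), n_features_to_select=3,
                                direction="forward", cv=3).fit(X, y)

rfe_selected = sorted(rfe.get_support(indices=True).tolist())
sfs_selected = sorted(sfs.get_support(indices=True).tolist())

print("RFE kept columns:", rfe_selected)
print("SFS kept columns:", sfs_selected)
```

RFE ranks features by the fitted model's coefficients or importances, while SFS relies on cross-validated scores, so SFS tends to be slower but works with any estimator.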
What's next?
- Use `optuna` for searching the best feature amount
- Add file logger to track the progress in `JupyterLab`
Example:
!pip install kowalsky --upgrade

from kowalsky.optuna import optimize

# Model key as spelled in the list of available models ('rfR': RandomForestRegressor)
optimize('rfR',
         path='../input/project/feed.csv',
         scorer='acc',
         y_label='y_label',
         trials=3000)
Available models:
Gradient Boosts
'xgbR': XGBRegressor
'xgbC': XGBClassifier
'lgbR': LGBMRegressor
'lgbC': LGBMClassifier
Trees
'rfR': RandomForestRegressor
'rfC': RandomForestClassifier
'dtR': DecisionTreeRegressor
'dtC': DecisionTreeClassifier
'etR': ExtraTreeRegressor
'etC': ExtraTreeClassifier
Ensemble
'baggC': BaggingClassifier
'baggR': BaggingRegressor
'adaR': AdaBoostRegressor
'adaC': AdaBoostClassifier
'cbR': CatBoostRegressor
'cbC': CatBoostClassifier
KNeighbors
'knC': KNeighborsClassifier
'knR': KNeighborsRegressor
SVM
'svR': SVR
'svC': SVC
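Every key in the list above ends in `R` for a regressor or `C` for a classifier. The sketch below copies the list into a plain dictionary and derives the task type from that suffix. The `AVAILABLE_MODELS` name and the `task_of` helper are hypothetical and exist only for this illustration; how kowalsky resolves keys internally is not shown in this description.

```python
# Keys and class names copied from the list above; the dict itself is illustrative.
AVAILABLE_MODELS = {
    "xgbR": "XGBRegressor", "xgbC": "XGBClassifier",
    "lgbR": "LGBMRegressor", "lgbC": "LGBMClassifier",
    "rfR": "RandomForestRegressor", "rfC": "RandomForestClassifier",
    "dtR": "DecisionTreeRegressor", "dtC": "DecisionTreeClassifier",
    "etR": "ExtraTreeRegressor", "etC": "ExtraTreeClassifier",
    "baggC": "BaggingClassifier", "baggR": "BaggingRegressor",
    "adaR": "AdaBoostRegressor", "adaC": "AdaBoostClassifier",
    "cbR": "CatBoostRegressor", "cbC": "CatBoostClassifier",
    "knC": "KNeighborsClassifier", "knR": "KNeighborsRegressor",
    "svR": "SVR", "svC": "SVC",
}


def task_of(key):
    """Return 'regression' or 'classification' based on the key's last letter."""
    if key not in AVAILABLE_MODELS:
        raise KeyError(f"unknown model key: {key!r}")
    return "regression" if key.endswith("R") else "classification"


print(task_of("rfR"), AVAILABLE_MODELS["rfR"])
print(task_of("svC"), AVAILABLE_MODELS["svC"])
```

A lookup like this makes it easy to catch a mismatch early, for example pairing a regressor key with a classification scorer.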
Source Distribution
kowalsky-0.0.38.tar.gz (9.5 kB)
File details
Details for the file kowalsky-0.0.38.tar.gz.
File metadata
- Download URL: kowalsky-0.0.38.tar.gz
- Upload date:
- Size: 9.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.3.0 pkginfo/1.7.0 requests/2.25.1 setuptools/52.0.0.post20210125 requests-toolbelt/0.9.1 tqdm/4.59.0 CPython/3.8.8
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 42512cc8f7d2c52ceddbd653e0e2183570f034b73a99c5310f4d1f50e0c59cc4 |
| MD5 | f71fd633bd8af9a0b794eb9a32da7be8 |
| BLAKE2b-256 | 27f791cdbae8d230a0b69e3925b9f0a2eb28159663914e32a01e4dc0f93c3ea0 |