A small package for all useful ML things
Kowalsky, analysis!
A simple package for handy ML things and more.
What's new? [v0.0.20]
- add `apply_with_progress` for ability to track the progress of dataframe transformation
- improve `optimize`:
  - EarlyStopping mechanism
  - optimization graph
  - multitasks with `n_jobs=-1`
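The release notes name an EarlyStopping mechanism for `optimize` but do not describe it. As a rough illustration only, the sketch below shows one common way such a mechanism can work: stop once the best score has not improved for a fixed number of trials. The class name, the `patience` parameter, and the `run_trials` helper are all assumptions made for this example, not kowalsky's actual API.

```python
class EarlyStopping:
    """Hypothetical helper: signal a stop after `patience` trials without improvement."""

    def __init__(self, patience=50, direction="minimize"):
        if direction not in ("minimize", "maximize"):
            raise ValueError("direction must be 'minimize' or 'maximize'")
        self.patience = patience
        self.direction = direction
        self.best = None   # best score seen so far
        self.stale = 0     # trials since the last improvement

    def update(self, score):
        """Record one trial score; return True when the search should stop."""
        improved = (
            self.best is None
            or (self.direction == "minimize" and score < self.best)
            or (self.direction == "maximize" and score > self.best)
        )
        if improved:
            self.best = score
            self.stale = 0
        else:
            self.stale += 1
        return self.stale >= self.patience


def run_trials(scores, stopper):
    """Feed scores one by one; return how many trials ran before stopping."""
    for n, score in enumerate(scores, start=1):
        if stopper.update(score):
            return n
    return len(scores)
```

With `patience=2` and `direction="minimize"`, the sequence `[0.9, 0.8, 0.85, 0.81, 0.7]` stops after the fourth trial, because two trials in a row fail to beat 0.8.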
What's inside?
- `analysis` - method for evaluating a specified model with a given dataframe. With `export_test_set=True` it exports predictions ready for submission.
- `df` - module for working with dataframes:
  - `corr` - sort all correlated features.
  - `handle_outliers` - fill or drop columns with outliers.
  - `log_transform` - transform columns with the log function.
  - `group_by_mean` - make additional columns with aggregated mean.
  - `group_by_max` - make additional columns with aggregated max.
  - `group_by_min` - make additional columns with aggregated min.
  - `apply_with_progress` - apply a heavy function to each row of the dataset.
  - `scale` - scale columns with Standard or MinMax scalers.
- `kaggle`:
  - `submit` - make a submission file for Kaggle based on the sample.
- `metrics`:
  - `rmse` - RMSE scorer.
  - `rmsle` - RMSLE scorer.
- `optuna` - handy methods for working with Optuna:
  - `optimize` - optimize a model with a given dataframe.
  - `optimize_super_learner` - optimize a super learner configuration with a given set of models and set of heads (meta_model).
- `colab`:
  - `csv` - read a csv file located on Google Drive with the specified id.
  - `path` - get the path to a Google Drive file.
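For readers unfamiliar with the two scorers listed under `metrics`, here is a minimal pure-Python sketch of the standard RMSE and RMSLE formulas. It is not the package's code: the function signatures are assumptions, and the RMSLE variant shown uses the common `log(1 + x)` form, which may differ from what kowalsky implements.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def rmsle(y_true, y_pred):
    """Root mean squared logarithmic error, using log(1 + x).

    Assumes every value is greater than -1 (typically non-negative counts).
    """
    log_true = [math.log1p(t) for t in y_true]
    log_pred = [math.log1p(p) for p in y_pred]
    return rmse(log_true, log_pred)
```

RMSLE penalizes relative rather than absolute differences, which is why it is a frequent choice for count-like targets.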
Example:
!pip install kowalsky --upgrade
from kowalsky.optuna import optimize
from kowalsky.colab import path
optimize('LGBR',
         path=path('1mwI9YP8SuDdWl6vU8Yp-Dv-9hgyKR_HS'),
         direction='minimize',
         scorer='rmsle',
         y_label='count',
         trials=3000)
Available models:
Gradient Boosts
'XGBR': XGBRegressor
'XGBC': XGBClassifier
'LGBR': LGBMRegressor
'LGBC': LGBMClassifier
Trees
'RFR': RandomForestRegressor
'RFC': RandomForestClassifier
'DTR': DecisionTreeRegressor
'DTC': DecisionTreeClassifier
'ETR': ExtraTreeRegressor
'ETC': ExtraTreeClassifier
Ensemble
'BC': BaggingClassifier
'BR': BaggingRegressor
'ADAR': AdaBoostRegressor
'ADAC': AdaBoostClassifier
'CBR': CatBoostRegressor
'CBC': CatBoostClassifier
KNeighbors
'KNC': KNeighborsClassifier
'KNR': KNeighborsRegressor
SVM
'SVR': SVR
'SVC': SVC
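The keys above act as short names for estimator classes. As a hedged illustration of how such a lookup could be built, the sketch below maps a few of the keys to dotted import paths and resolves them lazily, so a library is only imported when its key is requested. `MODEL_PATHS`, `load_class`, and `make_model` are hypothetical names for this example, and the package itself may resolve keys quite differently.

```python
import importlib

# Hypothetical lookup table: short key -> dotted import path of the class.
# Only a subset of the keys listed above is shown.
MODEL_PATHS = {
    "XGBR": "xgboost.XGBRegressor",
    "LGBR": "lightgbm.LGBMRegressor",
    "RFR": "sklearn.ensemble.RandomForestRegressor",
    "DTC": "sklearn.tree.DecisionTreeClassifier",
    "CBC": "catboost.CatBoostClassifier",
    "SVR": "sklearn.svm.SVR",
}


def load_class(dotted_path):
    """Import the module part of `dotted_path` and return the named class."""
    module_name, _, class_name = dotted_path.rpartition(".")
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


def make_model(key, **params):
    """Instantiate the estimator registered under `key` with `params`."""
    try:
        dotted_path = MODEL_PATHS[key]
    except KeyError:
        raise ValueError(f"unknown model key: {key!r}") from None
    return load_class(dotted_path)(**params)
```

For instance, `make_model("RFR", n_estimators=200)` would build a random forest regressor, provided scikit-learn is installed.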