A small package for all useful ML things
Kowalsky, analysis!
A simple package of handy ML utilities and more.
What's new? [v0.0.33]
- add `feature` package with two types of analysis + support for the other functions
  - Recursive Feature Elimination
  - Sequential Feature Selection
- improve `optimize`:
  - `EarlyStopping` mechanism
  - optimization graph
  - multitasking with `n_jobs=-1`
- add `logs` package
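To give a feel for what Sequential Feature Selection does, here is a minimal sketch of the greedy forward variant in plain Python. It is not the package's `feature` implementation: `forward_select`, the `score` callback and the toy `usefulness` table are hypothetical names introduced only for illustration, and a real scorer would typically be a cross-validated metric.

```python
from typing import Callable, List, Sequence


def forward_select(
    features: Sequence[str],
    score: Callable[[List[str]], float],
    k: int,
) -> List[str]:
    """Greedy forward selection: grow the subset one feature at a time.

    `score` is a hypothetical callback returning a higher-is-better value
    for a candidate feature subset.
    """
    selected: List[str] = []
    remaining = list(features)
    while remaining and len(selected) < k:
        # Try each remaining feature and keep the one that scores best
        # when added to the current subset.
        best = max(remaining, key=lambda f: score(selected + [f]))
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy scorer (an assumption): each feature has a fixed usefulness and a
# subset's score is the sum of its members.
usefulness = {"age": 0.5, "income": 0.9, "noise": 0.0, "height": 0.2}
chosen = forward_select(
    list(usefulness),
    lambda cols: sum(usefulness[c] for c in cols),
    k=2,
)
print(chosen)  # ['income', 'age']
```

Recursive Feature Elimination works in the opposite direction: it starts from all features and repeatedly drops the least important one.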
What's inside?
- `analysis` - method for evaluating a specified model on a given dataframe. With `export_test_set=True` it exports predictions ready for submission.
- df - module for working with dataframes:
  - `corr` - sort all correlated features.
  - `handle_outliers` - fill or drop columns with outliers.
  - `log_transform` - transform columns with a log function.
  - `group_by_mean` - add columns with the aggregated mean.
  - `group_by_max` - add columns with the aggregated max.
  - `group_by_min` - add columns with the aggregated min.
  - `apply_with_progress` - apply a heavy function to each row of the dataset.
  - `scale` - scale columns with Standard or MinMax scalers.
- `kaggle`:
  - `submit` - make a submission file for Kaggle based on a sample
- `metrics`:
  - `rmse` - RMSE scorer
  - `rmsle` - RMSLE scorer
- optuna - handy methods for working with Optuna:
  - `optimize` - optimize a model with a given dataframe
  - `optimize_super_learner` - optimize a super learner configuration with a given set of models and a set of heads (meta_model)
- `colab`:
  - `csv` - read a CSV file located on Google Drive with the specified id
  - `path` - get the path to a Google Drive file
- `feature`:
  - `rfe_analysis` - Recursive Feature Elimination analysis
  - `sfs_analysis` - Sequential Feature Selection analysis
- `logs`:
  - `profile_memory` - logs all heavy variables
  - `make_pretty_pyplot` - makes pyplot look better :)
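For reference, the two metrics named in the list above have standard definitions, sketched here in plain Python. This is an illustration of the formulas only: the signatures of the package's own `rmse` and `rmsle` scorers are not documented here, so the function signatures below are assumptions.

```python
import math
from typing import Sequence


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean squared error: sqrt(mean((y_true - y_pred)^2))."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))


def rmsle(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean squared logarithmic error.

    Applies log(1 + x) to both inputs before computing RMSE, so all
    values must be greater than -1.
    """
    return rmse(
        [math.log1p(t) for t in y_true],
        [math.log1p(p) for p in y_pred],
    )


print(rmse([0.0, 0.0], [3.0, 4.0]))
print(rmsle([0.0, math.e - 1], [0.0, 0.0]))
```

RMSLE penalizes relative rather than absolute errors, which is why it is common for targets that span several orders of magnitude.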
Example:
    !pip install kowalsky --upgrade
    from kowalsky.optuna import optimize
    optimize('RFR',
             path='../input/project/feed.csv',
             scorer='acc',
             y_label='y_label',
             trials=3000)
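The call above requests 3000 trials, which is where an early-stopping rule pays off. How the package's `EarlyStopping` mechanism is configured is not described here, so the following is only a generic patience-based sketch; `PatienceStopper` and `run_trials` are hypothetical names, and the list of scores stands in for real trial results.

```python
from typing import Iterable, Optional


class PatienceStopper:
    """Signal a stop after `patience` consecutive trials without improvement."""

    def __init__(self, patience: int, minimize: bool = True) -> None:
        self.patience = patience
        self.minimize = minimize
        self.best: Optional[float] = None
        self.stale = 0  # consecutive trials that did not beat `best`

    def update(self, value: float) -> bool:
        """Record a trial score; return True when the search should stop."""
        if self.best is None:
            improved = True
        elif self.minimize:
            improved = value < self.best
        else:
            improved = value > self.best
        if improved:
            self.best = value
            self.stale = 0
        else:
            self.stale += 1
        return self.stale >= self.patience


def run_trials(scores: Iterable[float], stopper: PatienceStopper) -> int:
    """Consume trial scores until the stopper fires; return trials used."""
    used = 0
    for used, value in enumerate(scores, start=1):
        if stopper.update(value):
            break
    return used


stopper = PatienceStopper(patience=3)
trials_used = run_trials([0.9, 0.7, 0.71, 0.72, 0.70, 0.4], stopper)
print(trials_used, stopper.best)  # 5 0.7
```

With Optuna itself, a rule like this is usually wired in as a callback passed to `study.optimize(..., callbacks=[...])` that calls `study.stop()` once the patience is exhausted.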
Available models:
Gradient Boosts
'XGBR': XGBRegressor
'XGBC': XGBClassifier
'LGBR': LGBMRegressor
'LGBC': LGBMClassifier
Trees
'RFR': RandomForestRegressor
'RFC': RandomForestClassifier
'DTR': DecisionTreeRegressor
'DTC': DecisionTreeClassifier
'ETR': ExtraTreeRegressor
'ETC': ExtraTreeClassifier
Ensemble
'BC': BaggingClassifier
'BR': BaggingRegressor
'ADAR': AdaBoostRegressor
'ADAC': AdaBoostClassifier
'CBR': CatBoostRegressor
'CBC': CatBoostClassifier
KNeighbors
'KNC': KNeighborsClassifier
'KNR': KNeighborsRegressor
SVM
'SVR': SVR
'SVC': SVC
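Every code in the list above ends in `R` for a regressor or `C` for a classifier. That convention is inferred from the list, not documented by the package, but it makes a quick sanity check possible before choosing a scorer. The helper `task_of` below is a hypothetical illustration, not part of kowalsky.

```python
def task_of(code: str) -> str:
    """Infer the task type from the trailing letter of a model code.

    Assumes the naming pattern seen in the list: a trailing 'R' marks a
    regressor and a trailing 'C' marks a classifier.
    """
    suffix = code[-1:].upper()
    if suffix == "R":
        return "regression"
    if suffix == "C":
        return "classification"
    raise ValueError(f"cannot infer task type from code {code!r}")


for code in ("RFR", "XGBC", "KNR", "SVC"):
    print(code, task_of(code))
```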
Source Distribution
kowalsky-0.0.33.tar.gz (9.6 kB)
File details
Details for the file kowalsky-0.0.33.tar.gz.
File metadata
- Download URL: kowalsky-0.0.33.tar.gz
- Upload date:
- Size: 9.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.3.0 pkginfo/1.7.0 requests/2.25.1 setuptools/52.0.0.post20210125 requests-toolbelt/0.9.1 tqdm/4.59.0 CPython/3.8.8
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 120a76ff2ebb0a85a95ace155ec3127264a489662a5ec3570f319dc65417e5bc |
| MD5 | b14b22a735aaea213a7016beb1e557dd |
| BLAKE2b-256 | 0f9a229eb97cea51ce7c1eadece5ab8768d2ae5ddcc4bd039714fafc222dc522 |