cheutils
A set of basic reusable utilities and tools to facilitate quickly getting up and going on any machine learning project.
Features
- model_options: methods such as get_regressor, which returns an estimator configured from a specified parameter dictionary, and get_default_grid, which returns the configured hyperparameter grid
- model_builder: methods for building and executing ML pipeline steps, e.g., fit, predict, score, params_optimization
- project_tree: methods for accessing the project tree, e.g., get_data_dir() for the configured data folder and get_output_dir() for the output folder, as well as methods for loading and saving Excel and CSV files
- common_utils: methods to support common programming tasks, such as labeling or tagging and date-stamping files
- propertiesutil: utility for managing properties files or project configuration, based on jproperties. The application configuration is expected to be in a file named app-config.properties, which can be placed in the project root or in any of its subfolders.
- decorator_debug, decorator_timer, and decorator_singleton: decorators for enabling logging and method timers; as well as a singleton decorator
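To make the decorator features concrete, here is a minimal sketch of what a timer decorator and a singleton decorator generally look like in plain Python. The names `timer`, `singleton`, `Settings`, and `add` are hypothetical and chosen for illustration; this is not the cheutils implementation, and the actual decorator names and behavior in the package may differ.

```python
import functools
import time


def timer(func):
    """Print how long each call to ``func`` takes (illustrative only)."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            elapsed = time.perf_counter() - start
            print(f"{func.__name__} took {elapsed:.6f}s")
    return wrapper


def singleton(cls):
    """Make every call to ``cls(...)`` return the same instance."""
    instances = {}

    def get_instance(*args, **kwargs):
        # Create the instance on first use, then reuse it
        if cls not in instances:
            instances[cls] = cls(*args, **kwargs)
        return instances[cls]
    return get_instance


@singleton
class Settings:  # hypothetical class, used only to demonstrate the decorator
    def __init__(self):
        self.values = {}


@timer
def add(a, b):  # hypothetical function, used only to demonstrate the decorator
    return a + b
```

With these definitions, `Settings()` returns the same object on every call, and each call to `add` prints its elapsed time before returning the result.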
Usage
Import the cheutils module as usual:
import cheutils
The following provides access to the properties file, which is expected to be named "app-config.properties" and is typically found in the project data folder, although it can be placed anywhere in the project root or any of its subfolders:
APP_PROPS = cheutils.AppProperties() # to load the app-config.properties file
Thereafter, you can read any property using methods such as:
DATA_DIR = APP_PROPS.get('project.data.dir')
You can also retrieve the path to the data folder, which is under the project root as follows:
cheutils.get_data_dir() # returns the path to the project data folder, which is always interpreted relative to the project root
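Since the configuration file may sit in the project root or in any subfolder, it can help to picture how such a lookup might work. The sketch below uses only the standard library; `find_config` is a hypothetical helper, not a cheutils function, and the package may locate the file differently.

```python
from pathlib import Path
from typing import Optional


def find_config(root: Path, name: str = "app-config.properties") -> Optional[Path]:
    """Return the first file called ``name`` at or below ``root``, or None."""
    direct = root / name
    if direct.is_file():
        return direct
    # sorted() makes the choice deterministic when several copies exist
    matches = sorted(p for p in root.rglob(name) if p.is_file())
    return matches[0] if matches else None
```

For example, `find_config(Path.cwd())` would return the path of the first matching file under the current working directory, or `None` if there is none.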
You can retrieve other properties as follows:
VALUES_LIST = APP_PROPS.get_list('some.configured.list') # e.g., some.configured.list=[1, 2, 3] or ['1', '2', '3']
VALUES_DIC = APP_PROPS.get_dic_properties('some.configured.dict') # e.g., some.configured.dict={'val1': 10, 'val2': 'value'}
BOL_VAL = APP_PROPS.get_bol('some.configured.bol') # e.g., some.configured.bol=True
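Properties files store every value as text, so typed getters such as the ones above have to convert strings into lists, dictionaries, or booleans. One common way to do this in Python is `ast.literal_eval`; the sketch below assumes that approach purely for illustration. `parse_value` is a hypothetical helper, and the source does not state how cheutils performs the conversion.

```python
import ast


def parse_value(raw: str):
    """Parse a property string as a Python literal, else return it stripped."""
    text = raw.strip()
    try:
        return ast.literal_eval(text)
    except (ValueError, SyntaxError):
        # Not a Python literal (e.g. a path), so keep it as plain text
        return text


# Raw strings as they might appear in a properties file (example values only)
props = {
    "some.configured.list": "[1, 2, 3]",
    "some.configured.dict": "{'val1': 10, 'val2': 'value'}",
    "some.configured.bol": "True",
    "project.data.dir": "./data",
}
parsed = {key: parse_value(value) for key, value in props.items()}
```

After this runs, `parsed["some.configured.list"]` is a real list and `parsed["some.configured.bol"]` is a real boolean, while the data directory remains a string.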
You also have access to the LOGGER - you can simply call LOGGER.debug() in much the same way as you would when using loguru or standard logging. Calling set_prefix() on the LOGGER instance ensures the log messages are scoped to that context thereafter, which can be helpful when reviewing the generated log file (app-log.log) - the default prefix is "app-log".
You can get a handle to an application logger as follows:
LOGGER = cheutils.LoguruWrapper().get_logger()
You can set the logger prefix as follows:
LOGGER.set_prefix(prefix='my_project')
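The idea of scoping log messages with a prefix can be illustrated with the standard logging module alone. The sketch below is an analogy, not the cheutils LoguruWrapper: `PrefixAdapter` is a hypothetical class, and the bracketed message format is an assumption made for the example.

```python
import logging


class PrefixAdapter(logging.LoggerAdapter):
    """Prepend a changeable prefix to every message (hypothetical helper)."""

    def __init__(self, logger, prefix="app-log"):
        super().__init__(logger, {})
        self.prefix = prefix

    def set_prefix(self, prefix):
        # Later messages are tagged with the new prefix
        self.prefix = prefix

    def process(self, msg, kwargs):
        # Called by the adapter before each message is handed to the logger
        return f"[{self.prefix}] {msg}", kwargs


log = PrefixAdapter(logging.getLogger("demo"))
log.set_prefix(prefix="my_project")
```

Any message logged through `log` after the call to `set_prefix` is tagged with `my_project`, which makes it easier to filter a shared log file by context.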
The model_options currently supports the following regressors: Lasso, LinearRegression, Ridge, GradientBoostingRegressor, XGBRegressor, LGBMRegressor, DecisionTreeRegressor, RandomForestRegressor
You can configure any of the models for your project with an entry in the app-config.properties as follows:
model.active.model_option=xgb_boost # with default parameters
You can get a handle to the corresponding regressor as follows:
regressor = cheutils.get_regressor(model_option='xgb_boost')
You can also configure a hyperparameter grid for the model with a property such as the following:
model.param_grids.xgb_boost={'learning_rate': {'type': float, 'start': 0.0, 'end': 1.0, 'num': 10}, 'subsample': {'type': float, 'start': 0.0, 'end': 1.0, 'num': 10}, 'min_child_weight': {'type': float, 'start': 0.1, 'end': 1.0, 'num': 10}, 'n_estimators': {'type': int, 'start': 10, 'end': 400, 'num': 10}, 'max_depth': {'type': int, 'start': 3, 'end': 17, 'num': 5}, 'colsample_bytree': {'type': float, 'start': 0.0, 'end': 1.0, 'num': 5}, 'gamma': {'type': float, 'start': 0.0, 'end': 1.0, 'num': 5}, 'reg_alpha': {'type': float, 'start': 0.0, 'end': 1.0, 'num': 5}, }
Thereafter, you can pass the configured parameters to get_regressor as follows:
regressor = cheutils.get_regressor(**get_params(model_option='xgb_boost'))
Thereafter, you can simply fit the model as follows:
cheutils.fit(regressor, X_train, y_train)
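Each entry in the model.param_grids.xgb_boost property describes a range through 'type', 'start', 'end', and 'num'. The source does not say how cheutils turns these into candidate values, so the sketch below assumes an evenly spaced, inclusive range, with integer parameters rounded and de-duplicated. `expand_range` is a hypothetical helper and only two of the parameters are shown.

```python
def expand_range(spec):
    """Return ``num`` evenly spaced values from ``start`` to ``end`` inclusive."""
    start, end, num = spec["start"], spec["end"], spec["num"]
    if num < 2:
        values = [start]
    else:
        step = (end - start) / (num - 1)
        values = [start + i * step for i in range(num)]
    if spec["type"] is int:
        # Round to whole numbers, then drop duplicates while keeping order
        return list(dict.fromkeys(int(round(v)) for v in values))
    return [float(v) for v in values]


# Two entries copied from the example property above
grid_spec = {
    "max_depth": {"type": int, "start": 3, "end": 17, "num": 5},
    "gamma": {"type": float, "start": 0.0, "end": 1.0, "num": 5},
}
param_grid = {name: expand_range(spec) for name, spec in grid_spec.items()}
```

Under this assumption, `param_grid` maps each parameter name to a list of candidate values, which is the shape that grid or randomized search utilities typically expect.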