cheutils

A set of basic reusable utilities and tools to facilitate quickly getting up and going on any machine learning project.

Features

model_options: methods such as get_estimator, to get a handle on an estimator configured with a specified parameter dictionary, or get_default_grid, to get the configured hyperparameter grid.
model_builder: methods for building and executing ML pipeline steps, e.g., params_optimization.
project_tree: methods for accessing the project tree, e.g., get_data_dir() for the configured data folder and get_output_dir() for the output folder, as well as for loading and saving Excel and CSV files.
common_utils: methods to support common programming tasks, such as labeling (e.g., label(file_name, label='some_label')) or tagging and date-stamping files (e.g., datestamp(file_name, fmt='%Y-%m-%d')).
propertiesutil: utility for managing properties files or project configuration, based on jproperties. The application configuration is expected to be available in a file named app-config.properties, which can be placed in the project root or in any of its subfolders.
decorator_debug, decorator_timer, and decorator_singleton: decorators for enabling logging and method timers, as well as a singleton decorator.
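
To illustrate the kind of behaviour a timer decorator and a singleton decorator typically provide, here is a minimal standard-library sketch. It is not the cheutils implementation; the names timer, singleton, Settings and add are hypothetical and chosen only for this example.

```python
import functools
import time


def timer(func):
    """Print how long each call to func takes (hypothetical stand-in)."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            elapsed = time.perf_counter() - start
            print(f"{func.__name__} took {elapsed:.6f}s")
    return wrapper


def singleton(cls):
    """Return a factory that always hands back the same instance of cls."""
    instances = {}

    def get_instance(*args, **kwargs):
        # Create the instance on first use, then reuse it
        if cls not in instances:
            instances[cls] = cls(*args, **kwargs)
        return instances[cls]
    return get_instance


@singleton
class Settings:
    def __init__(self):
        self.values = {}


@timer
def add(a, b):
    return a + b


total = add(2, 3)                        # prints the elapsed time
same_instance = Settings() is Settings()  # both calls return one object
```

The decorators shipped with cheutils may differ in naming, logging target and options, so check the package itself before relying on any of these details.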

Usage

You import the cheutils module as per usual:

import cheutils

The following provides access to the properties file, which is expected to be named "app-config.properties" and is typically found in the project data folder, although it can be anywhere in the project root or any of its subfolders:

APP_PROPS = cheutils.AppProperties() # to load the app-config.properties file

Thereafter, you can read any properties using various methods such as:

DATA_DIR = APP_PROPS.get('project.data.dir')

You can also retrieve the path to the data folder, which is under the project root, as follows:

cheutils.get_data_dir()  # returns the path to the project data folder, which is always interpreted relative to the project root
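
The two ideas above, a configuration file that may live anywhere under the project root and a data folder interpreted relative to that root, can be sketched with pathlib. This is only an illustration under stated assumptions: find_config and resolve_data_dir are hypothetical helpers, and how cheutils actually locates the file or the project root is not described here.

```python
import tempfile
from pathlib import Path


def find_config(project_root, name="app-config.properties"):
    """Search the root and all subfolders for the config file (hypothetical)."""
    matches = sorted(Path(project_root).rglob(name))
    return matches[0] if matches else None


def resolve_data_dir(project_root, configured):
    """Interpret a configured folder relative to the project root (hypothetical)."""
    return (Path(project_root) / configured).resolve()


# Build a throwaway project tree so the sketch is self-contained
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "data").mkdir()
    (root / "data" / "app-config.properties").write_text("project.data.dir=./data\n")

    config_path = find_config(root)
    data_dir = resolve_data_dir(root, "./data")

    found_name = config_path.name
    data_dir_name = data_dir.name
    in_root = root.resolve() in data_dir.parents
```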

You can retrieve other properties as follows:

VALUES_LIST = APP_PROPS.get_list('some.configured.list') # e.g., some.configured.list=[1, 2, 3] or ['1', '2', '3']
VALUES_DIC = APP_PROPS.get_dic_properties('some.configured.dict') # e.g., some.configured.dict={'val1': 10, 'val2': 'value'}
BOL_VAL = APP_PROPS.get_bol('some.configured.bol') # e.g., some.configured.bol=True
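
For orientation, the sketch below shows how property values like those in the comments above could be turned into Python objects. It deliberately uses only the standard library rather than cheutils or jproperties, so parse_properties and the sample content are assumptions made for illustration, not the package's parsing logic.

```python
import ast

# Hypothetical app-config.properties content
SAMPLE = """\
# project configuration
project.data.dir=./data
some.configured.list=[1, 2, 3]
some.configured.dict={'val1': 10, 'val2': 'value'}
some.configured.bol=True
"""


def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


props = parse_properties(SAMPLE)

# Lists and dicts are stored as Python literals, so literal_eval can read them
values_list = ast.literal_eval(props["some.configured.list"])
values_dic = ast.literal_eval(props["some.configured.dict"])
# Booleans are compared as text rather than evaluated
bol_val = props["some.configured.bol"].lower() == "true"
```

Using ast.literal_eval rather than eval keeps the parsing limited to plain literals, which is a sensible default for values read from a configuration file.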

You also have access to the LOGGER: you can simply call LOGGER.debug() much as you would when using loguru or the standard logging module. Calling set_prefix() on the LOGGER instance ensures that subsequent log messages are scoped to that context, which can be helpful when reviewing the generated log file (app-log.log). The default prefix is "app-log".

You can get a handle to an application logger as follows:

LOGGER = cheutils.LOGGER.get_logger()

You can set the logger prefix as follows:

LOGGER.set_prefix(prefix='my_project')
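
The effect of a logger prefix can be pictured with the standard logging module. The sketch below uses logging.LoggerAdapter to prepend a prefix to every message; PrefixAdapter and the 'prefix_demo' logger name are hypothetical, and this is not how the cheutils LOGGER is necessarily implemented.

```python
import io
import logging


class PrefixAdapter(logging.LoggerAdapter):
    """Prepend a fixed prefix to every message (hypothetical stand-in)."""

    def process(self, msg, kwargs):
        return f"{self.extra['prefix']}: {msg}", kwargs


# Log to an in-memory stream so the example has no side effects on disk
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))

base = logging.getLogger("prefix_demo")
base.setLevel(logging.DEBUG)
base.addHandler(handler)
base.propagate = False

log = PrefixAdapter(base, {"prefix": "my_project"})
log.debug("loading data")

output = stream.getvalue()
```

Every message sent through the adapter carries the prefix, which makes it easy to filter one project's or one stage's lines out of a shared log file.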

The model_options module currently supports the following estimators: Lasso, LinearRegression, Ridge, GradientBoostingRegressor, XGBRegressor, LGBMRegressor, DecisionTreeRegressor, and RandomForestRegressor. You can configure any of these models for your project with an entry in app-config.properties as follows:

model.active.model_option=xgb_boost # with default parameters

You can get a handle to the corresponding estimator as follows:

estimator = cheutils.get_estimator(model_option='xgb_boost')

You can also configure the following property for example:

model.param_grids.xgb_boost={'learning_rate': {'type': float, 'start': 0.0, 'end': 1.0, 'num': 10}, 'subsample': {'type': float, 'start': 0.0, 'end': 1.0, 'num': 10}, 'min_child_weight': {'type': float, 'start': 0.1, 'end': 1.0, 'num': 10}, 'n_estimators': {'type': int, 'start': 10, 'end': 400, 'num': 10}, 'max_depth': {'type': int, 'start': 3, 'end': 17, 'num': 5}, 'colsample_bytree': {'type': float, 'start': 0.0, 'end': 1.0, 'num': 5}, 'gamma': {'type': float, 'start': 0.0, 'end': 1.0, 'num': 5}, 'reg_alpha': {'type': float, 'start': 0.0, 'end': 1.0, 'num': 5}, }

Thereafter, you can do the following:

estimator = cheutils.get_estimator(**get_params(model_option='xgb_boost'))

Thereafter, you can simply fit the model as follows per usual:

estimator.fit(X_train, y_train)
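
The model.param_grids.xgb_boost entry shown earlier describes each hyperparameter as a range with a type, start, end and num. How cheutils expands such a range is not documented here; the sketch below assumes evenly spaced values between start and end, and expand_range is a hypothetical helper rather than part of the package.

```python
def expand_range(spec):
    """Turn a {'type', 'start', 'end', 'num'} spec into evenly spaced values."""
    start, end, num = spec["start"], spec["end"], spec["num"]
    if num == 1:
        values = [start]
    else:
        step = (end - start) / (num - 1)
        values = [start + i * step for i in range(num)]
    if spec["type"] is int:
        # Round to whole numbers and drop any duplicates this creates
        return sorted({int(round(v)) for v in values})
    return [float(v) for v in values]


# Two entries taken from the example configuration
grid_spec = {
    "max_depth": {"type": int, "start": 3, "end": 17, "num": 5},
    "gamma": {"type": float, "start": 0.0, "end": 1.0, "num": 5},
}

grid = {name: expand_range(spec) for name, spec in grid_spec.items()}
```

Under this assumption each hyperparameter becomes a short list of candidate values, which is the shape that grid-based and randomized search utilities generally expect.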

Given a default model parameter configuration (usually in the properties file), you can generate a promising parameter grid using a randomized search (RandomizedSearchCV), as in the following line. Note that the pipeline can be either an sklearn pipeline or an estimator. The general idea is to avoid working out the optimal hyperparameter values for a given estimator by hand and to find them automatically with a two-step coarse-to-fine search. First, you configure a broad hyperparameter space or grid based on the estimator's most important or impactful hyperparameters and use a random search to find a set of promising hyperparameters. Then, you use that set to conduct a finer search of the hyperparameter space with other algorithms, such as Bayesian optimization (e.g., hyperopt or Scikit-Optimize).

promising_grid = cheutils.promising_params_grid(pipeline, X_train, y_train, grid_resolution=3, prefix='model_prefix')
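
The coarse-to-fine idea can be illustrated without any ML library. The toy sketch below runs a random search over a broad space, narrows the space around the best candidate, and searches again. Everything in it is an assumption made for illustration: objective is a made-up score standing in for cross-validation, the second stage reuses random search where the text above suggests Bayesian optimization, and none of the names come from cheutils.

```python
import random


def objective(params):
    """Toy score (higher is better), peaking at learning_rate=0.3, subsample=0.8."""
    return -((params["learning_rate"] - 0.3) ** 2 + (params["subsample"] - 0.8) ** 2)


def random_search(space, n_iter, rng):
    """Sample n_iter candidates uniformly from space and keep the best one."""
    best_params, best_score = None, float("-inf")
    for _ in range(n_iter):
        candidate = {name: rng.uniform(lo, hi) for name, (lo, hi) in space.items()}
        score = objective(candidate)
        if score > best_score:
            best_params, best_score = candidate, score
    return best_params, best_score


def narrow(space, best_params, shrink=0.25):
    """Shrink each range to a fraction of its width, centred on the best value."""
    narrowed = {}
    for name, (lo, hi) in space.items():
        half = (hi - lo) * shrink / 2
        centre = best_params[name]
        narrowed[name] = (max(lo, centre - half), min(hi, centre + half))
    return narrowed


rng = random.Random(0)

# Step 1: coarse search over a broad space
coarse_space = {"learning_rate": (0.0, 1.0), "subsample": (0.0, 1.0)}
coarse_best, coarse_score = random_search(coarse_space, 50, rng)

# Step 2: finer search restricted to the promising region
fine_space = narrow(coarse_space, coarse_best)
fine_best, fine_score = random_search(fine_space, 50, rng)

best_score = max(coarse_score, fine_score)
```

The second stage spends its budget only where the first stage found good scores, which is why a smarter optimizer is usually plugged in at that point.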

If you are using hyperopt, you can run hyperparameter optimization or tuning as follows, assuming you have enabled cross-validation in your configuration (app-config.properties), e.g., with an entry such as model.cross_val.num_folds=3. If you are running MLflow experiments and logging, you can also pass an optional mlflow_log=True in the optimization call:

best_estimator, best_score, best_params, cv_results = cheutils.params_optimization(pipeline, X_train, y_train, promising_params_grid=promising_grid, with_narrower_grid=True, fine_search='hyperoptcv', prefix='model_prefix')

Release history

2.5.11

Nov 16, 2024

2.5.10

Nov 16, 2024

2.5.9

Nov 16, 2024

2.5.8

Nov 15, 2024

2.5.7

Nov 15, 2024

2.5.6

Nov 13, 2024

2.5.5

Nov 8, 2024

2.5.4

Nov 2, 2024

2.5.3

Nov 2, 2024

2.5.2

Nov 2, 2024

2.5.1

Nov 2, 2024

2.5.0

Nov 2, 2024

2.4.47

Nov 2, 2024

2.4.46

Nov 2, 2024

2.4.45

Nov 2, 2024

2.4.44

Nov 1, 2024

2.4.43

Nov 1, 2024

2.4.42

Nov 1, 2024

2.4.41

Nov 1, 2024

2.4.40

Nov 1, 2024

2.4.39

Nov 1, 2024

2.4.38

Nov 1, 2024

2.4.37

Nov 1, 2024

2.4.36

Nov 1, 2024

2.4.35

Nov 1, 2024

2.4.34

Nov 1, 2024

2.4.33

Nov 1, 2024

2.4.32

Nov 1, 2024

2.4.31

Nov 1, 2024

2.4.30

Nov 1, 2024

2.4.29

Oct 31, 2024

2.4.28

Oct 31, 2024

2.4.27

Oct 31, 2024

2.4.26

Oct 31, 2024

2.4.25

Oct 31, 2024

2.4.24

Oct 31, 2024

2.4.23

Oct 31, 2024

2.4.22

Oct 31, 2024

2.4.21

Oct 31, 2024

2.4.20

Oct 31, 2024

2.4.19

Oct 31, 2024

2.4.18

Oct 31, 2024

2.4.17

Oct 31, 2024

2.4.16

Oct 31, 2024

2.4.15

Oct 31, 2024

2.4.14

Oct 31, 2024

2.4.13

Oct 31, 2024

2.4.12

Oct 31, 2024

2.4.11

Oct 31, 2024

2.4.10

Oct 31, 2024

2.4.9

Oct 31, 2024

2.4.8

Oct 31, 2024

2.4.7

Oct 31, 2024

2.4.6

Oct 31, 2024

2.4.5

Oct 31, 2024

2.4.4

Oct 31, 2024

2.4.3

Oct 31, 2024

2.4.2

Oct 31, 2024

2.4.1

Oct 31, 2024

2.4.0

Oct 31, 2024

2.3.25

Oct 31, 2024

2.3.24

Oct 31, 2024

2.3.23

Oct 31, 2024

2.3.22

Oct 31, 2024

2.3.21

Oct 31, 2024

2.3.20

Oct 31, 2024

2.3.19

Oct 30, 2024

2.3.18

Oct 30, 2024

2.3.17

Oct 30, 2024

2.3.16

Oct 30, 2024

2.3.15

Oct 30, 2024

2.3.14

Oct 30, 2024

2.3.13

Oct 30, 2024

2.3.12

Oct 30, 2024

2.3.11

Oct 30, 2024

2.3.10

Oct 30, 2024

2.3.9

Oct 30, 2024

2.3.8

Oct 30, 2024

2.3.7

Oct 30, 2024

2.3.6

Oct 30, 2024

2.3.5

Oct 30, 2024

2.3.4

Oct 30, 2024

2.3.3

Oct 30, 2024

2.3.2

Oct 30, 2024

2.3.1

Oct 30, 2024

2.3.0

Oct 30, 2024

2.2.50

Oct 30, 2024

2.2.49

Oct 30, 2024

2.2.48

Oct 30, 2024

2.2.47

Oct 30, 2024

2.2.46

Oct 30, 2024

2.2.45

Oct 30, 2024

2.2.44

Oct 30, 2024

2.2.43

Oct 30, 2024

2.2.42

Oct 30, 2024

2.2.41

Oct 30, 2024

2.2.40

Oct 30, 2024

2.2.39

Oct 30, 2024

2.2.38

Oct 30, 2024

2.2.37

Oct 30, 2024

2.2.36

Oct 30, 2024

2.2.35

Oct 30, 2024

2.2.34

Oct 30, 2024

2.2.33

Oct 29, 2024

2.2.32

Oct 29, 2024

2.2.31

Oct 29, 2024

2.2.30

Oct 29, 2024

This version

2.2.29

Oct 29, 2024

2.2.28

Oct 29, 2024

2.2.27

Oct 29, 2024

2.2.26

Oct 29, 2024

2.2.25

Oct 29, 2024

2.2.24

Oct 28, 2024

2.2.23

Oct 28, 2024

2.2.22

Oct 28, 2024

2.2.21

Oct 28, 2024

2.2.20

Oct 28, 2024

2.2.19

Oct 28, 2024

2.2.18

Oct 28, 2024

2.2.17

Oct 28, 2024

2.2.16

Oct 28, 2024

2.2.15

Oct 28, 2024

2.2.14

Oct 28, 2024

2.2.13

Oct 28, 2024

2.2.12

Oct 28, 2024

2.2.11

Oct 28, 2024

2.2.10

Oct 28, 2024

2.2.9

Oct 28, 2024

2.2.8

Oct 28, 2024

2.2.7

Oct 28, 2024

2.2.6

Oct 28, 2024

2.2.5

Oct 27, 2024

2.2.4

Oct 27, 2024

2.2.3

Oct 27, 2024

2.2.2

Oct 27, 2024

2.2.1

Oct 27, 2024

2.1.27

Oct 27, 2024

2.1.26

Oct 27, 2024

2.1.25

Oct 27, 2024

2.1.24

Oct 27, 2024

2.1.23

Oct 27, 2024

2.1.22

Oct 27, 2024

2.1.21

Oct 27, 2024

2.1.20

Oct 27, 2024

2.1.19

Oct 27, 2024

2.1.18

Oct 27, 2024

2.1.17

Oct 19, 2024

2.1.16

Oct 19, 2024

2.1.15

Oct 4, 2024

2.1.14

Oct 4, 2024

2.1.13

Oct 4, 2024

2.1.12

Oct 3, 2024

2.1.11

Oct 3, 2024

2.1.10

Oct 3, 2024

2.1.9

Oct 2, 2024

2.1.8

Oct 1, 2024

2.1.7

Oct 1, 2024

2.1.6

Oct 1, 2024

2.1.5

Sep 27, 2024

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

cheutils-2.2.29.tar.gz (44.1 kB)

Uploaded Oct 29, 2024 Source

Built Distribution

cheutils-2.2.29-py3-none-any.whl (48.3 kB)

Uploaded Oct 29, 2024 Python 3

File details

Details for the file cheutils-2.2.29.tar.gz.

File metadata

Download URL: cheutils-2.2.29.tar.gz
Upload date: Oct 29, 2024
Size: 44.1 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/5.1.1 CPython/3.9.7

File hashes

Hashes for cheutils-2.2.29.tar.gz
Algorithm	Hash digest
SHA256	`ce241a065570e40bbcd8b1ff221aa50ac9aa394f704bddb7ab6f51d027c25e2c`
MD5	`e7670d62428609347c34dfa58d4f7d7a`
BLAKE2b-256	`5e91a9fd8d78bc62903656b44c91926da4f23e2db3f06699534b355c54a60d89`


File details

Details for the file cheutils-2.2.29-py3-none-any.whl.

File metadata

Download URL: cheutils-2.2.29-py3-none-any.whl
Upload date: Oct 29, 2024
Size: 48.3 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/5.1.1 CPython/3.9.7

File hashes

Hashes for cheutils-2.2.29-py3-none-any.whl
Algorithm	Hash digest
SHA256	`bc8ed1018bbaa6b954a2d7d8fa9182bdbdcf6efd53c3f47a2e1dc7aafe5fd0f6`
MD5	`7aa2223acd021a60d6254a7a270c7608`
BLAKE2b-256	`f335dcbedc5918b7c3356caffa5500299db600eb2d4ed9a25ca836f83f0aefea`