cheutils

A set of basic reusable utilities and tools to facilitate quickly getting up and going on any machine learning project.

Features

  • model_options: methods such as get_regressor, which returns a configured estimator for a specified parameter dictionary, and get_default_grid, which returns the configured hyperparameter grid
  • model_builder: methods for building and executing ML pipeline steps, e.g., fit, predict, score, params_optimization, etc.
  • project_tree: methods for accessing the project tree, e.g., get_data_dir() for the configured data folder and get_output_dir() for the output folder, plus helpers for loading and saving Excel and CSV files
  • common_utils: methods to support common programming tasks, such as labeling or tagging and date-stamping files
  • propertiesutil: utility for managing properties files or project configuration, based on jproperties. The application configuration is expected to be in a file named app-config.properties, which can be placed in the project root or any of its subfolders.
  • decorator_debug, decorator_timer, and decorator_singleton: decorators for enabling logging and method timing, as well as a singleton decorator
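
To make the decorator bullet concrete, here is a minimal sketch of what a timing decorator and a singleton decorator typically look like in plain Python. The names timer and singleton, and the last_elapsed attribute, are hypothetical and chosen for illustration; they are not the cheutils implementations, whose behavior and signatures may differ.

```python
import functools
import time


def timer(func):
    """Hypothetical timing decorator: stores the last call's duration in seconds."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            # Record the elapsed time even if the wrapped function raises
            wrapper.last_elapsed = time.perf_counter() - start
    wrapper.last_elapsed = None
    return wrapper


def singleton(cls):
    """Hypothetical singleton decorator: every call returns the same instance."""
    instances = {}

    @functools.wraps(cls)
    def get_instance(*args, **kwargs):
        # Create the instance on first use, then reuse it
        if cls not in instances:
            instances[cls] = cls(*args, **kwargs)
        return instances[cls]
    return get_instance


@singleton
class Config:
    def __init__(self):
        self.values = {}


@timer
def add(a, b):
    return a + b


first = Config()
second = Config()
total = add(2, 3)
```

Here first and second refer to the same object, and add.last_elapsed holds the duration of the most recent call.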

Usage

import cheutils

# retrieve the path to the project data folder, which is always interpreted relative to the project root
cheutils.get_data_dir()

# AppProperties provides access to the properties file, which is expected to be named "app-config.properties" and can be placed in the project data folder, the project root, or any other subfolder
APP_PROPS = cheutils.AppProperties()  # loads the app-config.properties file
# Thereafter, you can read any property using methods such as:
DATA_DIR = APP_PROPS.get('project.data.dir')
VALUES_LIST = APP_PROPS.get_list('some.configured.list')  # e.g., some.configured.list=[1, 2, 3] or ['1', '2', '3']
VALUES_DIC = APP_PROPS.get_dic_properties('some.configured.dict')  # e.g., some.configured.dict={'val1': 10, 'val2': 'value'}
BOL_VAL = APP_PROPS.get_bol('some.configured.bol')  # e.g., some.configured.bol=True

# You can get a handle to the application logger as follows; call LOGGER.debug() and similar methods just as you would with loguru or the standard logging module
LOGGER = cheutils.LoguruWrapper().get_logger()
# Setting a prefix scopes subsequent log messages to that context, which helps when reviewing the generated log file (app-log.log); the default prefix is "app-log"
LOGGER.set_prefix(prefix='my_project')

# model_options currently supports the following regressors: Lasso, LinearRegression, Ridge, GradientBoostingRegressor, XGBRegressor, LGBMRegressor, DecisionTreeRegressor, RandomForestRegressor
# You can configure any of these models for your project with an entry in app-config.properties, e.g., to use XGBoost with default parameters:
# model.active.model_option=xgb_boost
# You can then get a handle to the corresponding regressor as follows:
regressor = cheutils.get_regressor(model_option='xgb_boost')
# You can also configure a hyperparameter grid for the model with a property such as:
# model.param_grids.xgb_boost={'learning_rate': {'type': float, 'start': 0.0, 'end': 1.0, 'num': 10}, 'subsample': {'type': float, 'start': 0.0, 'end': 1.0, 'num': 10}, 'min_child_weight': {'type': float, 'start': 0.1, 'end': 1.0, 'num': 10}, 'n_estimators': {'type': int, 'start': 10, 'end': 400, 'num': 10}, 'max_depth': {'type': int, 'start': 3, 'end': 17, 'num': 5}, 'colsample_bytree': {'type': float, 'start': 0.0, 'end': 1.0, 'num': 5}, 'gamma': {'type': float, 'start': 0.0, 'end': 1.0, 'num': 5}, 'reg_alpha': {'type': float, 'start': 0.0, 'end': 1.0, 'num': 5}}
# With that in place, you can build the regressor from the configured parameters:
regressor = cheutils.get_regressor(**cheutils.get_params(model_option='xgb_boost'))
# and then fit the model:
cheutils.fit(regressor, X_train, y_train)
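
Each entry in the model.param_grids.xgb_boost property describes a range through 'type', 'start', 'end', and 'num'. The sketch below shows one plausible reading of such an entry: 'num' evenly spaced values from 'start' to 'end', cast to 'type'. This is an assumption for illustration only. The function expand_param_grid is hypothetical and not part of cheutils, which may interpret the grid differently (for example, by sampling from the range).

```python
def expand_param_grid(spec):
    """Hypothetical helper: turn {'type', 'start', 'end', 'num'} entries into value lists."""
    grid = {}
    for name, rng in spec.items():
        cast, start, end, num = rng['type'], rng['start'], rng['end'], rng['num']
        if num < 2:
            values = [cast(start)]
        else:
            # Evenly spaced points, including both endpoints
            step = (end - start) / (num - 1)
            values = [cast(start + i * step) for i in range(num)]
        if cast is int:
            # int() truncates, so nearby points can collapse; drop duplicates
            values = sorted(set(values))
        grid[name] = values
    return grid


# Two entries copied from the xgb_boost grid shown above
example_spec = {
    'max_depth': {'type': int, 'start': 3, 'end': 17, 'num': 5},
    'gamma': {'type': float, 'start': 0.0, 'end': 1.0, 'num': 5},
}
example_grid = expand_param_grid(example_spec)
```

Under this reading, max_depth expands to [3, 6, 10, 13, 17] and gamma to [0.0, 0.25, 0.5, 0.75, 1.0]. A dictionary of lists in this shape is what grid-search tools such as scikit-learn's GridSearchCV accept as a parameter grid.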
