ESR_DT_MODEL

This package serves as a hub for consolidating all individual model developments associated with the Digital Twin project. Its primary objective is to generate unified and ensemble-based model outputs, which can be seamlessly integrated into any downstream applications.

1. Install the package

The package can be installed using pip:

pip install esr_dt_model

2. Usage:

This package serves as a repository that preserves model development runs and allows information from previous developments to be retrieved.

The following provides a simple example of using the package:

from pandas import read_csv
from xgboost import XGBRegressor
from esr_dt_model import esr_dt_model

# load dataset
url = 'https://raw.githubusercontent.com/jbrownlee/Datasets/master/housing.csv'
df = read_csv(url, header=None)

# split training and target dataset
x, y = df.iloc[:, :-1], df.iloc[:, -1].to_frame()

# train the model
model = XGBRegressor()
trained_model = model.fit(x, y)

# save the data (fix the experiment ID so the model can be loaded by name below)
esr_dt_model.export_model(
    project_name="example",
    user="Sijin",
    trained_models=trained_model,
    training_data=x,
    target_data=y,
    exp_id="Simple_example"
)

# view the model:
esr_dt_model.view_model()

# load the model:
model = esr_dt_model.import_model(model_version="Simple_example")

2.1 Save model and related dataset:

The model, the training dataset and the target dataset can be saved as below:

    from esr_dt_model import esr_dt_model
    esr_dt_model.export_model(
        <Project Name>,
        <User>, 
        <Trained Models>, 
        <Training Dataset>, 
        <Target Dataset>)

Here, <Project Name> is the project name, <User> is the user name, <Trained Models> is a trained model, <Training Dataset> is the dataset used to train the model, and <Target Dataset> is the target dataset used in training. All five arguments are mandatory.

By default, the model and related dataset will be saved in the development channel. When a model is well tested, it can be saved in the production channel by setting prod to True. For example:

    esr_dt_model.export_model(
        ...
        prod=True)

Also, by default, a unique ID is assigned to each training run. However, the ID can be fixed by setting the exp_id argument. For example:

    esr_dt_model.export_model(
        ...
        exp_id="Simple_example")
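Putting these options together, a complete call (reusing the objects from the quick-start example above; the values are illustrative only) might look like:

    esr_dt_model.export_model(
        project_name="example",
        user="Sijin",
        trained_models=trained_model,
        training_data=x,
        target_data=y,
        prod=True,               # publish to the production channel
        exp_id="Simple_example"  # fix the experiment ID instead of using a random one
    )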

2.2 List model and related dataset:

We can list all stored models and related datasets as below:

    from esr_dt_model import esr_dt_model
    esr_dt_model.view_model(
        filters = {
            "project_name": [<Project name1>, <Project name2>, ...],
            "datetime_start": <Start date>,
            "datetime_end": <End date>,
        }
    )

The filters indicate the conditions applied when listing models. The full set of filter keys includes project_name, datetime_start, datetime_end, user, fmt and output_type, for example:

    filters = {
        "project_name": ["DT"],
        "datetime_start": "20231112T0149",
        "datetime_end": "20231112T0250",
        "user": ["Sijin"],
        "fmt": ["pkl", "onnx"],
        "output_type": ["dev", "prod"]
    }

An optional key argument can also be used to specify which columns to view. The full set of columns is ['project_name', 'version', 'datetime', 'user', 'type', 'fmt', 'output', 'output_type', 'training_data', 'test_data']. By default, all columns are shown.
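For instance, to restrict the output to a handful of columns, the filters can be combined with key. This is a sketch that assumes key accepts a list of column names:

    # sketch: assumes `key` takes a list of the column names to display
    esr_dt_model.view_model(
        filters={"project_name": ["DT"], "user": ["Sijin"]},
        key=["project_name", "version", "datetime", "user", "output_type"]
    )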

2.3 Load the model:

The saved model can be loaded as:

esr_dt_model.load_model("D7QVDT")

where D7QVDT is the model version (a unique ID) that can be obtained by running esr_dt_model.view_model. By default, if multiple models share the same model version (e.g., one fixed via exp_id), only the most recent one is returned. To return all of them, set the latest argument to False. For example,

esr_dt_model.load_model("D7QVDT", latest=False)
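Once loaded, the model can be used like any other fitted estimator. The sketch below assumes load_model returns the fitted XGBRegressor saved in the quick-start example:

# load the model and generate predictions on the training features
loaded_model = esr_dt_model.load_model("D7QVDT")  # assumed to return the fitted estimator
predictions = loaded_model.predict(x)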

Appendix: Publish the package (for development only)

The package can be published as:

make publish

Note that the API token must be set up in ~/.pypirc.
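A typical ~/.pypirc entry using a PyPI API token looks like the following (the token value is a placeholder):

    [pypi]
      username = __token__
      password = pypi-<your-api-token>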
