A package for selecting ensemble members using entropy theory

License
- OSI Approved :: MIT License
Operating System
- OS Independent
Programming Language

Project description

En-EMS | Entropy-based Ensemble Members Selection

en-ems is a Python library for selecting a set of mutually exclusive, collectively exhaustive (MECE) ensemble members.

The library implements the approach presented by Darbandsari and Coulibaly (2020) as a step that precedes the merging of a set of ensemble forecasts.

The en-ems package is built on top of the pyitlib package, which implements fundamental information theory methods.

Installing

The library can be installed using the traditional pip:

pip install en-ems

It is listed on the Python Package Index (PyPI) as en-ems.

Using

Suppose you have a file named example.csv with the following content:

Date,       Memb_A, Memb_B, ...,  Memb_Z, Obsv
2020/05/15, 1.12,   1.05,   ...,  0.5,    1.01
2020/05/16, 1.15,   1.12,   ...,  0.9,    1.10
2020/05/17, 1.13,   1.32,   ...,  1.1,    1.29
...         ...     ...     ...,  ...,    ...
2020/11/30, 1.22,   0.95,   ...,  0.3,    0.87

Here, each column starting with "Memb_" holds the realizations of one ensemble member over the time intervals, and "Obsv" holds the observed values for the same time intervals.

If the objective is to select a MECE set taking the observations into account, this can be done with the default parameters as follows:

import pandas as pd
import enems

# read file (skipinitialspace drops the blanks that follow each comma)
data_ensemble = pd.read_csv("example.csv", skipinitialspace=True).to_dict('list')
data_obsv = data_ensemble["Obsv"]
del data_ensemble["Obsv"], data_ensemble["Date"]

# perform selection
selection_log = enems.select_ensemble_members(data_ensemble, data_obsv)

The variable selection_log is a dictionary containing a log of the total correlation, the joint entropy and (if observations were given) the transinformation of the original and selected datasets. It also contains the ids of the selected ensemble members.
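
To make the logged quantities concrete, here is a minimal pure-Python sketch of how joint entropy, total correlation and transinformation (mutual information) can be computed for series that are already discretized. This is not the en-ems or pyitlib implementation; the helper names (joint_entropy, total_correlation, transinformation) and the toy data are assumptions made only for illustration.

```python
import math
from collections import Counter


def joint_entropy(series_list):
    """Shannon joint entropy (in bits) of one or more equally long discrete series."""
    n = len(series_list[0])
    # Each time step becomes one joint symbol (a tuple with one value per series).
    counts = Counter(zip(*series_list))
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def total_correlation(series_list):
    """Sum of the marginal entropies minus the joint entropy."""
    marginals = sum(joint_entropy([s]) for s in series_list)
    return marginals - joint_entropy(series_list)


def transinformation(series_list, observations):
    """Mutual information between the joint ensemble state and the observations."""
    return (joint_entropy(series_list) + joint_entropy([observations])
            - joint_entropy(list(series_list) + [observations]))


# Toy, already-binned data (hypothetical values, not from the package).
memb_a = [0, 0, 1, 1]
memb_b = [0, 1, 0, 1]
memb_c = [0, 0, 1, 1]  # exact copy of memb_a, hence fully redundant
obsv = [0, 0, 1, 1]

print(joint_entropy([memb_a, memb_b]))           # 2.0
print(total_correlation([memb_a, memb_b]))       # 0.0
print(total_correlation([memb_a, memb_c]))       # 1.0
print(transinformation([memb_a, memb_b], obsv))  # 1.0
```

In this toy case the pair (memb_a, memb_b) shares no information (total correlation of 0 bits) yet fully explains the observations (transinformation of 1 bit), which is the kind of low-redundancy, high-information subset a MECE selection looks for.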

Example

Mock data for a dataset with 75 supposed ensemble members and without observation records can be obtained with the function enems.load_data_75().

Here is a full example showing how to access the mock data, select a MECE subset and visualize the results using matplotlib:

import matplotlib.pyplot as plt
import enems

if __name__ == "__main__":

    # ## LOAD DATA ################################################################################################### #

    test_data_df = enems.load_data_75()
    test_data = test_data_df.to_dict("list")

    # ## SELECT MECE SUBSET ########################################################################################## #

    selection_log = enems.select_ensemble_members(test_data, None, n_bins=10, bin_by="equal_intervals", 
                                                  beta_threshold=0.95, n_processes=1, verbose=False)

    # ## PLOT FUNCTIONS ############################################################################################## #

    def plot_ensemble_members(all_series: dict, selected_series: set, plot_title: str, output_file_path: str) -> None:
        _, axs = plt.subplots(1, 1, figsize=(7, 2.5))
        axs.set_xlabel("Time")
        axs.set_ylabel("Value")
        axs.set_title(plot_title)
        axs.set_xlim(0, 143)
        axs.set_ylim(0, 5)
        for series_id in selected_series:
            axs.plot(all_series[series_id], color="#999999", zorder=3, alpha=0.33)
        plt.tight_layout()
        plt.savefig(output_file_path)
        plt.close()
        return None

    def plot_log(n_total_members: int, log: dict, output_file_path: str) -> None:
        _, axss = plt.subplots(1, 2, figsize=(7.0, 2.5))
        x_values = [n_total_members - i - 1 for i in range(len(log["history"]["total_correlation"]))]
        axss[0].set_xlabel("Time")
        axss[0].set_ylabel("Total correlation")
        axss[0].plot(x_values, log["history"]["total_correlation"], color="#7777FF", zorder=3)
        axss[0].set_ylim(70, 140)
        axss[0].set_xlim(x_values[0], x_values[-1])
        axss[1].set_xlabel("Time")
        axss[1].set_ylabel("Joint entropy")
        axss[1].axhline(log["original_ensemble_joint_entropy"], color="#FF7777", zorder=3, label="Full set")
        axss[1].plot(x_values, log["history"]["joint_entropy"], color="#7777FF", zorder=3, label="Selected set")
        axss[1].set_ylim(6.3, 6.9)
        axss[1].set_xlim(x_values[0], x_values[-1])
        axss[1].legend()
        plt.tight_layout()
        plt.savefig(output_file_path)
        plt.close()
        return None

    # ## FUNCTIONS CALL ############################################################################################## #

    plot_log(len(test_data.keys()), selection_log, "test/log.svg")

    plot_ensemble_members(test_data, set(test_data.keys()),
                          "All members (%d)" % len(test_data.keys()),
                          "test/ensemble_all.svg")

    plot_ensemble_members(test_data, selection_log["selected_members"],
                          "Selected members (%d)" % len(selection_log["selected_members"]),
                          "test/ensemble_selected.svg")

This produces the following plots:

log.svg

ensemble_all.svg

ensemble_selected.svg
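
The example above calls select_ensemble_members with n_bins=10 and bin_by="equal_intervals". Entropy measures operate on discrete values, so continuous series need to be binned first. The sketch below shows one plausible reading of equal-interval binning; the function bin_equal_intervals is hypothetical, and the actual binning in en-ems may differ (for example, in how the value range is determined).

```python
def bin_equal_intervals(values, n_bins, v_min=None, v_max=None):
    """Map each value to the index (0 .. n_bins - 1) of its equal-width bin."""
    v_min = min(values) if v_min is None else v_min
    v_max = max(values) if v_max is None else v_max
    if v_max == v_min:
        # Degenerate case: every value falls in the same bin.
        return [0 for _ in values]
    width = (v_max - v_min) / n_bins
    # Clamp so that v_max (and anything outside the range) stays in a valid bin.
    return [min(max(int((v - v_min) / width), 0), n_bins - 1) for v in values]


series = [0.0, 0.4, 1.2, 2.5, 4.9, 5.0]
print(bin_equal_intervals(series, n_bins=5))  # [0, 0, 1, 2, 4, 4]
```

Passing the same v_min and v_max for every member keeps the bin edges comparable across series, which matters when the binned members are later combined into joint states.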

Further documentation

Further information about the library can be found in the docs folder of the Git repository of this project.

Users can find the complete theoretical explanation and assessment of the method in the original work of Darbandsari and Coulibaly (2020).

Release history

- 0.2.2: Nov 20, 2021
- 0.2.1: Nov 19, 2021 (this version)
- 0.2.0: Nov 19, 2021
- 0.1.2: Nov 3, 2021
- 0.1.1: Oct 22, 2021

Download files

Source Distributions

No source distribution files available for this release.

Built Distribution

en_ems-0.2.1-py3-none-any.whl (51.8 kB view details)

Uploaded Nov 19, 2021 Python 3

File details

Details for the file en_ems-0.2.1-py3-none-any.whl.

File metadata

Download URL: en_ems-0.2.1-py3-none-any.whl
Upload date: Nov 19, 2021
Size: 51.8 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/3.4.2 importlib_metadata/3.10.0 pkginfo/1.7.1 requests/2.25.1 requests-toolbelt/0.9.1 tqdm/4.62.0 CPython/3.7.3

File hashes

Hashes for en_ems-0.2.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`3b523f0cb447d5f3993ab42c47170c85f471f5cbc076db08ebc1892ec95ba714`
MD5	`76f6429aaa93e741dd5a957f74a1662c`
BLAKE2b-256	`5d536f87dc2d073714c41bded6a2de1c5c78718b60d81214e4269aa3367083da`
