The Cross-Entropy Method for either rare-event sampling or optimization.

The Cross Entropy Method

The Cross-Entropy Method (CE or CEM) is an approach to optimization or rare-event sampling, given a parametric family of distributions {D_p} and a score function R(x).

  • In its sampling version, it is given a reference parameter p0 and aims to sample from the tail of the distribution, x ~ (D_p0 | R(x) < q), where the threshold q is specified either directly as a numeric value or via a quantile level alpha (i.e., q = q_alpha(R)).
  • In its optimization version, it aims to find argmin_x{R(x)}. (The standard CE iteration behind both modes is sketched below.)
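
For background, both modes are driven by the same loop (this is the standard CE recipe, written in general notation rather than anything specific to this repo): sample a batch from the current distribution, keep the low-score ("elite") tail, and refit the parameters to that tail.

```latex
% Standard CE iteration (general background sketch, not repo-specific notation):
% sample from the current distribution, keep the low-score ("elite") tail,
% and refit the parameters to that tail.
x_1, \dots, x_N \sim D_{p_t}, \qquad
\hat{q}_t = q_\alpha\big(R(x_1), \dots, R(x_N)\big), \qquad
p_{t+1} = \arg\max_{p} \sum_{i:\, R(x_i) \le \hat{q}_t} w_i \log D_p(x_i)
% with w_i = 1 in the plain optimization setting, and
% w_i = D_{p_0}(x_i) / D_{p_t}(x_i) (likelihood ratios) in the tail-sampling setting.
```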

Why use it?

The sampling version is particularly useful for over-sampling examples with certain properties. For example, suppose you have a parametric pipeline that generates examples for learning, and you wish to learn more from examples that satisfy some property X, but you're not sure how to generate such examples. The CEM will learn to tune the parameters of your pipeline accordingly, while letting you easily control the level of extremity.

How to use it?

Installation: pip install cross-entropy-method.

The exact implementation of the CEM depends on the family of distributions {D_p} defined by the problem. This repo provides a general implementation as an abstract class; a concrete use case only requires writing a small inherited class. The attached tutorial.ipynb provides more detailed background on the CEM and on this package, along with usage examples.
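
As an illustration only (this is not the package's API, and the function name below is made up for the sketch), the following self-contained NumPy snippet shows the kind of loop that a concrete distribution family plugs into, here a 1-D Gaussian family used for minimization:

```python
import numpy as np

def cem_minimize(score, n_iters=50, batch_size=200, elite_frac=0.1,
                 mu0=0.0, sigma0=5.0, seed=0):
    """Minimal CE loop over a 1-D Gaussian family N(mu, sigma). Illustrative only."""
    rng = np.random.default_rng(seed)
    mu, sigma = mu0, sigma0
    for _ in range(n_iters):
        x = rng.normal(mu, sigma, size=batch_size)     # sample from D_p
        scores = score(x)                              # evaluate R(x)
        q = np.quantile(scores, elite_frac)            # elite threshold (low-score tail)
        elite = x[scores <= q]                         # keep the elite samples
        mu, sigma = elite.mean(), elite.std() + 1e-6   # refit p to the elites
    return mu, sigma

# Example: drive the samples towards the minimum of R(x) = (x - 3)^2.
mu, sigma = cem_minimize(lambda x: (x - 3.0) ** 2)
print(round(mu, 2), round(sigma, 4))   # mu approaches 3.0 and sigma shrinks
```

In the package itself, the same roles (sampling from D_p and refitting p to the elite tail) are filled by the inherited class; see tutorial.ipynb for the actual interface.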

Figure (from tutorial.ipynb): CEM for sampling (left): the mean of the sample distribution (green) shifts from the mean of the original distribution (blue) towards its 10%-tail (orange). CEM for optimization (right): the mean of the sample distribution is driven towards the minimum of the score.

Supporting non-stationary score functions

On top of the standard CEM, we also support a non-stationary score function R. A changing R affects the reference distribution of the scores, and thus the threshold q (when it is specified as a quantile). Hence, q has to be re-estimated repeatedly, using importance-sampling weights to correct for the distributional shift introduced by the CEM.
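
As a hedged sketch of this correction (again, not the package's internal implementation; the helper name is made up), the alpha-quantile of R under the original distribution D_p0 can be re-estimated from samples drawn from the current CEM distribution D_p by weighting each sample with its likelihood ratio:

```python
import numpy as np

def weighted_quantile(scores, weights, alpha):
    """alpha-quantile of the scores under the given importance weights."""
    order = np.argsort(scores)
    scores, weights = np.asarray(scores)[order], np.asarray(weights)[order]
    cdf = np.cumsum(weights) / np.sum(weights)          # weighted empirical CDF
    idx = min(np.searchsorted(cdf, alpha), len(scores) - 1)
    return scores[idx]

# scores  = R(x_i) for samples x_i drawn from the current CEM distribution D_p
# weights = D_p0(x_i) / D_p(x_i), the likelihood ratios correcting the shift
# q       = weighted_quantile(scores, weights, alpha)
```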

Application to risk-averse reinforcement learning

In our separate work (available as code and as a NeurIPS 2022 paper, with Yinlam Chow, Mohammad Ghavamzadeh and Shie Mannor), we demonstrate the use of the CEM for the more realistic problem of sampling high-risk environment conditions in risk-averse reinforcement learning. There, D_p determines the distribution of the environment conditions, p0 corresponds to the original (or test) distribution, and R(x; agent) is the return of the agent under the conditions x. Since the agent evolves during training, the score function is indeed non-stationary.

Cite us

This repo: non-stationary cross entropy method

@misc{cross_entropy_method,
  title={Cross Entropy Method with Non-stationary Score Function},
  author={Ido Greenberg},
  howpublished={\url{https://pypi.org/project/cross-entropy-method/}},
  year={2022}
}

Application to risk-averse reinforcement learning

@inproceedings{cesor,
  title={Efficient Risk-Averse Reinforcement Learning},
  author={Ido Greenberg and Yinlam Chow and Mohammad Ghavamzadeh and Shie Mannor},
  booktitle={Advances in Neural Information Processing Systems},
  year={2022}
}
