The scenario selection algorithm selects a maximal subset of scenarios from a scenario set [1], so that the selected scenarios have specified means (or sums).
Package Installation
The Python scenarioselector package is published on the Python Package Index, and is hosted on GitHub. The package can be installed with the Package Installer for Python.
pip install scenarioselector
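To confirm from Python that the installation succeeded, one option is a small check that uses only the standard library. The helper name `installed_version` is an assumption made for this illustration and is not part of the package.

```python
from importlib import metadata


def installed_version(distribution="scenarioselector"):
    """Return the installed version string, or None if the distribution is absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


version = installed_version()
print("scenarioselector version:", version if version else "not installed")
```

The check reads the installed package metadata rather than importing the package, so it does not run any of the package's own code.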
Demonstrations
The easiest way to understand the scenario selection algorithm is to read through, download [2] and run the two accompanying Jupyter Notebook demos, presented with the Jupyter NBViewer.
- Demo 1: Selecting Scenarios from a Monte Carlo Simulation;
- Demo 2: Selecting Scenarios of Bivariate Data.
Basic Usage
The following three steps outline basic usage of the ScenarioSelector class, which constructs instances of scenario selection problems and applies the selection algorithm.
1. Instantiate ScenarioSelector
Construct an object which defines the scenario selection problem you want to solve.
from scenarioselector import ScenarioSelector
selector = ScenarioSelector(data, weights=1, means=0, sums=0)
Variable | Allowable Types | Shape | Default Value | Description |
---|---|---|---|---|
`data` | List of lists, NumPy array, or pandas DataFrame. | (N, D) | Required parameter. | Scenario set with N scenarios and D variables. |
`weights` | Scalar, list or NumPy array. | (N,) | Unit weight for each scenario. | Strictly positive weights for each of the N scenarios. |
`means` | Scalar, list or NumPy array. | (D,) | Zero mean for each variable. | Target means for the D variables. |
`sums` | Scalar, list or NumPy array. | (D,) | Zero sum for each variable. | Target sums for the D variables. |
Note: Non-zero target values may be specified for either `means` or `sums`, but not both.
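The documented shapes can be illustrated with NumPy alone. The following sketch shows one way scalar arguments could be expanded to shapes (N,) and (D,), and how the constraints on the weights and targets could be checked. `expand_inputs` is a hypothetical helper written for this illustration; the package's own input handling may differ.

```python
import numpy as np


def expand_inputs(data, weights=1, means=0, sums=0):
    """Hypothetical helper: expand scalar arguments to the documented shapes."""
    data = np.asarray(data, dtype=float)
    if data.ndim != 2:
        raise ValueError("data must have shape (N, D)")
    n, d = data.shape
    # Scalars broadcast to full-length vectors; lists must already match.
    weights = np.broadcast_to(np.asarray(weights, dtype=float), (n,)).copy()
    means = np.broadcast_to(np.asarray(means, dtype=float), (d,)).copy()
    sums = np.broadcast_to(np.asarray(sums, dtype=float), (d,)).copy()
    if np.any(weights <= 0):
        raise ValueError("weights must be strictly positive")
    if np.any(means != 0) and np.any(sums != 0):
        raise ValueError("specify non-zero means or non-zero sums, not both")
    return data, weights, means, sums


data, weights, means, sums = expand_inputs(
    [[0.8, -3.2], [3.0, 2.9], [3.0, 2.5]], means=[0.5, 0.0]
)
print(data.shape, weights.shape, means.shape, sums.shape)
```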
2. Run the Scenario Selection Algorithm
Call the ScenarioSelector's optimize method to run the scenario selection algorithm.
selector.optimize(callback=None, pivot_rule=None)
Note: Calling `selector.optimize()` without parameters runs the algorithm with default parameters.
3. View Results
Results of the optimization can be inspected as follows [3].
- `selector.selected` is a NumPy array of Booleans which indicates which scenarios have been selected. If the input variable `data` is a NumPy array, then you can use NumPy's Boolean indexing to obtain the selected scenario set as `selected_data = data[selector.selected]`, and the associated weights as `selected_weights = selector.weights[selector.selected]`.
  - If you have specified target `means`, the weighted means of the reduced scenario set will be close to your specified target. You can verify this by calculating `numpy.average(selected_data, weights=selected_weights, axis=0)`. If the original scenario set is equally weighted then you do not need to specify the selected weights.
  - If you have specified target `sums`, the weighted sums of the reduced scenario set will be close to your specified target. You can verify this by calculating `numpy.dot(selected_weights, selected_data)`. If each scenario has unit weight then you can get the same result by calculating `numpy.sum(selected_data, axis=0)`.
- `selector.reduced_weights` is a NumPy array of reduced weights associated with each scenario. You can verify the algorithm has hit the `sums` target precisely by calculating `numpy.dot(selector.reduced_weights, data)`.
- `selector.probabilities` is a NumPy array of probabilities associated with each scenario. You can verify the algorithm has hit the `means` target precisely by calculating `numpy.dot(selector.probabilities, data)`.
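The verification calculations listed above can be tried on a small hand-made example without running the optimizer. In this sketch `data`, `weights` and `selected` are made-up values standing in for the input data, `selector.weights` and `selector.selected`; they are not output from the package.

```python
import numpy as np

# Made-up scenario set with N = 3 scenarios and D = 2 variables.
data = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
weights = np.array([0.5, 0.25, 0.25])
selected = np.array([True, False, True])  # stand-in for selector.selected

# Boolean indexing keeps only the rows where the mask is True.
selected_data = data[selected]
selected_weights = weights[selected]

weighted_means = np.average(selected_data, weights=selected_weights, axis=0)
weighted_sums = np.dot(selected_weights, selected_data)
print("Weighted means:", weighted_means)
print("Weighted sums:", weighted_sums)
```

Note that `numpy.average` normalizes by the total selected weight, whereas `numpy.dot` does not, which is why the two results differ unless the selected weights sum to one.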
Example of Basic Usage
The following is an example of basic usage with N = 5 and D = 2.
Consider a finite discrete probability space, (Ω, P), where Ω := {ω₁, ω₂, ω₃, ω₄, ω₅} and the probabilities of the outcomes are p₁ = P(ω₁) = 0.15, p₂ = P(ω₂) = 0.25, p₃ = P(ω₃) = 0.2, p₄ = P(ω₄) = 0.25 and p₅ = P(ω₅) = 0.15.
Consider an ℝ²-valued random variable X with five realizations X(ω₁) = (0.8, -3.2), X(ω₂) = (3.0, 2.9), X(ω₃) = (3.0, 2.5), X(ω₄) = (-0.8, -1.0) and X(ω₅) = (0.8, -2.0).
Suppose we want to select a maximal subset of the five scenarios, so that the weighted sum of the outcomes X(ωₙ) of the selected scenarios is equal to (1.1, 1.0). More precisely, we want to find reduced weights 0 ≤ qₙ ≤ pₙ which maximize Σₙ qₙ, subject to the constraint Σₙ qₙ X(ωₙ) = (1.1, 1.0).
We define an array of shape (5, 2) which holds the scenario set data.
from scenarioselector import ScenarioSelector
import numpy as np
data = np.array([[0.8, -3.2], [3.0, 2.9], [3.0, 2.5], [-0.8, -1.0], [0.8, -2.0]])
weights = [0.15, 0.25, 0.2, 0.25, 0.15]
sums = [1.1, 1.0]
selector = ScenarioSelector(data, weights=weights, sums=sums)
print()
print("Before optimization")
print("-------------------")
print(sum(selector.selected), "scenarios selected: ", selector.selected)
print("Exact sums:", np.dot(selector.reduced_weights, data))
print("Approx sums:", np.dot(selector.weights[selector.selected], data[selector.selected]))
selector.optimize()
print()
print("After optimization")
print("------------------")
print(sum(selector.selected), "scenarios selected: ", selector.selected)
print("Exact sums:", np.dot(selector.reduced_weights, data))
print("Approx sums:", np.dot(selector.weights[selector.selected], data[selector.selected]))
Note: Python uses zero-based array indices so, for example, `data[1]` evaluates to `[3.0, 2.9]`.
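The constraints of this example can also be checked by hand for any candidate vector of reduced weights. The sketch below copies `data`, `weights` and the target sums from the listing above; `is_feasible` and the candidate vector are assumptions made for this illustration. The candidate satisfies the constraints, but it is not claimed to be the maximizer that the algorithm returns.

```python
import numpy as np

data = np.array([[0.8, -3.2], [3.0, 2.9], [3.0, 2.5], [-0.8, -1.0], [0.8, -2.0]])
weights = np.array([0.15, 0.25, 0.2, 0.25, 0.15])
target_sums = np.array([1.1, 1.0])


def is_feasible(q, tol=1e-9):
    """Check 0 <= q_n <= p_n and sum_n q_n X(w_n) == target, up to a tolerance."""
    q = np.asarray(q, dtype=float)
    within_bounds = np.all(q >= -tol) and np.all(q <= weights + tol)
    hits_target = np.allclose(np.dot(q, data), target_sums, atol=tol)
    return bool(within_bounds and hits_target)


# Hand-picked candidate: feasible, but not claimed to maximize the total weight.
candidate = [0.0, 0.25, 0.13, 0.05, 0.0]
print("Candidate feasible:", is_feasible(candidate))
# The full weights stay within bounds but do not hit the target sums.
print("Full weights feasible:", is_feasible(weights))
```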
Advanced Usage
ScenarioSelector Properties
A ScenarioSelector object has the following properties, which can be queried at any stage of the optimization.
Property | Type | Shape | Description |
---|---|---|---|
`selected` | NumPy array | (N,) | Booleans, indicating which scenarios are selected. |
`reduced_weights` | NumPy array | (N,) | Reduced weights associated with each scenario. |
`probabilities` | NumPy array | (N,) | Probabilities associated with each scenario. |
`lagrange_multiplier` | NumPy array | (D,) | Lagrange multiplier for the dual problem. |
`tableau` | pandas DataFrame | | Condensed tableau for the simplex algorithm. |
`pivot_count` | int | | Number of pivot operations used to get to the current state. |
Callback Function
The scenario selector's optimize method can be parameterized with a bespoke callback function. For example,
tableaus = []
def callback(selector, i, element):
    print("Iteration {} pivots on element {}.".format(i, element))
    tableaus.append(selector.tableau)
To keep track of the optimization progress, call the ScenarioSelector's optimize method with the callback function as a parameter.
selector.optimize(callback=callback)
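A callback can also record lightweight progress information instead of whole tableaus. The following sketch exercises such a callback against a stand-in object, so it runs without the package. The `fake_selector` object and the tuples passed as `element` are made up for this illustration, since the form of `element` is not specified here; only the `(selector, i, element)` signature and the `pivot_count` property are taken from this document.

```python
from types import SimpleNamespace

history = []


def recording_callback(selector, i, element):
    """Record the iteration number, pivot element and current pivot count."""
    history.append(
        {"iteration": i, "element": element, "pivot_count": selector.pivot_count}
    )


# Stand-in object used only to exercise the callback without the package.
fake_selector = SimpleNamespace(pivot_count=0)
for i, element in enumerate([(2, 0), (4, 1)], start=1):
    fake_selector.pivot_count = i
    recording_callback(fake_selector, i, element)

print(history)
```

With the real package, the same function would be passed as `selector.optimize(callback=recording_callback)`.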
Pivot Rule
A pivot rule determines which variable and scenario(s) to use for pivot and flip operations in the modified simplex algorithm.
from scenarioselector.pivot_rule import PivotRule, PivotRuleSlowed
from scenarioselector.pivot_variable import (
    Dantzig, DantzigTwoPhase,
    MaxObjectiveImprovement, MaxObjectiveImprovementTwoPhase,
)
from scenarioselector.pivot_scenarios import pivot_scenarios, barrodale_roberts
pivot_rule = PivotRule(pivot_variable=DantzigTwoPhase, pivot_scenarios=barrodale_roberts)
selector.optimize(pivot_rule=pivot_rule)
The choices of pivot variable and pivot scenario(s) are discussed in the next two subsections.
Note: The derived pivot rule `PivotRuleSlowed` is designed specifically for use with the Barrodale Roberts improvement. This rule slows down the effect of passing through each vertex in succession, and is included only for demonstration purposes.
Pivot Variable
A `pivot_variable` rule determines which variable to use for the next pivot operation. Pre-defined `pivot_variable` rules can be summarised as follows.
Rule | Description |
---|---|
`Dantzig` | Choose the variable whose corresponding entry in the basement row of the condensed tableau has the largest magnitude. |
`DantzigTwoPhase` | Similar to `Dantzig`, but the first D operations move all the Lagrange multiplier variables into the basis. |
`MaxObjectiveImprovement` | Choose the variable such that a classical pivot operation will lead to the largest improvement in the objective value. |
`MaxObjectiveImprovementTwoPhase` | Similar to `MaxObjectiveImprovement`, but the first D operations move all the Lagrange multiplier variables into the basis. |
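The idea behind the Dantzig choice, picking the entry with the largest magnitude, can be sketched in a few lines of plain Python. This is an illustration of the selection criterion only, not the package's implementation: `dantzig_choice` and the example row are hypothetical, and the real rule may restrict which entries are eligible.

```python
def dantzig_choice(basement_row):
    """Return the index of the entry with the largest magnitude."""
    if not basement_row:
        raise ValueError("basement_row must not be empty")
    # max() returns the first index on ties, so the lowest index wins.
    return max(range(len(basement_row)), key=lambda j: abs(basement_row[j]))


row = [0.2, -1.5, 0.7, 1.1]
print(dantzig_choice(row))  # index 1, since |-1.5| is the largest magnitude
```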
Pivot Scenarios
A `pivot_scenarios` rule determines which scenario(s) to use for the next pivot and associated flip operations. The Barrodale Roberts improvement allows the modified simplex algorithm to pass through multiple vertices at once, allowing the algorithm to flip an array of selection states in a single operation.
Footnotes
1. A scenario set is a set of (possibly weighted) observations of multivariate data.
2. The example notebooks are located in a separate project which is also hosted on GitHub.
3. This section assumes you have imported NumPy with the statement `import numpy`.