Common optimization interfaces for RL/num. optimization
Project description
Common Optimization Interfaces
CernML is the project of bringing numerical optimization, machine learning and reinforcement learning to the operation of the CERN accelerator complex.
CernML-COI defines common interfaces that facilitate using numerical optimization and reinforcement learning (RL) on the same optimization problems. This makes it possible to unify both approaches into a generic optimization application in the CERN Control Center.
This repository can be found online on CERN's Gitlab.
Table of Contents
[[TOC]]
Motivation
Several problems in accelerator control can be solved both using reinforcement learning (RL) and numerical optimization. However, both approaches usually slightly differ in their expected input and output:
- Optimizers pick certain points in the phase space of modifiable parameters and evaluate the loss of these parameters. They minimize this loss through multiple evaluations and ultimately yield the optimal parameters.
- RL agents assume that the problem has a certain state, which usually contains the values of all modifiable parameters. They receive an observation (which is usually higher-dimensional than the loss) and calculate a correction of the parameters. This correction yields a certain reward to them. Their goal is to optimize the parameters incrementally by optimzing their corrections for maximal cumulative reward.
More informally, optimizers start from scratch each time they are applied and they yield a point in phase space. RL agents learn once, can be applied many times, and they yield a sequence of deltas in the phase space.
Even more informally, on a given machine, an optimizer performs the state
transition machine.parameters = new_parameters, whereas an RL agent performs
the state transition machine.parameters += corrections iteratively.
This package provides interfaces to implement for problems that should be
compatible both with numerical optimizers and RL agents. It is based on the
Gym environment API and enhances it with the SingleOptimizable interface.
In addition, the output and metadata of the environments is restricted to make the behavior of environments more uniform and compatible to make them more easily visualizable and integrable into a generic machine-optimization application.
Quickstart
Start a Python project. In your setup.cfg or setup.py, add dependencies on
Gymnasium and the COI. Make sure to pick a COI version that is supported by the
application that will optimize your problem.
# setup.cfg
[options]
install_requires =
gymnasium >= 0.29
cernml-coi >= 0.9.0
Then, write a class that implements one or multiple of the optimization interfaces. Finally register it so that an application that imports your package may find it. (See the Parabola example for a more fully featured version of the code below.)
# my_project/__init__.py
import gymnasium as gym
import numpy as np
from cernml import coi
class Parabola(coi.SingleOptimizable, gym.Env):
observation_space = gym.spaces.Box(-2.0, 2.0, shape=(2,))
action_space = gym.spaces.Box(-1.0, 1.0, shape=(2,))
optimization_space = gym.spaces.Box(-2.0, 2.0, shape=(2,))
metadata = {
"render_modes": [],
"cern.machine": coi.Machine.NO_MACHINE,
}
def __init__(self, render_mode=None):
self.render_mode = render_mode
self.pos = np.zeros(2)
self._train = True
def reset(self, *, seed=None, options=None):
super.reset(seed=seed, options=options)
self.pos = self.action_space.sample()
return self.pos.copy()
def step(self, action):
next_pos = self.pos + action
ob_space = self.observation_space
self.pos = np.clip(next_pos, ob_space.low, ob_space.high)
reward = -sum(self.pos ** 2)
terminated = (reward > -0.05) or next_pos not in ob_space
truncated = False
return self.pos.copy(), reward, terminated, truncated, {}
def get_initial_params(self, *, seed=None, options=None):
return self.reset(seed=seed, options=options)
def compute_single_objective(self, params):
ob_space = self.observation_space
self.pos = np.clip(params, ob_space.low, ob_space.high)
loss = sum(self.pos ** 2)
return loss
coi.register("Parabola-v0", entry_point=Parabola, max_episode_steps=10)
Any host application may then import your package and instantiate your optimization problem.
import my_project
from cernml import coi
problem = coi.make("Parabola-v0")
optimize_in_some_way(problem)
Documentation
Inside the CERN network, you can read the package documentation on the Acc-Py documentation server. The same documentation is available outside of CERN via CERN's Gitlab Pages service, though some cross-links to CERN-internal projects may not work. Finally, API documentation is provided through extensive Python docstrings.
Changelog
Stability
This package uses a variant of Semantic Versioning that makes additional promises during the initial development (major version 0): whenever breaking changes to the public API are published, the first non-zero version number will increase. This means that code that uses COI version 0.9.0 will continue to work with version 0.9.1, but may break with version 0.10.0.
License
Except as otherwise noted, this work is licensed under either of GNU Public License, Version 3.0 or later, or European Union Public License, Version 1.2 or later, at your option. See COPYING for details.
Unless You explicitly state otherwise, any contribution intentionally submitted by You for inclusion in this Work (the Covered Work) shall be dual-licensed as above, without any additional terms or conditions.
For full authorship information, see the version control history.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distributions
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file cernml_coi-0.9.6-py3-none-any.whl.
File metadata
- Download URL: cernml_coi-0.9.6-py3-none-any.whl
- Upload date:
- Size: 118.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.4
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
e72e6b1958fdf591528ef12b4716c9d64b464c73c10b21780da776c91c09f94a
|
|
| MD5 |
ed96fbe066e0a532470c2bbe099e8744
|
|
| BLAKE2b-256 |
cefd31ce1e52bf9614adfb5b1628c682f6dae3e6ce5decd886fb0d98a35522b6
|