A library that makes Evolutionary Strategies (ES) simple to use.

EvoStrat

Pseudo-code

import torch

pop = PopulationImpl(...)  # See the complete examples for implementations.
optim = torch.optim.Adam(pop.parameters())  # Any torch.optim optimizer works
for i in range(N):
    optim.zero_grad()  # Clear gradients from the previous iteration
    pop.fitness_grads(n_samples=200)  # Computes approximate gradients
    optim.step()  # Update the population parameters

For complete examples that solve 'LunarLander-v2', see examples/normal_lunar_lander.py and examples/binary_lunar_lander.py.

[Image: Lunar lander]

Description

Evolutionary Strategies (ES) is a powerful approach to solving reinforcement learning problems and other optimization problems where gradients cannot be computed with backpropagation. See "Evolution Strategies as a Scalable Alternative to Reinforcement Learning" for an excellent introduction.

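In ES the objective is to maximize the expected fitness of a distribution of individuals. With a few math tricks, chiefly the log-derivative trick (the score-function estimator), the gradient of this expectation with respect to the distribution's parameters can be estimated from sampled individuals and their fitness values. The objective can therefore be maximized with gradient ascent, even if the fitness function itself is not differentiable.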
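
As a rough illustration of the idea above, the following sketch estimates the gradient of expected fitness for a Normal distribution with a fixed standard deviation, using plain PyTorch. It does not use EvoStrat's classes. The names `fitness`, `es_gradients`, and `TARGET`, and all constants, are hypothetical choices made for this example, and the fitness normalization is a common variance-reduction step assumed here rather than something the text prescribes.

```python
import torch

TARGET = 3.0  # hypothetical optimum for the toy problem


def fitness(x: torch.Tensor) -> float:
    # Piecewise-constant toy fitness: because of the rounding, backprop
    # would see a zero gradient almost everywhere.
    return -(x.round() - TARGET).abs().sum().item()


def es_gradients(mu: torch.Tensor, sigma: float, n_samples: int) -> None:
    """Accumulate an approximate gradient of expected fitness into mu.grad."""
    dist = torch.distributions.Normal(mu, sigma)
    samples = dist.sample((n_samples,))  # sample() does not track gradients
    fit = torch.tensor([fitness(s) for s in samples])
    # Normalize fitness (assumed variance-reduction step).
    fit = (fit - fit.mean()) / (fit.std() + 1e-8)
    # Log-derivative trick: weight each sample's log-probability by its fitness.
    log_prob = dist.log_prob(samples).sum(dim=-1)
    loss = -(fit * log_prob).mean()  # minimizing this ascends expected fitness
    loss.backward()


torch.manual_seed(0)
mu = torch.zeros(2, requires_grad=True)
optim = torch.optim.Adam([mu], lr=0.05)
for _ in range(200):
    optim.zero_grad()
    es_gradients(mu, sigma=0.5, n_samples=100)
    optim.step()

mu_final = mu.detach()
print(mu_final)  # expected to end up near TARGET in both coordinates
```

The only differentiable quantity in the sketch is the log-probability of the samples; the fitness values enter as plain numbers, which is why the fitness function never needs a gradient.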

This library offers:

  1. A plug-and-play implementation of ES for PyTorch reinforcement learning agents with torch.nn.Module policy networks that cleanly separates the agent and policy network from the optimization. See examples/normal_lunar_lander.py.
  2. A simple and flexible interface for extending ES beyond the standard Normal distribution without having to derive any gradients by hand. Just subclass Population and implement the sampling process. See categorical_population.py and examples/binary_lunar_lander.py.
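
To give a feel for why no gradients need to be derived by hand, here is a minimal sketch that swaps the Normal distribution for a Bernoulli distribution over binary parameters and lets autograd differentiate the log-probability. It is written against `torch.distributions` only and is not the library's `Population` interface. `fitness`, `score_function_grads`, and `TARGET_BITS` are hypothetical names introduced for illustration.

```python
import torch

# Hypothetical hidden bit pattern that the population should discover.
TARGET_BITS = torch.tensor([1., 0., 1., 1., 0., 0., 1., 0.])


def fitness(bits: torch.Tensor) -> float:
    # Number of matching bits; only defined for discrete inputs.
    return float((bits == TARGET_BITS).sum())


def score_function_grads(dist, n_samples: int) -> None:
    """Accumulate approximate fitness gradients into the parameters of dist."""
    samples = dist.sample((n_samples,))
    fit = torch.tensor([fitness(s) for s in samples])
    fit = (fit - fit.mean()) / (fit.std() + 1e-8)  # assumed normalization
    # Autograd differentiates log_prob for whichever distribution is passed in.
    log_prob = dist.log_prob(samples).sum(dim=-1)
    (-(fit * log_prob).mean()).backward()


torch.manual_seed(0)
logits = torch.zeros(8, requires_grad=True)
optim = torch.optim.Adam([logits], lr=0.1)
for _ in range(200):
    optim.zero_grad()
    dist = torch.distributions.Bernoulli(logits=logits)
    score_function_grads(dist, n_samples=100)
    optim.step()

learned_bits = (logits.detach() > 0).float()
print(learned_bits)
```

Only the line that constructs `dist` is specific to the Bernoulli case; the gradient estimate itself is unchanged, which is the property that makes changing the sampling distribution cheap.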
