
Evolutionary Algorithms in Python


MEvo



MEvo is an evolutionary algorithm package that implements Genetic Algorithm and Evolution Strategy algorithms. It can be used to evolve an agent's policy or to optimize mathematical problems.

Quick look

Training a CartPole policy using the Genetic Algorithm.

import mevo
import gymnasium
import numpy as np


# define a fitness function to get fitness for every individual
def fitness_fn(ind: mevo.individuals.Individual, conf: dict) -> float:
    ep_r = 0
    env = gymnasium.make('CartPole-v1')
    env.reset()
    for _ in range(2):
        s, _ = env.reset()
        for _ in range(500):  # in one episode
            logits = ind.predict(s)
            a = np.argmax(logits)
            s, r, done, _, _ = env.step(a)
            ep_r += r
            if done:
                break
    return ep_r


# training
with mevo.GeneticAlgoNet(max_size=20, layer_size=[4, 8, 2], drop_rate=0.7, mutate_rate=0.5) as pop:
    for generation in range(40):
        pop.evolve(fitness_fn=fitness_fn)
        print(f"generation={generation}, top_fitness={pop.top.fitness:.f2}")

After only 40 generations with a population of 20 individuals, it gives a great result. Use the following code to visualize the learned policy.

# deploy the best individual
env = gymnasium.make('CartPole-v1', render_mode="human")
while True:
    s, _ = env.reset()
    while True:  # in one episode
        logits = pop.top.predict(s)
        a = np.argmax(logits)
        s, _, done, _, _ = env.step(a)
        if done:
            break

(figure: CartPole demo)

What is MEvo

In MEvo, the smallest data unit is a Gene. A set of genes is packed into a Chromosome, chromosomes form an Individual, and a Population consists of many individuals.

(figure: Gene -> Chromosome -> Individual -> Population hierarchy)

This differs from the classical Genetic Algorithm, whose chromosome is a single 1D array. In MEvo, a chromosome is an n-D array, and different chromosomes can have different shapes.

(figure: chromosomes as n-D arrays with different shapes)

Why allow varied data shapes?

Because we want to be able to evolve a neural network whose layers have various shapes.
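
For intuition, here is a minimal NumPy sketch (plain Python, not MEvo code) of why a layered network naturally needs parameter arrays of different shapes; a [4, 8, 2] network like the one in the Quick look example carries one weight matrix and one bias vector per layer, and each of these arrays would become one differently shaped chromosome:

import numpy as np

# parameter shapes for a [4, 8, 2] fully connected network
layer_size = [4, 8, 2]
params = []
for n_in, n_out in zip(layer_size[:-1], layer_size[1:]):
    w = np.random.normal(size=(n_in, n_out))  # weight matrix of this layer
    b = np.zeros(n_out)                       # bias vector of this layer
    params.extend([w, b])

for p in params:
    print(p.shape)  # (4, 8), (8,), (8, 2), (2,) -- four arrays, four different shapes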

MEvo has two different algorithm families.

  • Genetic Algorithm family
  • Evolution Strategy family

The Genetic Algorithm basically has two steps:

  1. crossover the chromosomes in a population
  2. mutate genes in the new generation

The following image shows how crossover works on two parent chromosomes, and how the new chromosome is mutated.

(figure: crossover and mutation)
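
As a rough illustration of these two steps (plain NumPy, not MEvo's internal implementation), single-point crossover of two parent chromosomes followed by mutation of the child could look like this:

import numpy as np

rng = np.random.default_rng(0)
p1 = rng.integers(0, 2, size=10)  # parent chromosome 1
p2 = rng.integers(0, 2, size=10)  # parent chromosome 2

# 1. crossover: splice the two parents at a random point
point = rng.integers(1, len(p1))
child = np.concatenate([p1[:point], p2[point:]])

# 2. mutation: flip each gene of the child with a small probability
mutate_rate = 0.1
flip = rng.random(len(child)) < mutate_rate
child[flip] = 1 - child[flip]
print(child)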

MEvo's Evolution Strategy has two different steps:

  1. mutate genes (with normal-distributed noise) from one single chromosome
  2. update this chromosome using all of the mutated results

(figure: Evolution Strategy update)
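
A rough sketch of this idea (plain NumPy, not MEvo's exact update rule): sample normal-distributed mutations of one parameter vector, score every mutant with the fitness function, then move the vector toward the fitness-weighted mean of the sampled noise:

import numpy as np

rng = np.random.default_rng(0)
theta = np.zeros(5)  # the single chromosome being evolved
n_children, sigma, lr = 20, 0.05, 0.1


def fitness(x):
    # toy objective: higher is better, maximum at x == 1
    return -np.sum((x - 1.0) ** 2)


for _ in range(200):
    # 1. mutate: sample normal-distributed perturbations of the chromosome
    noise = rng.normal(size=(n_children, theta.size))
    scores = np.array([fitness(theta + sigma * n) for n in noise])
    # 2. update: move the chromosome toward the fitness-weighted noise
    weights = (scores - scores.mean()) / (scores.std() + 1e-8)
    theta += lr / (n_children * sigma) * noise.T.dot(weights)

print(theta)  # approaches [1, 1, 1, 1, 1]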

Parallel training

MEvo supports parallel training. Simply set n_worker > 1 to train on multiple cores; when n_worker=-1, all available cores are used.

Simply replace the pop definition from the previous code with one that includes n_worker.

# parallel training
if __name__ == "__main__":
    with mevo.GeneticAlgoNet(max_size=20, layer_size=[4, 8, 2], n_worker=-1, drop_rate=0.7, mutate_rate=0.5) as pop:
        for generation in range(40):
            pop.evolve(fitness_fn=fitness_fn)
            print(f"generation={generation}, top_fitness={pop.top.fitness:.2f}")

Note that the parallel code must run inside an if __name__ == "__main__": block, otherwise a Python multiprocessing error will occur.

Install

Use pip to install MEvo:

pip3 install mevo

Methods and Parameters

MEvo supports the following populations:

  • mevo.GeneticAlgoInt()
  • mevo.GeneticAlgoFloat()
  • mevo.GeneticAlgoOrder()
  • mevo.GeneticAlgoNet()
  • mevo.EvolutionStrategyNet()

A classical genetic algorithm problem, such as the Travelling Salesman Problem (TSP), can be solved by the following classical populations:

  • mevo.GeneticAlgoInt()
  • mevo.GeneticAlgoFloat()
  • mevo.GeneticAlgoOrder()

A Travelling Salesman Problem example using pop = mevo.GeneticAlgoOrder():

import mevo
import numpy as np

positions = [np.random.rand(2) for _ in range(20)]


def distance_fitness_fn(ind: mevo.individuals.Individual, conf: dict) -> float:
    # decode the city visiting order from the individual's chromosomes
    order = [c.data[0] for c in ind.chromosomes]
    # accumulate the (squared) travel distance along that order
    cost = 0
    for i in range(len(order) - 1):
        p1, p2 = positions[order[i]], positions[order[i + 1]]
        cost += np.square(p1 - p2).sum()
    # a shorter route means a higher fitness
    fitness = -cost
    return fitness


pop = mevo.GeneticAlgoOrder(
    max_size=50,
    chromo_size=len(positions),
    drop_rate=0.3,
    mutate_rate=0.01,
)
pop.run(fitness_fn=distance_fitness_fn, step=30)

An optimization problem example using pop = mevo.GeneticAlgoInt():

import mevo
import numpy as np


def wave_fitness_fn(ind: mevo.individuals.Individual, conf: dict) -> float:
    # concatenate the binary genes of all chromosomes into one bit string
    binary = [c.data for c in ind.chromosomes]
    c = np.concatenate(binary, axis=0)
    # interpret the bit string as a decimal integer
    a = 2 ** np.arange(len(c))[::-1]
    decimal = c.dot(a)

    # map the integer to x in [0, 5] and evaluate the target wave function
    x = decimal / float(2 ** len(c) - 1) * 5
    o = np.sin(10 * x) * x + np.cos(2 * x) * x
    return o


pop = mevo.GeneticAlgoInt(
    max_size=20,
    chromo_size=10,
    drop_rate=0.3,
    chromo_initializer=mevo.chromosomes.initializers.RandomInt(0, 2),
    mutate_rate=0.01,
)
pop.run(step=20, fitness_fn=wave_fitness_fn)

Deep Net Evolution

For policy learning, or as an alternative to Reinforcement Learning, the following two methods have their own advantages.

  • mevo.GeneticAlgoNet()
  • mevo.EvolutionStrategyNet()

|  | Reinforcement Learning | MEvo |
| --- | --- | --- |
| Training | Forward and backward propagation | Forward propagation only, plus crossover or mutation operations (lighter than backpropagation) |
| Exploration | Needs a carefully tuned exploration policy | The varied children automatically provide exploration |
| Memory needs | Keeps only one set of parameters | Must compute all children's parameters in each generation (parallel computing saves time) |
| Network size | Generally a large, deep net | With large-scale exploration, a relatively small net can perform well |

A mevo.GeneticAlgoNet() example:

import mevo
import gymnasium
import numpy as np


def fitness_fn(ind: mevo.individuals.Individual, conf: dict) -> float:
    ep_r = 0
    env = gymnasium.make('Pendulum-v1')
    env.reset(seed=conf["seed"])
    for _ in range(2):
        s, _ = env.reset()
        for _ in range(150):  # in one episode
            logits = ind.predict(s)
            a = np.tanh(logits) * 2
            s, r, _, _, _ = env.step(a)
            ep_r += r
    return ep_r


def train():
    with mevo.GeneticAlgoNet(
            max_size=30,
            layer_size=[3, 32, 1],
            drop_rate=0.7,
            mutate_rate=0.5,
            n_worker=-1,
    ) as pop:
        for ep in range(700):
            pop.evolve(
                fitness_fn=fitness_fn,
            )
            print(ep, pop.top.fitness)
    return pop.top


def show(top):
    env = gymnasium.make('Pendulum-v1', render_mode="human")
    while True:
        s, _ = env.reset()
        for _ in range(200):  # in one episode
            logits = top.predict(s)
            a = np.tanh(logits) * 2
            s, _, _, _, _ = env.step(a)


if __name__ == "__main__":
    top = train()
    show(top)

A mevo.EvolutionStrategyNet() example:

import mevo
import gymnasium
import numpy as np


def fitness_fn(ind: mevo.individuals.EvolutionStrategyDense, conf: dict) -> float:
    ep_r = 0
    seed = conf["seed"]
    index = conf["index"]

    env = gymnasium.make('Pendulum-v1')

    # ! must set seed and clone when using mevo.EvolutionStrategyNet()
    env.reset(seed=conf["seed"])
    c_ind = ind.clone_with_mutate(index, seed)
    # ###############

    for _ in range(2):
        s, _ = env.reset()
        for _ in range(100):
            logits = c_ind.predict(s)
            a = np.tanh(logits) * 2
            s, r, _, _, _ = env.step(a)
            ep_r += r
    return ep_r


def train():
    with mevo.EvolutionStrategyNet(
            max_size=15,
            layer_size=[3, 32, 1],
            mutate_strength=0.05,
            learning_rate=0.1,
            n_worker=-1,
            seed=2
    ) as pop:
        for ep in range(700):
            pop.evolve(fitness_fn=fitness_fn)
            print(ep, pop.top.fitness)
    return pop.top


def show(top):
    env = gymnasium.make(
        'Pendulum-v1',
        render_mode="human"
    )
    while True:
        s, _ = env.reset()
        for _ in range(200):  # in one episode
            logits = top.predict(s)
            a = np.tanh(logits) * 2
            s, _, _, _, _ = env.step(a)


if __name__ == "__main__":
    top = train()
    show(top)
