A small library to make reinforcement learning easier

Project description

DeepQ

A reinforcement learning library in Python.

This is a basic reinforcement learning library that works with gym and tensorflow. In reinforcement learning, the program is rewarded when it does the correct thing and punished when it does the wrong thing.
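
To make the reward and punishment idea concrete, here is a minimal sketch of a tabular Q-learning update. It is an illustration only and not DeepQ's own code; the function name q_update, the state labels "s0" and "s1", and the default lr value are assumptions chosen for this example.

```python
# Illustrative tabular Q-learning update (not taken from the DeepQ source).
from collections import defaultdict


def q_update(q, state, action, reward, next_state, n_actions, lr=0.1, gamma=0.99):
    """Move q[(state, action)] toward reward + gamma * best value of next_state."""
    best_next = max(q[(next_state, a)] for a in range(n_actions))
    target = reward + gamma * best_next
    q[(state, action)] += lr * (target - q[(state, action)])
    return q[(state, action)]


q_table = defaultdict(float)  # every unseen (state, action) pair starts at 0.0

# A positive reward raises the value of the action; a negative one lowers it.
rewarded = q_update(q_table, "s0", 0, reward=1.0, next_state="s1", n_actions=2)
punished = q_update(q_table, "s0", 1, reward=-1.0, next_state="s1", n_actions=2)
print(rewarded, punished)
```

After these two updates, action 0 in state "s0" has a higher estimated value than action 1, so a greedy policy would prefer it.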

To install with pip, run pip install DeepQ. The other dependencies you need are tensorflow and numpy. The example below also imports gym and matplotlib.
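
A quick way to confirm the dependencies are present before running anything is to ask Python whether each package can be found. This is a small sketch using only the standard library; the helper name missing_packages is hypothetical, and the package list is taken from the imports in the example script.

```python
from importlib.util import find_spec


def missing_packages(names):
    """Return the names that cannot be found on the current import path."""
    return [name for name in names if find_spec(name) is None]


# Import names used by the example script.
required = ["DeepQ", "tensorflow", "numpy", "gym", "matplotlib"]
missing = missing_packages(required)
print("Missing packages:", missing if missing else "none")
```

Anything reported as missing can then be installed with pip.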

Example

The example.py script uses gym (a library of environments for training AI agents). This example uses the 'LunarLander-v2' environment, which simulates landing a spacecraft on the moon. Control of the spacecraft is handed to DeepQ's Agent, which then learns to land it:

from DeepQ import Agent
import numpy as np
import gym
import tensorflow as tf
import matplotlib.pyplot as plt

if __name__ == '__main__':
    tf.compat.v1.disable_eager_execution()
    env = gym.make('LunarLander-v2')  # loads the lunar lander trainer from gym
    learning_rate = 0.001
    n_games = 500  # number of games (episodes) to loop through
    agent = Agent(gamma=0.99, epsilon=1.0, lr=learning_rate,
                  input_dims=env.observation_space.shape, n_actions=env.action_space.n, mem_size=1000000,
                  batch_size=64, epsilon_end=0.01)
    scores = []
    epsilon_history = []

    for i in range(n_games):
        done = False
        score = 0
        observation = env.reset()
        while not done:
            action = agent.choose_action(observation)
            observation_, reward, done, info = env.step(action)
            score += reward
            agent.store_transition(observation, action, reward, observation_, done)
            observation = observation_
            agent.learn()
        epsilon_history.append(agent.epsilon)
        scores.append(score)
        avg_score = np.mean(scores[-100:])
        print("Episode:", i, " Score %.2f" % score,
              "Average score %.2f" % avg_score,
              "epsilon %.2f" % agent.epsilon)

    plt.plot(scores)
    plt.title("A graph of the score increase over each episode of learning 'LunarLander-v2'")
    plt.ylabel("score")
    plt.xlabel("Episode")
    plt.show()  # plots the scores
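
The epsilon and epsilon_end arguments passed to Agent above suggest epsilon-greedy exploration: the agent starts out acting mostly at random and gradually shifts to the action it currently rates best. The sketch below shows that general technique in plain Python. It is an assumption about the idea, not DeepQ's actual implementation; the linear schedule and the epsilon_dec parameter are hypothetical.

```python
import random


def choose_action(q_values, epsilon, rng=random):
    """Pick a random action with probability epsilon, else the best-valued one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def decay_epsilon(epsilon, epsilon_dec=1e-3, epsilon_end=0.01):
    """Reduce epsilon linearly, never below epsilon_end (assumed schedule)."""
    return max(epsilon - epsilon_dec, epsilon_end)


epsilon = 1.0  # fully random at the start, as in the example above
for _ in range(2000):
    epsilon = decay_epsilon(epsilon)

# With epsilon set to 0.0 the choice is always the highest-valued action.
greedy = choose_action([0.1, 0.7, 0.2, 0.0], epsilon=0.0)
print(epsilon, greedy)
```

This is why the epsilon value printed each episode by the example is expected to fall over time until it reaches epsilon_end.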

Documentation

The documentation link:

For more info, DM Madmeg#4882 on Discord.

Download files

Source Distribution

DeepQ-0.0.3.tar.gz (3.8 kB)

Uploaded Source

Built Distribution

DeepQ-0.0.3-py3-none-any.whl (16.1 kB)

Uploaded Python 3

File details

Details for the file DeepQ-0.0.3.tar.gz.

File metadata

  • Download URL: DeepQ-0.0.3.tar.gz
  • Upload date:
  • Size: 3.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.3.0 pkginfo/1.6.1 requests/2.24.0 setuptools/50.3.1.post20201107 requests-toolbelt/0.9.1 tqdm/4.51.0 CPython/3.8.5

File hashes

Hashes for DeepQ-0.0.3.tar.gz
Algorithm     Hash digest
SHA256        2b541098739d8f081b511499748d695ca1e1f869af766d3e350dd0970128f13a
MD5           dd5e586c6b4f78a8b61603e5c1fa8ab8
BLAKE2b-256   f08dc6faec9aa4520c80658c649d4a4ee3a46b1d5743eb06901c53e9035d7cfc

File details

Details for the file DeepQ-0.0.3-py3-none-any.whl.

File metadata

  • Download URL: DeepQ-0.0.3-py3-none-any.whl
  • Upload date:
  • Size: 16.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.3.0 pkginfo/1.6.1 requests/2.24.0 setuptools/50.3.1.post20201107 requests-toolbelt/0.9.1 tqdm/4.51.0 CPython/3.8.5

File hashes

Hashes for DeepQ-0.0.3-py3-none-any.whl
Algorithm     Hash digest
SHA256        6a7b89cc39de4f822d8d2f907adee512a70480487cf2e7b30936996f6ddf82a3
MD5           0143e202cae9b45beb2d452eb2cdb39d
BLAKE2b-256   ea7fd1e075f2ef12303100bb5817d10d58aff50d466110d6105e40c96134103f
