Hindsight - PyTorch

Project description

Hindsight Experience Replay (HER)
=================================

Hindsight Experience Replay (HER) is a reinforcement learning technique that makes use of failed experiences to learn how to achieve goals. It does this by storing additional transitions in the replay buffer where the goal is replaced with the achieved state. This allows the agent to learn from a hindsight perspective, as if it had intended to reach the achieved state from the beginning.
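To make the mechanism concrete, here is a minimal sketch of hindsight relabeling. It illustrates the idea rather than this package's API; compute_reward stands in for a hypothetical goal-conditioned reward function:

# Minimal sketch of hindsight relabeling (illustrative, not this package's API).
# compute_reward is a hypothetical goal-conditioned reward function.
def relabel(state, action, next_state, goal, achieved_state, compute_reward):
    # The original transition, rewarded against the intended goal.
    original = (state, action, compute_reward(next_state, goal),
                next_state, goal)
    # The hindsight transition: pretend the achieved state was the goal.
    hindsight = (state, action, compute_reward(next_state, achieved_state),
                 next_state, achieved_state)
    return original, hindsight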

Implementation


This repository contains a Python implementation of HER using PyTorch. The main class is HindsightExperienceReplay, which represents a replay buffer that stores transitions and allows for sampling mini-batches of transitions.

The HindsightExperienceReplay class takes the following arguments:

  • state_dim: The dimension of the state space.
  • action_dim: The dimension of the action space.
  • buffer_size: The maximum size of the replay buffer.
  • batch_size: The size of the mini-batches to sample.
  • goal_sampling_strategy: A function that takes a tensor of goals and returns a tensor of (possibly modified) goals. It is applied to the goals of each sampled mini-batch, so goals can be relabeled dynamically for replay.

The HindsightExperienceReplay class has the following methods (a minimal sketch of the whole interface follows the list):

  • store_transition(state, action, reward, next_state, done, goal): Stores a transition in the replay buffer, together with an additional hindsight transition in which the goal is replaced with the achieved state.
  • sample(): Samples a mini-batch of transitions from the replay buffer and applies the goal sampling strategy to the goals.
  • __len__(): Returns the current size of the replay buffer.
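Put together, a minimal buffer matching this interface might look like the sketch below. It is an illustration of the behavior described above, not the package's exact source; details such as the eviction policy and tensor conversion may differ:

import numpy as np
import torch

# Illustrative skeleton of the interface described above; the packaged
# implementation may differ in detail (e.g. how hindsight rewards are handled).
class HindsightExperienceReplay:
    def __init__(self, state_dim, action_dim, buffer_size, batch_size,
                 goal_sampling_strategy):
        self.buffer = []  # list of (state, action, reward, next_state, done, goal)
        self.buffer_size = buffer_size
        self.batch_size = batch_size
        self.goal_sampling_strategy = goal_sampling_strategy

    def store_transition(self, state, action, reward, next_state, done, goal):
        # Store the original transition plus a hindsight copy whose goal
        # is replaced with the achieved state (here, next_state).
        for g in (goal, next_state):
            if len(self.buffer) >= self.buffer_size:
                self.buffer.pop(0)  # evict the oldest transition
            self.buffer.append((state, action, reward, next_state, done, g))

    def sample(self):
        if len(self.buffer) < self.batch_size:
            return None  # not enough transitions for a full mini-batch
        indices = np.random.randint(len(self.buffer), size=self.batch_size)
        batch = [self.buffer[i] for i in indices]
        # Stack each column of the batch into a float32 tensor.
        states, actions, rewards, next_states, dones, goals = (
            torch.as_tensor(np.array(column), dtype=torch.float32)
            for column in zip(*batch))
        goals = self.goal_sampling_strategy(goals)  # relabel goals on the fly
        return states, actions, rewards, next_states, dones, goals

    def __len__(self):
        return len(self.buffer)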

Usage


Here is an example of how to use the HindsightExperienceReplay class:

import numpy as np
import torch

# Import path assumed from the distribution name; adjust if the module differs.
from hindsight_replay import HindsightExperienceReplay

# Define a goal sampling strategy
def goal_sampling_strategy(goals):
    noise = torch.randn_like(goals) * 0.1
    return goals + noise

# Define the dimensions of the state and action spaces, the buffer size, and the batch size
state_dim = 10
action_dim = 2
buffer_size = 10000
batch_size = 64

# Create an instance of the HindsightExperienceReplay class
her = HindsightExperienceReplay(state_dim, action_dim, buffer_size, batch_size, goal_sampling_strategy)

# Store a transition
state = np.random.rand(state_dim)
action = np.random.rand(action_dim)
reward = np.random.rand()
next_state = np.random.rand(state_dim)
done = False
goal = np.random.rand(state_dim)
her.store_transition(state, action, reward, next_state, done, goal)

# Sample a mini-batch of transitions
sampled_transitions = her.sample()
if sampled_transitions is not None:
    states, actions, rewards, next_states, dones, goals = sampled_transitions

In this example, we first define a goal sampling strategy function and the dimensions of the state and action spaces, the buffer size, and the batch size. We then create an instance of the HindsightExperienceReplay class, store a transition, and sample a mini-batch of transitions. Note that sample() may return None when the buffer does not yet contain enough transitions for a full mini-batch, which is why the result is checked before unpacking. The states, actions, rewards, next states, done flags, and goals are returned as separate tensors.
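Continuing the example above, the buffer would typically sit inside a collect-then-train loop. In the sketch below the random vectors are placeholders for data that would come from a real environment, and the commented-out agent update is hypothetical:

# Sketch of a collect-then-train loop using placeholder data.
for step in range(200):
    state = np.random.rand(state_dim)
    action = np.random.rand(action_dim)
    reward = np.random.rand()
    next_state = np.random.rand(state_dim)
    goal = np.random.rand(state_dim)
    her.store_transition(state, action, reward, next_state, False, goal)
    if len(her) >= batch_size:
        batch = her.sample()
        # agent.update(*batch)  # hypothetical: train your agent on the batch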

Customizing the Goal Sampling Strategy


The HindsightExperienceReplay class allows you to define your own goal sampling strategy by passing a function to the constructor. This function should take a tensor of goals and return a tensor of modified goals with the same shape.

Here is an example of a goal sampling strategy function that adds random noise to the goals:

def goal_sampling_strategy(goals):
    noise = torch.randn_like(goals) * 0.1
    return goals + noise

In this example, the function adds Gaussian noise with a standard deviation of 0.1 to the goals. You can customize this function to implement any goal sampling strategy that suits your needs.
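As a second, purely illustrative example, the strategy below relabels each goal with one drawn from elsewhere in the same mini-batch by shuffling the batch dimension (the function name is hypothetical):

import torch

# Relabel each goal with another goal from the same mini-batch.
def shuffled_goal_strategy(goals):
    permutation = torch.randperm(goals.size(0))  # random batch permutation
    return goals[permutation]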

Contributing


Contributions to this project are welcome. If you find a bug or have an idea for a new feature, please open an issue. If you want to contribute code, please fork the repository and submit a pull request.

License


This project is licensed under the MIT License. See the LICENSE file for details.

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

hindsight_replay-0.0.1.tar.gz (4.5 kB)

Uploaded Source

Built Distribution

hindsight_replay-0.0.1-py3-none-any.whl (4.3 kB)

Uploaded Python 3

File details

Details for the file hindsight_replay-0.0.1.tar.gz.

File metadata

  • Download URL: hindsight_replay-0.0.1.tar.gz
  • Upload date:
  • Size: 4.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.3.2 CPython/3.11.0 Darwin/22.4.0

File hashes

Hashes for hindsight_replay-0.0.1.tar.gz

  • SHA256: c4137f932e666c91bf9391da3fd0cc84f05bafe7c057d29df03e3cf51177e2cb
  • MD5: 766dc9b03af95e28cc8a372e027418f9
  • BLAKE2b-256: 07f489c9d282316af61b0efae729a44bcc132db6d0f74e25bdd5bc666f9affbf


File details

Details for the file hindsight_replay-0.0.1-py3-none-any.whl.

File metadata

File hashes

Hashes for hindsight_replay-0.0.1-py3-none-any.whl

  • SHA256: cf45660f941511154e11011a287c8e88219abed8e4764ffff65c9c11bf0753ca
  • MD5: b374b74e162c3a41c034fb991e9e95b1
  • BLAKE2b-256: 48fa5a5b823bd5ab6bd4eec55124049704bdcf2a55e7a57a5f9f1256ea13d48c

