Google DeepMind Robotics Interfaces and utilities.
gdm_robotics: The Google DeepMind Robotics interfaces
This package describes a set of interfaces for Python reinforcement learning (RL) environments. It consists of the following core components:
- gdm_robotics.interfaces.Environment: An abstract base class for RL environments.
- gdm_robotics.interfaces.Policy: An abstract base class for Agent policies.
- gdm_robotics.interfaces.EpisodicLogger: An abstract base class for loggers of Agent/Environment interaction.
- gdm_robotics.runtime.RunLoop: A concrete RunLoop class that runs a policy against an environment and logs their interaction.
The core classes
The Environment interface.
The Environment interface is very similar to (and indeed inherits from)
the DeepMind Environment interface (dm_env.Environment).
It provides an abstraction for any "controllable" system, as seen from the
perspective of an RL agent.
An Environment mainly exposes two methods:
- reset, which resets the environment to a known state and returns the "timestep" representing it, and
- step, which applies a given action and returns a new "timestep".
A Timestep is a tuple grouping:
- observation: the actual observations provided by the environment.
- reward: the reward associated with this specific step.
- discount: the discount associated with this specific step.
- step_type: a value specifying whether this is the Timestep returned by reset (step_type == FIRST) or the last timestep of the episode (step_type == LAST). For a last timestep, users need to check the discount to tell a termination (zero-like discount) from a truncation (discount different from zero).
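The termination/truncation check can be sketched with standard-library stand-ins. The StepType and TimeStep definitions below are simplified assumptions that mirror the fields listed above; they are not the actual dm_env or gdm_robotics classes, and episode_end_kind is a hypothetical helper introduced only for illustration.

```python
import enum
from typing import Any, NamedTuple, Optional


class StepType(enum.Enum):
    """Stand-in for the step types (assumption, modelled on dm_env)."""
    FIRST = 0
    MID = 1
    LAST = 2


class TimeStep(NamedTuple):
    """Stand-in tuple with the four fields listed above (assumption)."""
    step_type: StepType
    reward: Optional[float]
    discount: Optional[float]
    observation: Any


def episode_end_kind(timestep: TimeStep) -> Optional[str]:
    """Classifies the end of an episode, or returns None if it is ongoing."""
    if timestep.step_type is not StepType.LAST:
        return None
    # A zero discount on the last timestep signals a termination;
    # any other discount signals a truncation.
    return "termination" if timestep.discount == 0.0 else "truncation"


terminated = TimeStep(StepType.LAST, reward=1.0, discount=0.0, observation=None)
truncated = TimeStep(StepType.LAST, reward=0.0, discount=1.0, observation=None)
print(episode_end_kind(terminated))  # termination
print(episode_end_kind(truncated))   # truncation
```

The scalar comparison is enough for this sketch; an environment that returns array-valued discounts would need an element-wise "zero-like" check instead.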
Additionally, an Environment returns specifications (in the form of
dm_env.specs.Array objects) describing the accepted actions (action_spec)
and the timesteps it produces (timestep_spec).
This Environment abstract class provides typing support and is stricter
than the dm_env equivalent.
The Policy interface.
The Policy interface provides an abstraction for Agent policies. The Policy
interface assumes a stateless Policy, in the sense that the state should be
explicitly provided to the policy when calling its methods.
Note that it is nevertheless possible to have implicit state and make the class stateful.
A Policy should implement:
- initial_state: returns the initial state of the Policy.
- step: given a Timestep and a Policy state, generates the next action (and returns the next Policy state).
Like the Environment, a Policy also provides specifications, by
implementing the step_spec method.
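The explicit-state convention can be illustrated with a small stand-in class. This is a sketch only: CountingPolicy is hypothetical, it does not subclass the real gdm_robotics.interfaces.Policy, and the argument order and the (action, next_state) return order are assumptions rather than the library's confirmed signature.

```python
from typing import Any, Tuple


class CountingPolicy:
    """Hypothetical policy whose state is the number of steps taken so far."""

    def initial_state(self) -> int:
        # The state is handed to the caller instead of being stored on self.
        return 0

    def step(self, timestep: Any, prev_state: int) -> Tuple[int, int]:
        # The action is simply the step count modulo 2; a real policy would
        # compute it from the timestep's observation.
        action = prev_state % 2
        return action, prev_state + 1


policy = CountingPolicy()
state = policy.initial_state()
actions = []
for timestep in ["t0", "t1", "t2"]:  # Placeholder timesteps.
    action, state = policy.step(timestep, state)
    actions.append(action)
print(actions, state)  # [0, 1, 0] 3
```

Because the caller threads the state through every call, the same policy object can drive several independent episodes without them interfering with each other.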
The Logger interface
The EpisodicLogger class describes the interface for a logger responsible for
logging the interaction between a Policy and an Environment during a single
episode. As such it exposes:
- reset: Logs a "reset" Timestep, i.e. the first timestep of the episode.
- record_action_and_next_timestep: Records a Policy's action and the timestep generated by applying this action to the Environment.
- write: Marks an episode as terminated, triggering (depending on the implementation) a flush.
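As a rough illustration of this three-method contract, the sketch below buffers one episode in memory and "flushes" it by appending to a list. InMemoryEpisodicLogger is a hypothetical stand-in; it does not subclass the real EpisodicLogger, and the parameter names are assumptions.

```python
from typing import Any, Dict, List, Tuple


class InMemoryEpisodicLogger:
    """Hypothetical logger that buffers one episode and flushes it on write."""

    def __init__(self) -> None:
        self._first_timestep: Any = None
        self._transitions: List[Tuple[Any, Any]] = []
        self.episodes: List[Dict[str, Any]] = []

    def reset(self, timestep: Any) -> None:
        # Start a fresh episode buffer from the first timestep.
        self._first_timestep = timestep
        self._transitions = []

    def record_action_and_next_timestep(self, action: Any, next_timestep: Any) -> None:
        # Store the action together with the timestep it produced.
        self._transitions.append((action, next_timestep))

    def write(self) -> None:
        # Flush the buffered episode; here that means appending it to a list.
        self.episodes.append(
            {"first": self._first_timestep, "transitions": list(self._transitions)}
        )
        self._transitions = []


logger = InMemoryEpisodicLogger()
logger.reset("t0")
logger.record_action_and_next_timestep("a0", "t1")
logger.record_action_and_next_timestep("a1", "t2")
logger.write()
print(len(logger.episodes[0]["transitions"]))  # 2
```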
The Runloop class
The Runloop is a concrete class that is responsible for running possibly
multiple episodes of a Policy interacting with an Environment.
The Runloop requires at least a single Environment, a single Policy and a
collection of EpisodicLoggers.
When calling run (or run_single_episode) the Runloop will take care of
correctly stepping Policy and Environment and logging the generated data.
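The stepping and logging order that such a loop has to respect can be sketched as follows. Everything here is a standard-library stand-in written for illustration (CountdownEnv, DecrementPolicy, ListLogger and this run_single_episode function are hypothetical); it is not the Runloop implementation, and the real class may order or batch these calls differently.

```python
import enum
from typing import Any, List, NamedTuple, Optional, Sequence, Tuple


class StepType(enum.Enum):
    FIRST = 0
    MID = 1
    LAST = 2


class TimeStep(NamedTuple):
    step_type: StepType
    reward: Optional[float]
    discount: Optional[float]
    observation: int


class CountdownEnv:
    """Toy environment: the observation counts down; the episode ends at 0."""

    def __init__(self, start: int = 3) -> None:
        self._start = start
        self._value = start

    def reset(self) -> TimeStep:
        self._value = self._start
        return TimeStep(StepType.FIRST, None, None, self._value)

    def step(self, action: int) -> TimeStep:
        self._value -= action
        if self._value <= 0:
            # Zero discount marks this last timestep as a termination.
            return TimeStep(StepType.LAST, 1.0, 0.0, self._value)
        return TimeStep(StepType.MID, 0.0, 1.0, self._value)


class DecrementPolicy:
    """Toy policy: always emits action 1; its explicit state counts steps."""

    def initial_state(self) -> int:
        return 0

    def step(self, timestep: TimeStep, state: int) -> Tuple[int, int]:
        return 1, state + 1


class ListLogger:
    """Toy logger that stores each finished episode as a list of events."""

    def __init__(self) -> None:
        self._events: List[Any] = []
        self.episodes: List[List[Any]] = []

    def reset(self, timestep: TimeStep) -> None:
        self._events = [timestep]

    def record_action_and_next_timestep(self, action: int, next_timestep: TimeStep) -> None:
        self._events.append((action, next_timestep))

    def write(self) -> None:
        self.episodes.append(self._events)
        self._events = []


def run_single_episode(env: CountdownEnv, policy: DecrementPolicy,
                       loggers: Sequence[ListLogger]) -> int:
    """Runs one episode and returns the number of environment steps taken."""
    timestep = env.reset()
    state = policy.initial_state()
    for logger in loggers:
        logger.reset(timestep)
    num_steps = 0
    while timestep.step_type is not StepType.LAST:
        action, state = policy.step(timestep, state)
        timestep = env.step(action)
        for logger in loggers:
            logger.record_action_and_next_timestep(action, timestep)
        num_steps += 1
    for logger in loggers:
        logger.write()
    return num_steps


logger = ListLogger()
steps = run_single_episode(CountdownEnv(start=3), DecrementPolicy(), [logger])
print(steps)  # 3
```

The logger receives the first timestep through reset, one (action, timestep) pair per step, and a single write once the LAST timestep has been recorded.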
Runloop customisation
The Runloop can be customised by passing different options.
- Signal handlers. By default the Runloop does not intercept SIGINT, and it is the responsibility of the caller to handle it. In some cases it might be beneficial to let the Runloop handle it. In that case, pass handle_sigint=True to the Runloop initializer.
- Provide reset options to the Environment. If your Environment accepts reset options (for example because it wraps a Gymnasium environment) you might want to provide options at reset time. In this case you can pass a callable as the reset_options_provider argument of the initializer; it will be called before every episode reset.
- More complex customisation can be done by using the RunloopRuntimeOperations class. The Runloop initializer accepts a collection of RunloopRuntimeOperations objects.
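A reset options provider could look like the sketch below. The zero-argument signature, the {"seed": ...} option layout, and the RecordingEnv class are all assumptions made for illustration; the only detail taken from the text above is that the callable is invoked before every episode reset.

```python
import itertools
from typing import Any, Callable, Dict, List


def make_seed_provider(start: int = 0) -> Callable[[], Dict[str, Any]]:
    """Returns a zero-argument callable producing fresh reset options per call."""
    counter = itertools.count(start)

    def provider() -> Dict[str, Any]:
        # Hypothetical option layout; the wrapped environment defines the real keys.
        return {"seed": next(counter)}

    return provider


class RecordingEnv:
    """Toy environment that records the options passed to each reset."""

    def __init__(self) -> None:
        self.seen_options: List[Dict[str, Any]] = []

    def reset(self, options: Dict[str, Any]) -> None:
        self.seen_options.append(options)


env = RecordingEnv()
reset_options_provider = make_seed_provider(start=10)
for _ in range(3):
    # The provider is invoked once before each episode reset.
    env.reset(options=reset_options_provider())
print(env.seen_options)  # [{'seed': 10}, {'seed': 11}, {'seed': 12}]
```

Using a closure keeps the per-episode counter private to the provider, so the loop that calls it needs no knowledge of how the options are generated.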
Adapters
Environments from common RL libraries, such as dm_env.Environment and
gymnasium.Env, can be exposed as gdm_robotics.interfaces.Environments by using
the provided environment wrappers in the adapters sub-package. For example, to
wrap a dm_env.Environment object:
import dm_env
from gdm_robotics.adapters import dm_env_to_gdmr_env_wrapper

original_env: dm_env.Environment = ...  # Any existing dm_env.Environment instance.
env = dm_env_to_gdmr_env_wrapper.DmEnvToGdmrEnvWrapper(original_env)
Installation
gdm_robotics can be installed from PyPI using pip:
pip install gdm_robotics
Licence and Disclaimer
Copyright 2025 Google LLC
All software is licensed under the Apache License, Version 2.0 (Apache 2.0); you may not use this file except in compliance with the Apache 2.0 license. You may obtain a copy of the Apache 2.0 license at: https://www.apache.org/licenses/LICENSE-2.0
All other materials are licensed under the Creative Commons Attribution 4.0 International License (CC-BY). You may obtain a copy of the CC-BY license at: https://creativecommons.org/licenses/by/4.0/legalcode
Unless required by applicable law or agreed to in writing, all software and materials distributed here under the Apache 2.0 or CC-BY licenses are distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the licenses for the specific language governing permissions and limitations under those licenses.
This is not an official Google product.
File details
Details for the file gdm_robotics-1.0.2.tar.gz.
File metadata
- Download URL: gdm_robotics-1.0.2.tar.gz
- Upload date:
- Size: 18.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 7bb9cb433bcca30a2fa821669ebb17f652faad86310224a98769324f0b6fafe0 |
| MD5 | 717a686a70c88a6efba4b55282702af4 |
| BLAKE2b-256 | cc2c8d89c7d367de4b29784b94853dabb1450c601dead91f4a406818bc18d249 |
File details
Details for the file gdm_robotics-1.0.2-py3-none-any.whl.
File metadata
- Download URL: gdm_robotics-1.0.2-py3-none-any.whl
- Upload date:
- Size: 23.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 01e315c9e2741d87eec46d8af5c85146c58964168c282e24344b4468af26ad4d |
| MD5 | 77df6818e605e9ae366d03f10a0f5af6 |
| BLAKE2b-256 | 3b5373b60db518c3f777e5a5a8758a11b4d939d4fb51764c5e9576a32b89ebe6 |