Proof of concept of SOTA algorithms in RL

mercury-rl

Introduction

Welcome to mercury-rl, a library for offline deep reinforcement learning. It provides implementations of offline RL algorithms built around Conservative Q-Learning (CQL), applying its conservative training objective to established methods such as Deep Q-Network (DQN), Actor-Critic (AC), Trust Region Policy Optimization (TRPO), and Proximal Policy Optimization (PPO).

Our goal is to provide a toolkit for developing, experimenting with, and deploying offline reinforcement learning models efficiently.

What is Offline Reinforcement Learning?

Offline reinforcement learning (offline RL) is a subfield of reinforcement learning in which the agent is trained on a static dataset of previously collected interactions with the environment. In online reinforcement learning, by contrast, the agent continuously interacts with the environment to collect data and update its policy.

Offline RL is particularly useful in scenarios where real-time interaction with the environment is costly, risky, or impractical. Some examples include healthcare, robotics, and autonomous driving, where it is often not feasible to let an untrained agent explore freely.
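To make the distinction concrete, here is a minimal sketch of offline training: tabular Q-learning run over a fixed list of logged transitions, with no calls to an environment. The toy dataset, the `fit_q_offline` name, and the hyperparameters are illustrative assumptions and are not part of mercury-rl.

```python
def fit_q_offline(transitions, n_states, n_actions, gamma=0.9, lr=0.5, epochs=200):
    """Tabular Q-learning over a fixed dataset of (s, a, r, s_next, done) tuples."""
    q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(epochs):
        for s, a, r, s_next, done in transitions:
            # The target is built from logged data only; env.step() is never called.
            target = r if done else r + gamma * max(q[s_next])
            q[s][a] += lr * (target - q[s][a])
    return q


# Hypothetical logged data from a 3-state chain:
# action 1 moves right, action 0 stays in place, reaching state 2 ends the episode.
dataset = [
    (0, 1, 0.0, 1, False),
    (1, 1, 1.0, 2, True),
    (0, 0, 0.0, 0, False),
    (1, 0, 0.0, 1, False),
]

q_table = fit_q_offline(dataset, n_states=3, n_actions=2)
# Greedy action per state, derived purely from the static dataset.
greedy_policy = [row.index(max(row)) for row in q_table]
```

An online learner would replace the fixed `dataset` loop with fresh transitions gathered by acting in the environment, which is exactly the step that offline RL avoids.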

Key Differences Between Offline RL and Online RL

Aspect	Offline RL	Online RL
Data Collection	Static dataset collected from previous interactions	Collects data through interaction with the environment
Exploration	Does not involve exploration during training; the agent learns from the provided dataset	Requires exploration to improve the policy
Safety and Feasibility	Ideal for applications where exploration is dangerous or impractical	Suitable for environments where real-time feedback and interaction are feasible
Algorithm Complexity	Often requires more sophisticated algorithms to handle the limitations of fixed datasets	Can leverage simpler algorithms due to continuous data collection and real-time feedback

Figure taken from this post

Components

Algorithms

The mercury-rl library implements a range of state-of-the-art algorithms for both discrete and continuous control tasks. Below is a summary of the key algorithms included in the library:

Algorithm	Discrete control	Continuous control
Imitation Learning	Yes	Yes
Conservative Deep Q-Network (DQN)	Yes	No
Conservative Double DQN	Yes	No
Conservative Actor-Critic	Yes	No
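The "conservative" variants listed above add a penalty that discourages high Q-values for actions the dataset does not support. The sketch below shows one common form of that penalty for discrete actions (the log-sum-exp of Q over all actions minus Q of the logged action), together with a Double DQN target. It illustrates the general technique only; the function names and the `alpha` and `gamma` defaults are assumptions, and mercury-rl's own implementation may differ.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) over a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def double_dqn_target(reward, done, q_online_next, q_target_next, gamma=0.99):
    """Double DQN target: the online network picks the action, the target network scores it."""
    if done:
        return reward
    best = q_online_next.index(max(q_online_next))
    return reward + gamma * q_target_next[best]


def conservative_q_loss(q_values, action, td_target, alpha=1.0):
    """CQL-style loss for a single transition with discrete actions.

    q_values:  Q(s, a) for every action a in state s
    action:    index of the action stored in the dataset
    td_target: bootstrapped target, e.g. from double_dqn_target
    alpha:     weight of the conservative penalty (hypothetical default)
    """
    td_error = q_values[action] - td_target
    # The penalty is non-negative and grows when actions other than the
    # logged one receive high Q-values.
    penalty = logsumexp(q_values) - q_values[action]
    return 0.5 * td_error ** 2 + alpha * penalty
```

With equal Q-values `[1.0, 1.0]` and a target equal to the logged action's Q-value, the TD term vanishes and the loss reduces to the penalty `log(2)`. Raising the Q-value of the action that was not logged increases the loss.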

Requirements

mercury-rl development requires the following software installed:

Python 3.6 or higher
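A script can check this requirement at startup. The following is a minimal sketch using only the standard library; the `python_is_supported` helper and the `MIN_PYTHON` constant are illustrative names and are not part of mercury-rl.

```python
import sys

MIN_PYTHON = (3, 6)


def python_is_supported(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True when the (major, minor) version meets the minimum."""
    return tuple(version_info[:2]) >= minimum


# Stop early with a readable message instead of failing later on import.
if not python_is_supported():
    raise SystemExit("mercury-rl requires Python 3.6 or higher")
```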

After checking out the source code from the repository, make any changes on a new branch.

Install

mercury-rl is installed with pip:

Datio

pip install --user mercury-rl

Local

You'll need to configure your Artifactory credentials first. If you don't know how, Mercury’s developer handbook has a short tutorial.

pip install mercury-rl --extra-index-url https://${ARTIFACTORY_BOT_BASIC_AUTH}@artifactory.globaldevtools.bbva.com/artifactory/api/pypi/gl-datio-runtime-pypi-local/simple

Exploratory notebooks

from mercury.rl import create_tutorials

create_tutorials('mercury_tutorials')

The code above creates a local folder named mercury_tutorials and fills it with notebooks that demonstrate different mercury.rl features.
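To see what was generated, you can list the notebooks in the new folder. This sketch uses only the standard library and assumes the tutorials are stored as `.ipynb` files somewhere under the folder; `list_notebooks` is a hypothetical helper, not a mercury.rl function.

```python
from pathlib import Path


def list_notebooks(folder):
    """Return the sorted file names of all .ipynb files found under `folder`."""
    root = Path(folder)
    if not root.is_dir():
        # Folder has not been created yet, so there is nothing to list.
        return []
    return sorted(p.name for p in root.rglob("*.ipynb"))


for name in list_notebooks("mercury_tutorials"):
    print(name)
```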

Contributing

Want to contribute to mercury-rl? See Mercury’s developer handbook for details.

Project details

Version 0.1.0, released Dec 3, 2025
