peptidegym
Gymnasium-Compatible RL Environments for Therapeutic Peptide Design
PeptideGym provides the first Gymnasium-compatible reinforcement learning environments for therapeutic peptide design. It models peptide construction as a sequential decision process — an RL agent builds peptide sequences residue-by-residue, receiving rewards from pluggable biophysical property predictors. PeptideGym enables researchers to benchmark any Gymnasium-compatible RL algorithm (PPO, DQN, SAC via Stable Baselines3, CleanRL, or RLlib) on peptide design without writing custom training loops.
Three environment families cover distinct therapeutic peptide classes:
- Antimicrobial peptides (AMPs) — cationic, amphipathic sequences that disrupt microbial membranes
- Cyclic peptides — macrocyclic binders with enhanced stability and oral bioavailability
- Vaccine epitopes — short peptides optimized for MHC-I binding and T-cell recognition
Installation
```shell
pip install peptidegym            # Core (numpy, gymnasium)
pip install "peptidegym[train]"   # + SB3, PyTorch for RL training
pip install "peptidegym[all]"     # Everything
```

(Extras are quoted so shells like zsh don't expand the brackets.)
Development install:
```shell
git clone https://github.com/HassDhia/peptidegym.git
cd peptidegym
pip install -e ".[all]"
```
Quick Start
```python
import gymnasium as gym
import peptidegym

env = gym.make("PeptideGym/AMP-v0")
obs, info = env.reset(seed=42)

for _ in range(100):
    action = env.action_space.sample()
    obs, reward, terminated, truncated, info = env.step(action)
    if terminated or truncated:
        print(f"Designed peptide: {info['sequence']} (reward: {reward:.3f})")
        obs, info = env.reset()

env.close()
```
Train a PPO Agent
```python
from stable_baselines3 import PPO
import gymnasium as gym
import peptidegym

env = gym.make("PeptideGym/AMP-v0")
model = PPO("MlpPolicy", env, verbose=1)
model.learn(total_timesteps=90_000)

# Evaluate
obs, _ = env.reset()
done = False
while not done:
    action, _ = model.predict(obs, deterministic=True)
    obs, reward, terminated, truncated, info = env.step(action)
    done = terminated or truncated

print(f"Designed AMP: {info['sequence']}, Activity: {info.get('activity_score', 'N/A')}")
```
Environments
| Environment | Task | Action Space | Observation | Difficulty Tiers |
|---|---|---|---|---|
| `PeptideGym/AMP-v0` | Design antimicrobial peptide | Discrete(21) — 20 AAs + STOP | Sequence + biophysical properties | Easy, Medium, Hard |
| `PeptideGym/CyclicPeptide-v0` | Design cyclic peptide binder | Discrete(24) — 20 AAs + 3 cyclization + linear stop | Sequence + properties + cyclization validity | Easy, Medium, Hard |
| `PeptideGym/Epitope-v0` | Optimize vaccine epitope | Discrete(21) — 20 AAs + STOP | Sequence + HLA encoding + binding estimate | Easy, Medium, Hard |
Each environment is available in three difficulty tiers (e.g., PeptideGym/AMP-Easy-v0, PeptideGym/AMP-Hard-v0) for a total of 9 benchmark configurations.
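The tiered IDs follow the pattern shown above (e.g., `PeptideGym/AMP-Easy-v0`); assuming the Medium tier uses the same naming convention, the full benchmark suite can be enumerated without importing anything beyond the standard library:

```python
from itertools import product

families = ["AMP", "CyclicPeptide", "Epitope"]
tiers = ["Easy", "Medium", "Hard"]

# Registered IDs follow the pattern PeptideGym/<Family>-<Tier>-v0.
env_ids = [f"PeptideGym/{fam}-{tier}-v0" for fam, tier in product(families, tiers)]

for env_id in env_ids:
    print(env_id)

# 3 families x 3 tiers = 9 benchmark configurations
```

Each ID in the list can then be passed directly to `gym.make(...)` to sweep the whole benchmark.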
Architecture
```
┌─────────────────────────────────────────────────────┐
│                   RL Agent (PPO)                    │
│               via Stable Baselines3                 │
└────────────────────┬────────────────────────────────┘
                     │ action (amino acid or special)
                     ▼
┌─────────────────────────────────────────────────────┐
│               PeptideGym Environment                │
│  ┌───────────┐  ┌──────────────┐  ┌───────────┐     │
│  │  AMP-v0   │  │ CyclicPep-v0 │  │ Epitope-v0│     │
│  └─────┬─────┘  └──────┬───────┘  └─────┬─────┘     │
│        └───────────────┼────────────────┘           │
│                        ▼                            │
│              Pluggable RewardBackend                │
│ ┌──────────────┐ ┌──────────────┐ ┌───────────┐     │
│ │  Heuristic   │ │   AMPlify    │ │ NetMHCpan │     │
│ │  (default)   │ │  (optional)  │ │ (optional)│     │
│ └──────────────┘ └──────────────┘ └───────────┘     │
└─────────────────────────────────────────────────────┘
```
All environments share the Gymnasium API (reset(), step(), observation_space, action_space). Default heuristic reward backends require no external dependencies. Optional backends (AMPlify, NetMHCpan, MHCflurry) can be swapped in for research-grade reward signals.
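To illustrate what a dependency-free heuristic backend can score, here is a sketch (illustrative only, not PeptideGym's actual implementation) that rewards the cationic, amphipathic character described for AMPs: net charge plus mean Kyte-Doolittle hydropathy.

```python
# Illustrative heuristic AMP reward (NOT PeptideGym's actual backend).
# Scores net cationic charge and mean hydropathy, two properties
# associated with membrane-disrupting peptides.

# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def net_charge(seq: str) -> int:
    """Approximate net charge at neutral pH: +1 for K/R, -1 for D/E."""
    return sum(seq.count(a) for a in "KR") - sum(seq.count(a) for a in "DE")

def heuristic_amp_reward(seq: str) -> float:
    """Combine cationicity and mean hydropathy into one scalar reward."""
    if not seq:
        return 0.0
    mean_kd = sum(KD[a] for a in seq) / len(seq)
    return 0.5 * net_charge(seq) + 0.5 * mean_kd

# A magainin-like cationic, amphipathic sequence outscores a poly-acidic one:
print(heuristic_amp_reward("GIGKFLHSAKKFGKAFVGEIMNS") >
      heuristic_amp_reward("DDEEDDEE"))  # True
```

A research-grade backend (e.g., AMPlify) would replace this scalar with a learned activity prediction while leaving the environment loop unchanged.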
Paper
The accompanying paper is available at:
Citation
If you use peptidegym in your research, please cite:
```bibtex
@software{dhia2026peptidegym,
  author    = {Dhia, Hass},
  title     = {PeptideGym: Gymnasium-Compatible Reinforcement Learning
               Environments for Therapeutic Peptide Design},
  year      = {2026},
  publisher = {Smart Technology Investments Research Institute},
  url       = {https://github.com/HassDhia/peptidegym}
}
```
License
MIT License. See LICENSE for details.
Contact
Hass Dhia -- Smart Technology Investments Research Institute
File details

Details for the file peptidegym-0.1.0.tar.gz.

- Download URL: peptidegym-0.1.0.tar.gz
- Upload date:
- Size: 27.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.6

| Algorithm | Hash digest |
|---|---|
| SHA256 | `833b965f4690c13ace3526ed77070f871fad87cd7cb933db792c5f8cc73fe05b` |
| MD5 | `de578d25f6e0259f22aeeb25773dbb36` |
| BLAKE2b-256 | `baf9e2e18ad0b02d20c031be908bc8f6d1765f77ac0fbe75572fb9dbb15ba690` |

Details for the file peptidegym-0.1.0-py3-none-any.whl.

- Download URL: peptidegym-0.1.0-py3-none-any.whl
- Upload date:
- Size: 30.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.6

| Algorithm | Hash digest |
|---|---|
| SHA256 | `3ff3ace38634f62ecee0a180eff5dd0d3c16ce832941ad2175a397d0be301106` |
| MD5 | `60e31f193207dc4e6e47bc6225e3901b` |
| BLAKE2b-256 | `351044a1f986a07743690491dd0ac718fc3f205d3927aa4e7d1eb3f111a0cf35` |