hippotorch
Differentiable episodic memory for reinforcement learning. Retrieves what matters. Forgets what doesn't.
Hippotorch is a drop-in upgrade for replay buffers. It keeps experiences in a learnable memory so agents can remember rare successes, connect distant cause and effect, and transfer knowledge between similar worlds. Under the hood it uses reward-aware contrastive learning, but you mostly interact with a friendly API.
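This page does not spell out what "reward-aware contrastive learning" means in practice. One plausible reading, offered here as an assumption and not as hippotorch's actual loss, is an InfoNCE-style objective in which candidates whose return is close to the anchor's return count as stronger positives. The framework-free sketch below keeps the arithmetic visible; `reward_aware_info_nce`, `temperature`, and `reward_scale` are hypothetical names.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b + 1e-12)


def reward_aware_info_nce(anchor, candidates, anchor_return, candidate_returns,
                          temperature=0.1, reward_scale=1.0):
    """Illustrative loss: cross-entropy between a softmax over similarities
    and soft targets derived from how close each candidate's return is."""
    logits = [cosine(anchor, c) / temperature for c in candidates]
    # Numerically stable log-softmax over the candidates.
    top = max(logits)
    log_z = top + math.log(sum(math.exp(l - top) for l in logits))
    log_probs = [l - log_z for l in logits]
    # Assumption: similar returns -> larger positive weight.
    weights = [math.exp(-abs(anchor_return - r) / reward_scale)
               for r in candidate_returns]
    total = sum(weights)
    targets = [w / total for w in weights]
    return -sum(t * lp for t, lp in zip(targets, log_probs))
```

Under this reading, the loss is small when episodes with similar returns already sit close together in embedding space, and large when they are far apart.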
Highlights
- Memory that adapts with you. Dual encoders organize episodes by usefulness instead of mere recency.
- Semantic + uniform sampling. A single buffer can surface hard-to-find wins while still covering the full state space.
- Production-friendly extras. Hugging Face Hub export, FAISS retrieval, Gymnasium wrappers, and health reports ship in the box.
- Batteries included. Dozens of scripts and docs show exactly how to benchmark, visualize, and share results.
If you already converge with a plain replay buffer, keep it. Hippotorch shines when agents forget early lessons, face sparse rewards, or operate in partially observed environments.
Installation
pip install hippotorch # minimal setup
pip install "hippotorch[faiss]" # fast nearest-neighbor retrieval
pip install "hippotorch[envs]" # Gymnasium helpers + examples
pip install "hippotorch[hub]" # Hugging Face Hub + safetensors
pip install "hippotorch[umap]" # projector UMAP export
Requirements: Python ≥3.9, PyTorch ≥2.0
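To see which optional features are usable in the current environment, a small standard-library check can help. The mapping from extras to importable module names below is an assumption inferred from the comments on the install commands; the package metadata is the authoritative source.

```python
import sys
from importlib.util import find_spec

# Assumed mapping from hippotorch extras to the modules they provide.
EXTRA_MODULES = {
    "faiss": ["faiss"],
    "envs": ["gymnasium"],
    "hub": ["huggingface_hub", "safetensors"],
    "umap": ["umap"],
}


def python_ok(minimum=(3, 9)):
    """True if the running interpreter meets the minimum (major, minor)."""
    return sys.version_info[:2] >= minimum


def missing_extras(extra_modules=EXTRA_MODULES):
    """Return {extra: [modules that cannot be found]} for incomplete extras."""
    missing = {}
    for extra, modules in extra_modules.items():
        absent = [name for name in modules if find_spec(name) is None]
        if absent:
            missing[extra] = absent
    return missing


if __name__ == "__main__":
    print("Python OK:", python_ok())
    print("Missing optional modules:", missing_extras())
```

`find_spec` only locates the modules and does not import them, so the check stays fast even when heavy packages are installed.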
Quick Tour
Create an encoder + memory, add episodes, then mix semantic and uniform samples:
import torch
from hippotorch import Episode, DualEncoder, MemoryStore, HippocampalReplayBuffer

state_dim, action_dim = 4, 1
# Encoder inputs have state_dim + action_dim + 1 features (the extra one is the reward).
encoder = DualEncoder(input_dim=state_dim + action_dim + 1, embed_dim=128)
memory = MemoryStore(embed_dim=128, capacity=50_000)
buffer = HippocampalReplayBuffer(memory=memory, encoder=encoder, mixture_ratio=0.3)

# Add one 32-step episode of random data
states = torch.randn(32, state_dim)
actions = torch.randn(32, action_dim)
rewards = torch.randn(32)
buffer.add_episode(Episode(states=states, actions=actions, rewards=rewards))

# Query-aware sampling: the query has the same layout as an encoder input
query_state = torch.cat([states[0], torch.zeros(action_dim), rewards[:1]])
batch = buffer.sample(batch_size=64, query_state=query_state, top_k=5)

# Sleep/consolidate occasionally
metrics = buffer.consolidate(steps=50, batch_size=64, report_quality=True)
print(metrics["loss"])
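The tour passes `mixture_ratio=0.3` without defining it. A minimal sketch of how a semantic/uniform mix could work is shown below, assuming the ratio is the fraction of each batch drawn from retrieved items and the rest is drawn uniformly from the whole buffer. `mixed_sample` is a hypothetical helper, not part of hippotorch.

```python
import random


def mixed_sample(num_items, semantic_ids, batch_size, mixture_ratio, rng=None):
    """Draw a batch of indices: part from retrieved ids, the rest uniformly.

    Assumption: mixture_ratio is the semantic share of the batch.
    """
    if rng is None:
        rng = random.Random()
    n_semantic = max(0, min(batch_size, round(batch_size * mixture_ratio)))
    if not semantic_ids:
        n_semantic = 0  # nothing retrieved: fall back to uniform sampling
    # Sample with replacement so a small retrieval set can fill its share.
    semantic = [rng.choice(semantic_ids) for _ in range(n_semantic)]
    uniform = [rng.randrange(num_items) for _ in range(batch_size - n_semantic)]
    return semantic + uniform
```

With `batch_size=64` and `mixture_ratio=0.3`, this draws 19 retrieved indices and 45 uniform ones, so rare successes are oversampled while the rest of the buffer stays covered.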
Using Stable-Baselines3 or Gymnasium? Wrap your existing replay buffer with SB3ReplayBufferWrapper or HippotorchMemoryWrapper and keep the rest of your pipeline untouched.
Need hyperparameter guidance? See docs/diagnostics.md for health checks and docs/curriculum.md for training tips.
Everyday Tools
Recall While Acting
- Use the lightweight read API: from hippotorch import query.
- Pipe query(..., top_k=5) results into policies or logging code.
- Gymnasium adapter emits dict observations so SB3 policies can consume retrieval features alongside pixels.
- Examples: examples/query_inference_demo.py, examples/minigrid_memory_wrapper.py.
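The arguments to `query` are elided above, so its signature is not reproduced here. Conceptually, a top-k read scores a query embedding against the stored keys and returns the best matches. The sketch below assumes cosine similarity as the score; `top_k_cosine` is a hypothetical stand-in, not the library function.

```python
import heapq
import math


def _normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def top_k_cosine(query, keys, top_k=5):
    """Return [(index, similarity)] for the top_k keys most similar to query."""
    q = _normalize(query)
    scored = []
    for index, key in enumerate(keys):
        k = _normalize(key)
        scored.append((sum(a * b for a, b in zip(q, k)), index))
    best = heapq.nlargest(top_k, scored)
    return [(index, score) for score, index in best]
```

The returned indices and similarities can then be fed to a policy as extra features or written to a log, as the bullets above suggest.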
Portable Brains
- Share trained memories with push_memory_to_hub / load_memory_from_hub.
- Choose local folders for offline passes or Hugging Face Hub for team-wide reuse.
- scripts/hub_roundtrip_smoke.py is a 30-second sanity check.
- Docs: docs/hub.md.
Glass-Box Diagnostics
- buffer.health_report() returns retrievability, staleness, collapse indicators, and alignment scores.
- Log with report.to_tensorboard(writer, step) or report.to_wandb(run).
- See docs/diagnostics.md for visuals.
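The report fields are named but not defined on this page. As a rough intuition only, two of them could be computed along the following lines. Both definitions are assumptions chosen for illustration (mean pairwise cosine similarity as a collapse signal, mean age in steps as staleness), and `collapse_score` and `mean_staleness` are hypothetical names.

```python
import math
from itertools import combinations


def _cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b + 1e-12)


def collapse_score(embeddings):
    """Mean pairwise cosine similarity; values near 1.0 suggest collapse."""
    pairs = list(combinations(embeddings, 2))
    if not pairs:
        return 0.0
    return sum(_cosine(a, b) for a, b in pairs) / len(pairs)


def mean_staleness(last_encoded_step, current_step):
    """Average number of steps since each stored embedding was refreshed."""
    if not last_encoded_step:
        return 0.0
    ages = [current_step - step for step in last_encoded_step]
    return sum(ages) / len(ages)
```

A score drifting toward 1.0 would mean the encoder maps most episodes to nearly the same vector, at which point retrieval can no longer tell them apart.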
Batch Retrieval for Low Latency
- buffer.query_batch(query_vecs, top_k=K) handles [B,T,D] tensors in one go.
- Matches single-query results without a Python loop.
- Works with both torch and FAISS backends.
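The shape bookkeeping behind a batched query can be sketched as: flatten `[B, T, D]` into `B*T` queries, score them all against the keys, then reshape the ids to `[B, T, K]`. The version below uses nested lists and inner-product scoring to stay dependency-free, so it still loops in Python; a tensor backend would replace the loop with one matrix multiply. `top_k_ids` and `query_batch_sketch` are hypothetical names.

```python
def top_k_ids(query, keys, top_k):
    """Indices of the top_k keys by inner product (ties: lower index first)."""
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    order = sorted(range(len(keys)), key=lambda i: (-scores[i], i))
    return order[:top_k]


def query_batch_sketch(queries, keys, top_k):
    """queries: nested list shaped [B][T][D]; returns ids shaped [B][T][top_k]."""
    batch, steps = len(queries), len(queries[0])
    flat = [q for row in queries for q in row]             # [B*T][D]
    flat_ids = [top_k_ids(q, keys, top_k) for q in flat]   # [B*T][K]
    # Regroup the flat results back into [B][T][K].
    return [flat_ids[b * steps:(b + 1) * steps] for b in range(batch)]
```

Because every flattened query goes through the same scoring function, the batched result matches what a single query would return for each position.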
Ready-to-Run Samples
Pick a script, set a seed, and you get a reproducible snapshot:
- Benchmarks & diagnostics
  - Retrieval perf: python scripts/bench_retrieval.py --sizes 10000 100000
  - Visualization: python scripts/export_projector_embeddings.py --snapshot run.pt
  - Retrieval heatmap: python scripts/retrieval_heatmap.py --memory-checkpoint ...
- Environments
  - CartPole smoke: bash scripts/quick_cartpole.sh
  - Corridor curriculum/oracle: bash scripts/corridor_curriculum.sh, bash scripts/corridor_oracle_zn.sh
  - MiniGrid sweeps: python scripts/minigrid_memory_benchmark.py --steps 8000 --seeds 3
  - Intrinsic curiosity example: python -m examples.intrinsic_demo --episodes 20
- Ablations & studies
  - Rank-weighted consolidation: bash scripts/run_rank_ablation.sh
  - Consolidation micro bench: bash scripts/run_consolidation_micro.sh
  - Visual MiniGrid clustering: python -m examples.minigrid_visual --steps 2000
All scripts keep runtime under a couple of minutes unless stated otherwise. Longer jobs (corridor oracle full run, curriculum sweeps) note their expected duration in the script header.
Learn More
- docs/benchmarks.md – retrieval setups, FAISS parity, and profiling tips.
- docs/curriculum.md – how to stage corridor tasks and measure regret.
- docs/usage.md – wrappers, segmenters, and rollout recipes.
- docs/hub.md – how to move memories between machines or teammates.
Problems or ideas? File an issue, open a discussion, or send a PR.
File details
Details for the file hippotorch-0.4.1.tar.gz.
File metadata
- Download URL: hippotorch-0.4.1.tar.gz
- Upload date:
- Size: 54.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 33b02ca347c75c679682da6ff538d121daf1acfce529d56672f09b5176a7faaf |
| MD5 | 128d1ba3b122eec83a65a82ada3c9db5 |
| BLAKE2b-256 | e96d017a9f3229696c5e73b0c4cb275fdfc999b37d1b0f2b2c20da596ff43648 |
Provenance
The following attestation bundles were made for hippotorch-0.4.1.tar.gz:
Publisher: workflow.yml on domezsolt/hippotorch
- Statement:
  - Statement type: https://in-toto.io/Statement/v1
  - Predicate type: https://docs.pypi.org/attestations/publish/v1
  - Subject name: hippotorch-0.4.1.tar.gz
  - Subject digest: 33b02ca347c75c679682da6ff538d121daf1acfce529d56672f09b5176a7faaf
  - Sigstore transparency entry: 969543380
  - Sigstore integration time:
- Permalink: domezsolt/hippotorch@59a541d5527b8a5863657cd13d264dd6b4b6e8b4
- Branch / Tag: refs/tags/v0.4.1
- Owner: https://github.com/domezsolt
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: workflow.yml@59a541d5527b8a5863657cd13d264dd6b4b6e8b4
- Trigger Event: release
File details
Details for the file hippotorch-0.4.1-py3-none-any.whl.
File metadata
- Download URL: hippotorch-0.4.1-py3-none-any.whl
- Upload date:
- Size: 53.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | a3c3f6d2b5f4b4e6b86b40fe78ccb59fd1afa954d10adcb6d62dd6a7eec89a30 |
| MD5 | 3dd53d3343ad5a16bf94477180a261a6 |
| BLAKE2b-256 | 65218d5f388afdc3c5f77651df7e9946228540ff438fa02d17686117d2327050 |
Provenance
The following attestation bundles were made for hippotorch-0.4.1-py3-none-any.whl:
Publisher: workflow.yml on domezsolt/hippotorch
- Statement:
  - Statement type: https://in-toto.io/Statement/v1
  - Predicate type: https://docs.pypi.org/attestations/publish/v1
  - Subject name: hippotorch-0.4.1-py3-none-any.whl
  - Subject digest: a3c3f6d2b5f4b4e6b86b40fe78ccb59fd1afa954d10adcb6d62dd6a7eec89a30
  - Sigstore transparency entry: 969543384
  - Sigstore integration time:
- Permalink: domezsolt/hippotorch@59a541d5527b8a5863657cd13d264dd6b4b6e8b4
- Branch / Tag: refs/tags/v0.4.1
- Owner: https://github.com/domezsolt
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: workflow.yml@59a541d5527b8a5863657cd13d264dd6b4b6e8b4
- Trigger Event: release