Multi-language REPL engines for DSPy Recursive Language Models (RLM) with trajectory tracing and benchmark tooling
DSPy-REPL
Modular non-Python REPL engines for DSPy Recursive Language Models.
dspy-repl is a modular package for non-Python REPL-based RLM engines compatible with DSPy, inspired by the Recursive Language Models paper.
Scope
- Keeps Python dspy.RLM inside DSPy as the canonical Python implementation.
- Provides modular engines for:
  - SchemeRLM
  - SQLRLM
  - HaskellRLM
  - JavaScriptRLM
- Exposes extension points for adding new REPL languages.
Install
pip install dspy-repl
For local development:
pip install -e ".[dev]"
Quick usage
import dspy
from dspy_repl import SchemeRLM, SQLRLM, HaskellRLM, JavaScriptRLM
dspy.configure(lm=dspy.LM("openai/gpt-4o-mini"))
scheme_rlm = SchemeRLM("context, query -> answer")
result = scheme_rlm(context="...", query="...")
print(result.answer)
print(result.trajectory) # step-by-step REPL history
js_rlm = JavaScriptRLM("context, query -> answer")
js_result = js_rlm(context="...", query="...")
print(js_result.answer)
Observability and debugging
dspy-repl is designed to expose what happened inside an RLM run:
- result.trajectory contains the full iterative REPL trace.
- Each trajectory step includes:
  - reasoning: model reasoning for that step
  - code: code sent to the language REPL
  - output: interpreter output/error text
- SQLRLM additionally exposes last_sql_profile timing breakdowns after each run.
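A minimal sketch of walking a trajectory, assuming each step is a dict-like record with the reasoning, code, and output keys listed above. The helper name summarize_trajectory and the sample data are illustrative and not part of dspy-repl; the concrete type of result.trajectory may differ.

```python
from typing import Mapping, Sequence


def summarize_trajectory(steps: Sequence[Mapping[str, str]], width: int = 60) -> list[str]:
    """Return one preview line per step (hypothetical helper, not part of dspy-repl)."""
    lines = []
    for index, step in enumerate(steps, start=1):
        # Assumption: each step exposes "code" and "output" keys as plain strings.
        code = step.get("code", "").strip().replace("\n", " ")
        output = step.get("output", "").strip().replace("\n", " ")
        lines.append(f"step {index}: code={code[:width]!r} output={output[:width]!r}")
    return lines


# Illustrative data standing in for result.trajectory.
example_steps = [
    {"reasoning": "Inspect the input size.", "code": "(string-length context)", "output": "42"},
    {"reasoning": "Show the result.", "code": '(display "done")', "output": "done"},
]

for line in summarize_trajectory(example_steps):
    print(line)
```

With a real run, passing result.trajectory in place of example_steps should give a compact overview before reading the full trace, provided the steps behave like mappings.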
Enable verbose engine logs:
scheme_rlm = SchemeRLM("context, query -> answer", verbose=True)
With verbose=True, each iteration is logged with reasoning/code/output previews, which is useful for prompt/tool/debug loops.
What happens inside an RLM
At a high level, each RLM run follows this loop:
- Build REPL variable metadata from inputs.
- Generate next action (reasoning + code) from the LM.
- Execute code in the target REPL (Scheme/Haskell/SQL/JavaScript).
- Append {reasoning, code, output} to trajectory.
- Repeat until final output is submitted or max iterations is reached.
- If max iterations is reached, run fallback extraction from accumulated trajectory.
This loop is shared in dspy_repl.core.base_rlm and specialized by language-specific wrappers.
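The steps above can be sketched as a toy loop. This is not the code in dspy_repl.core.base_rlm: run_rlm_loop, fake_propose, fake_execute, and the "final" key are hypothetical names, the LM and REPL are replaced by stubs, and the fallback simply reuses the last output.

```python
from typing import Callable


def run_rlm_loop(
    propose: Callable[[dict, list], dict],
    execute: Callable[[str], str],
    inputs: dict,
    max_iterations: int = 5,
) -> dict:
    """Toy version of the RLM loop; every name here is hypothetical."""
    # Build variable metadata from the inputs (here: type name and length only).
    metadata = {
        name: {"type": type(value).__name__, "length": len(str(value))}
        for name, value in inputs.items()
    }
    trajectory = []
    for _ in range(max_iterations):
        # Ask the "LM" for the next action (reasoning + code).
        action = propose(metadata, trajectory)
        # Execute the code in the target REPL.
        output = execute(action["code"])
        # Append the step to the trajectory.
        trajectory.append(
            {"reasoning": action["reasoning"], "code": action["code"], "output": output}
        )
        # Stop as soon as the action carries a final answer.
        if action.get("final") is not None:
            return {"answer": action["final"], "trajectory": trajectory}
    # Max iterations reached: fallback extraction from the accumulated trajectory.
    fallback = trajectory[-1]["output"] if trajectory else None
    return {"answer": fallback, "trajectory": trajectory}


def fake_propose(metadata: dict, trajectory: list) -> dict:
    # Stub LM: compute on the first turn, submit on the second.
    if not trajectory:
        return {"reasoning": "Add the numbers.", "code": "1 + 2", "final": None}
    return {"reasoning": "Report the result.", "code": "", "final": trajectory[-1]["output"]}


def fake_execute(code: str) -> str:
    # Stub REPL: knows a single expression and returns "" for anything else.
    return {"1 + 2": "3"}.get(code, "")


result = run_rlm_loop(fake_propose, fake_execute, {"context": "abc", "query": "sum"})
print(result["answer"], len(result["trajectory"]))
```

Keeping the LM call and the REPL call behind two plain callables mirrors the split the text describes between the shared loop and the language-specific wrappers.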
Architecture
- dspy_repl.core: shared execution loop and shared tool plumbing
- dspy_repl.languages: language-specific prompt templates and wrappers
- dspy_repl.interpreters: interpreter adapter exports
- dspy_repl.compat: thin compatibility shims for DSPy touchpoints
DSPy compatibility
Requires dspy>=3.0.0.
Runtime prerequisites
- SQLRLM: no external runtime (uses Python sqlite3)
- SchemeRLM: requires guile
- HaskellRLM: requires ghci (GHC)
- JavaScriptRLM: requires node
Install REPL runtimes
If you want to run all REPL-based engines and benchmark comparisons (including Python dspy.RLM), install:
- Python REPL engine in benchmarks (dspy.RLM): deno
- Scheme REPL engine (SchemeRLM): guile
- Haskell REPL engine (HaskellRLM): ghci from GHC
- JavaScript REPL engine (JavaScriptRLM): node
macOS (Homebrew):
brew install deno guile ghc node
Ubuntu/Debian:
sudo apt-get update
sudo apt-get install -y deno guile-3.0 ghc nodejs npm
Verify tools are available:
deno --version
guile --version
ghci --version
node --version
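The same check can be scripted from Python before starting a benchmark run. This is a minimal sketch that uses only the standard library; the REQUIRED_TOOLS mapping and the missing_runtimes helper are illustrative and not part of dspy-repl, and only the executable names come from the list above.

```python
import shutil

# Executable names from the prerequisites above; the mapping itself is illustrative.
REQUIRED_TOOLS = {
    "dspy.RLM (benchmarks)": "deno",
    "SchemeRLM": "guile",
    "HaskellRLM": "ghci",
    "JavaScriptRLM": "node",
}


def missing_runtimes(tools: dict[str, str]) -> dict[str, str]:
    """Return the engines whose executable is not found on PATH."""
    return {engine: exe for engine, exe in tools.items() if shutil.which(exe) is None}


missing = missing_runtimes(REQUIRED_TOOLS)
for engine, exe in missing.items():
    print(f"{engine}: '{exe}' not found on PATH")
if not missing:
    print("All REPL runtimes found.")
```

SQLRLM is left out because it relies on the sqlite3 module bundled with Python rather than an external executable.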
Python package dependencies for benchmarks
For Oolong benchmarks, you also need:
- dspy-repl (this package)
- dspy
- datasets (Hugging Face datasets loader used by Oolong adapter)
Example:
pip install -e ".[dev]" datasets
Benchmarking (Oolong dataset)
The repository includes an Oolong benchmark runner with artifact saving and trajectory diagnostics.
Run benchmarks:
python -m dspy_repl.benchmarks.oolong_runner --model "gemini/gemini-3-flash-preview" --languages "python,scheme,sql,haskell"
Run OOLONG-Pairs benchmarks:
python -m dspy_repl.benchmarks.oolong_pairs_runner --model "gemini/gemini-3-flash-preview" --languages "sql,scheme,js" --max-samples 20
Run S-NIAH synthetic scaling benchmarks:
python -m dspy_repl.benchmarks.niah_runner --languages "python,sql,scheme" --num-tasks 50 --context-lengths "8192,32768,131072"
Generate a single HTML analytics report (tables + Plotly charts + insights):
python -m dspy_repl.benchmarks.report_runner --run-dir benchmark_results/<run_id>
Compare several runs in one report:
python -m dspy_repl.benchmarks.report_runner --run-dirs benchmark_results/<id1>,benchmark_results/<id2>
Multiprocessing
By default, selected languages run in parallel per sample using multiprocessing.
- Enable explicitly: --parallel
- Disable: --no-parallel
- Cap processes: --max-workers 2
Example:
python -m dspy_repl.benchmarks.oolong_runner --languages "scheme,sql,haskell" --max-workers 2
Useful benchmark flags
- --max-samples 20
- --sample-id <id>
- --engine-timeout-seconds 240
- --verbose
- --save-dir benchmark_results
- --config ./benchmark.json
Where results are saved
Each run creates a timestamped directory under save_dir with:
- benchmark.log: structured lifecycle logs
- run_config.json: effective run config
- incremental_results.jsonl: live per-sample writes (if enabled)
- results.jsonl: per-sample records with trajectory diagnostics
- summary.json and by_engine.csv: aggregate metrics
- trajectory_stats.json and per_engine_trajectory_stats.json
- trajectories/<engine>/<sample_id>.json: full trajectories
To inspect one execution deeply, start with a trajectory file and then correlate with the same sample in results.jsonl and benchmark.log.
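One way to do that correlation is to load results.jsonl and filter by sample. The sketch below assumes each record stores its identifier under a "sample_id" key; that field name is a guess, so check a real record and pass a different key if needed. Both helpers are illustrative and not part of dspy-repl.

```python
import json
from pathlib import Path


def load_jsonl(path: Path) -> list[dict]:
    """Parse a JSON Lines file into a list of dicts, skipping blank lines."""
    with path.open(encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]


def records_for_sample(records: list[dict], sample_id: str, key: str = "sample_id") -> list[dict]:
    """Return the records for one sample.

    Assumption: each record stores its identifier under `key`.
    """
    return [record for record in records if str(record.get(key)) == sample_id]
```

For example, records_for_sample(load_jsonl(Path("benchmark_results/<run_id>/results.jsonl")), "<id>") would return that sample's records, which can then be read next to trajectories/<engine>/<sample_id>.json.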
Full benchmark usage guide: BENCHMARKS.md.
Local validation before release
python -m build
PYTEST_DISABLE_PLUGIN_AUTOLOAD=1 pytest -q
python -m twine check --strict dist/*
Backlog
- Add shared context with PostgreSQL/MySQL.
- Test shared context in a multi-agent environment.
- Extend benchmarks with additional long-context suites.
- Optimize REPL instructions with GEPA.