Nano: A minimal, zero-frills coding-agent for research on agent-in-the-loop training
Nano
A minimal coding‑agent for:
- agent‑in‑the‑loop reinforcement learning
- understanding coding agents in clear, minimal terms
- running neat little code fixes
What it is
Nano is a zero‑bloat wrapper that turns any tool-enabled LLM into a coding agent with two tools:
shell(cmd) # ls, cat, grep …
apply_patch({...}) # search/replace on one file
Note: Nano uses rbash (restricted bash) to confine the agent to its designated workspace. This, along with Nano's requirement of starting in a clean git repository, helps ensure its operations remain contained and predictable.
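To make the "search/replace on one file" idea concrete, here is a minimal sketch of what such a patch operation could look like. It is not Nano's implementation: the function name `search_replace`, its arguments, and the rule that the search string must match exactly once are all assumptions made for illustration.

```python
from pathlib import Path
import tempfile


def search_replace(path: Path, search: str, replace: str) -> bool:
    """Replace exactly one occurrence of `search` in the file at `path`.

    Hypothetical helper: returns False and leaves the file untouched when
    `search` is missing or ambiguous, so the caller can report the failure
    back to the model instead of guessing.
    """
    text = path.read_text()
    if text.count(search) != 1:
        return False
    path.write_text(text.replace(search, replace, 1))
    return True


# Usage on a throwaway file
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "example.py"
    target.write_text("def add(a, b):\n    return a - b\n")
    ok = search_replace(target, "return a - b", "return a + b")
    patched = target.read_text()
```

Rejecting ambiguous matches is one possible design choice: it keeps every edit explicit, at the cost of asking the model for a longer search string when a snippet appears more than once.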
Why it exists
Most coding agents (e.g. Aider, SWE-Agent, Devin) are designed to perform well. To achieve that, they bake in layers of effective but ad-hoc solutions:
repo maps, navigation memory, multi-agent orchestration, adherence prompting, retry logic, ...
These make agents more capable, but also more opaque. They're hard to analyze, and thus harder to adopt.
Nano takes the opposite stance:
Inspired by The Bitter Lesson, we believe that long-term performance comes not from encoding human intuitions, but from letting models learn their own strategies, even if they start out worse.
Effective reinforcement learning relies on a complete and unaltered log of agent interactions. Nano ensures this transparency by providing direct, non-obfuscated access to the raw reasoning, tool calls, and results, offering a precise record of what the model saw and did.
That's what Nano tries to provide.
Install
git clone git@github.com:ASSERT-KTH/nano-agent.git && cd nano-agent && pip install -e .
# or
pip install nano-agent
Then you just need an API key for your chosen provider, or you can host a model yourself with vLLM. See the litellm documentation for more details.
Example: rollout to Tensor
from nano import Agent
from transformers import AutoTokenizer
agent = Agent(model="openai/gpt-4.1-mini")
agent.run("There is a bug in this repo...")
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-8B")
tokens = tokenizer.apply_chat_template(
    agent.messages,
    tools=agent.tools,
    tokenize=True,
    return_tensors="pt"  # return a PyTorch tensor of token ids
)
Example: minimal SWE‑Gym rollout
import tempfile
from git import Repo # git-python
from nano import Agent
from datasets import load_dataset
run = load_dataset("SWE-Gym/SWE-Gym", split="train[:1]")[0]
tempdir = tempfile.mkdtemp()
Repo.clone_from(f"https://github.com/{run['repo']}.git", tempdir)
agent = Agent(
    model="hosted_vllm/qwen/qwen3-8b",
    api_base="http://localhost:8000/v1",
)
diff = agent.run(run["problem_statement"], repo_root=tempdir)
print(diff) # the unified diff produced by the agent
print(agent.messages, agent.tools) # or inspect the logs under ~/.nano/<timestamp>/
Use with HuggingFace TRL
Because Nano can communicate with any tool-enabled, OpenAI-compatible endpoint and produces token-level message logs, it works "cleanly" as a data generator inside TRL's GRPOTrainer.
Note: "cleanly" refers to modifications made in our TRL fork to enable direct agent integration. These changes support the CodeRepairRL project but may not be merged into the main HuggingFace repository.
To use it:
- Write a rollout client that wraps Agent.run()
- Extract the diff and messages for each training example
- Feed those into TRL's reward modeling or fine-tuning pipelines
This lets you train models that learn to use tools directly, grounded in interaction data — no custom env needed.
This approach acknowledges that the agent may initially fail in certain situations; however, these failures are valuable learning opportunities. We can then directly reinforce favorable behaviors and successful outcomes using outcome supervision, progressively refining the agent's strategies.
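The rollout client described in the steps above could be sketched roughly as follows. This is an illustration under stated assumptions, not code from Nano or the TRL fork: `AgentLike`, `Rollout`, `rollout`, and `reward_fn` are hypothetical names, and the only agent features relied on are the ones shown in the earlier examples (`run(task, repo_root=...)` returning a diff, plus the `messages` and `tools` attributes).

```python
from dataclasses import dataclass
from typing import Callable, Protocol


class AgentLike(Protocol):
    """The parts of the agent interface used in the examples above."""
    messages: list
    tools: list

    def run(self, task: str, repo_root: str) -> str: ...


@dataclass
class Rollout:
    diff: str
    messages: list
    tools: list
    reward: float


def rollout(
    make_agent: Callable[[], AgentLike],
    task: str,
    repo_root: str,
    reward_fn: Callable[[str], float],
) -> Rollout:
    """Run one episode and score only its final diff (outcome supervision)."""
    agent = make_agent()  # fresh agent per episode so message logs do not mix
    diff = agent.run(task, repo_root=repo_root)
    return Rollout(diff, list(agent.messages), list(agent.tools), reward_fn(diff))
```

With Nano installed, `make_agent` could be something like `lambda: Agent(model="openai/gpt-4.1-mini")`, and `reward_fn` would be whatever outcome signal the training run uses. Taking the agent as a factory keeps the sketch independent of any particular model or endpoint.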
Citation
@misc{nano-agent2025,
author = {Bjarni Haukur},
title = {Nano: a minimalist coding agent for agent-in-the-loop training},
howpublished = {\url{https://github.com/ASSERT-KTH/nano-agent}},
year = {2025}
}
🏆 Current Leaderboard
Performance on SWE-bench Lite subset, ranked by code similarity
| # | Ver | Model | Code Sim | Test Sim | Tokens | Tools |
|---|---|---|---|---|---|---|
| 1 | v3.2.0 | claude-sonnet-4-20250514 | 0.394 | 0.188 | 14,746 / 16,384 | 41.5 / 100 |
| 2 | v3.2.0 | gpt-4.1 | 0.387 | 0.092 | 9,777 / 16,384 | 35.7 / 100 |
| 3 | v3.2.0 | gemini-2.5-pro-preview-thinking | 0.370 | 0.034 | 6,008 / 16,384 | 13.6 / 100 |
| 4 | v3.2.0 | gemini-2.5-flash-preview-05-20 | 0.362 | 0.000 | 4,547 / 16,384 | 10.1 / 100 |
| 5 | v3.2.0 | gpt-4.1-mini | 0.350 | 0.017 | 7,403 / 16,384 | 29.7 / 100 |
| 6 | v3.2.0 | deepseek-chat | 0.336 | 0.011 | 3,297 / 16,384 | 7.5 / 100 |
| 7 | v3.2.0 | qwen-2.5-72b-instruct | 0.272 | 0.000 | 5,873 / 16,384 | 35.1 / 100 |
| 8 | v3.2.0 | qwen3-32b | 0.255 | 0.000 | 5,281 / 16,384 | 28.3 / 100 |
| 9 | v3.2.0 | llama-4-maverick | 0.255 | 0.000 | 4,647 / 16,384 | 10.4 / 100 |
| 10 | v3.2.0 | qwen3-14b-thinking | 0.253 | 0.000 | 8,549 / 16,384 | 16.3 / 100 |
| 11 | v3.2.0 | qwen3-32b-thinking | 0.224 | 0.005 | 9,357 / 16,384 | 8.3 / 100 |
| 12 | v3.2.0 | qwen3-8b-thinking | 0.210 | 0.000 | 8,688 / 16,384 | 15.0 / 100 |
| 13 | v3.2.0 | qwen3-8b | 0.190 | 0.000 | 8,704 / 16,384 | 56.5 / 100 |
| 14 | v3.2.0 | gpt-4.1-nano | 0.188 | 0.000 | 8,536 / 16,384 | 33.1 / 100 |
| 15 | v3.2.0 | qwen3-14b | 0.176 | 0.000 | 10,800 / 16,384 | 82.6 / 100 |
| 16 | v3.2.0 | devstral-small | 0.092 | 0.000 | 14,603 / 16,384 | 13.0 / 100 |
How it works:
- Input: A GitHub repository containing a bug with a known ground truth solution
- Task: Nano provides models with tools to explore the codebase and generate a fix
- Output: Nano produces a unified git diff containing all proposed code changes
- Evaluation: We measure how closely the model's solution matches the ground truth using:
  - Code Similarity: How well the fix matches the actual bug fix (primary ranking metric)
  - Test Similarity: How well any test changes match the ground truth test updates
Note: These numbers are prone to a lot of noise, since the test set is small and each model was run only a few times.
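The README does not say how the similarity scores are computed. As a rough illustration only, a line-level similarity between a predicted diff and a ground-truth diff can be sketched with Python's standard `difflib`. The function name `diff_similarity` and the choice of `SequenceMatcher` are assumptions and may differ from the metric behind the leaderboard.

```python
from difflib import SequenceMatcher


def diff_similarity(predicted: str, reference: str) -> float:
    """Return a score in [0, 1]; 1.0 means the two diffs have identical lines.

    Illustrative stand-in: compares the diffs line by line and returns
    2 * matching_lines / total_lines, as defined by SequenceMatcher.ratio().
    """
    matcher = SequenceMatcher(
        None, predicted.splitlines(), reference.splitlines(), autojunk=False
    )
    return matcher.ratio()


same = diff_similarity("-return a - b\n+return a + b", "-return a - b\n+return a + b")
half = diff_similarity("-return a - b\n+return a + b", "-return a - b\n+return b + a")
```

Here `same` is 1.0 because both diffs match exactly, while `half` is 0.5 because only one of the two lines on each side matches.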