Skip to main content

Nano: A minimal, zero-frills coding-agent for research on agent-in-the-loop training

Project description

Nano

A minimal coding‑agent for:

  1. agent‑in‑the‑loop reinforcement learning
  2. understanding coding agents in clear, minimal terms
  3. running neat little code fixes

What it is

Nano is a zero‑bloat wrapper that turns any tool-enabled LLM into a coding agent with two tools:


shell(cmd)  # ls, cat, grep …
apply_patch({...})  # search/replace on one file

Note: Nano uses rbash (restricted bash) to confine the agent to its designated workspace. This, along with Nano's requirement of starting in a clean git repository, helps ensure its operations remain contained and predictable.

Nothing else.


Why it exists

Most coding agents (e.g. Aider, SWE-Agent, Devin) are designed to perform well. To achieve that, they bake in layers of effective but ad-hoc solutions:
repo maps, navigation memory, multi agent orchestration, adherence prompting, retry logic,...

These make agents more capable, but also more opaque. They're hard to analyze, and thus harder to adopt.

Nano takes the opposite stance:

Inspired by The Bitter Lesson, we believe that long-term performance comes not from encoding human intuitions, but from letting models learn their own strategies, even if they start out worse.

Effective reinforcement learning relies on a complete and unaltered log of agent interactions. Nano ensures this transparency by providing direct, non-obfuscated access to the raw reasoning, tool calls, and results, offering a precise record of what the model saw and did.

That's what Nano tries to provide.


Install

git clone git@github.com:ASSERT-KTH/nano-agent.git && cd nano-agent && pip install -e .
# or
pip install nano-agent

Then you just need an API key for your chosen provider or host them yourself with vLLM. See litellm documentation for more details.


Example: rollout to Tensor

from nano import Agent
from transformers import AutoTokenizer

agent = Agent(model="openrouter/qwen/qwen3-8b", thinking=True)
agent.run("There is a bug in this repo...")

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-8B")
tokens = tokenizer.apply_chat_template(
  agent.messages,
  tools=agent.tools,
  tokenize=True,
  return_format="pt"
)

Example: minimal SWE‑Gym rollout

import tempfile
from git import Repo  # git-python
from nano import Agent
from datasets import load_dataset

run = load_dataset("SWE-Gym/SWE-Gym", split="train[:1]")[0]

tempdir = tempfile.mkdtemp()
Repo.clone_from(f"https://github.com/{run['repo']}.git", tempdir)

agent = Agent(
    model="hosted_vllm/qwen/qwen3-8b",
    api_base="http://localhost:8000/v1",
    thinking=True  # enables <think> ... </think> reasoning blocks
)
diff = agent.run(run["problem_statement"], repo_root=tempdir)
print(diff)  # the unified diff produced by the agent
print(agent.messages, agent.tools)  # or access in `.nano/<timestamp>/

Use with HuggingFace TRL

Because Nano can communicate with any tool-enabled OpenAI compatible endpoint and produces token-level message logs, it works "cleanly" as a data generator inside TRL's GPROTrainer.

Note: "cleanly" refers to modifications made in our TRL fork to enable direct agent integration. These changes support the CodeRepairRL project but may not be merged into the main HuggingFace repository.

To use it:

  • Write a rollout client that wraps Agent.run()
  • Extract the diff and messages for each training example
  • Feed those into TRL's reward modeling or fine-tuning pipelines

This lets you train models that learn to use tools directly, grounded in interaction data — no custom env needed.

This approach acknowledges that the agent may initially fail in certain situations; however, these failures are valuable learning opportunities. We can then directly reinforce favorable behaviors and successful outcomes using outcome supervision, progressively refining the agent's strategies.


Citation

@misc{nano-agent2025,
  author       = {Bjarni Haukur},
  title        = {Nano: a minimalist coding agent for agent-in-the-loop training},
  howpublished = {\url{https://github.com/ASSERT-KTH/nano-agent}},
  year         = {2025}
}

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

nano_agent-0.2.1.tar.gz (10.9 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

nano_agent-0.2.1-py3-none-any.whl (9.8 kB view details)

Uploaded Python 3

File details

Details for the file nano_agent-0.2.1.tar.gz.

File metadata

  • Download URL: nano_agent-0.2.1.tar.gz
  • Upload date:
  • Size: 10.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.11.11

File hashes

Hashes for nano_agent-0.2.1.tar.gz
Algorithm Hash digest
SHA256 c8bfc59c1159dfe6c035d0d089e085fc9045907ea9a52d3a8d8e8fb7c8663d23
MD5 599bdf52da54d7433eb58404b333593a
BLAKE2b-256 b1ad69158bfb27fbb69b912e8d76d14f9bb13868035053ff5e961c3c7682da1a

See more details on using hashes here.

File details

Details for the file nano_agent-0.2.1-py3-none-any.whl.

File metadata

  • Download URL: nano_agent-0.2.1-py3-none-any.whl
  • Upload date:
  • Size: 9.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.11.11

File hashes

Hashes for nano_agent-0.2.1-py3-none-any.whl
Algorithm Hash digest
SHA256 a0d92adef254963fd8409503b23da57cfc2a888674588eb58bfe48a74b0ae63c
MD5 3a7ef17c50e9b1c55e9c3d55a5d13768
BLAKE2b-256 2ff7ace48b1234b349c0f363567226becbc9853c27b60a327890bacad2bcc338

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page