Skip to main content

Nano: A minimal, zero-frills coding-agent for research on agent-in-the-loop training

Project description

Nano
Ask DeepWiki

A minimal coding‑agent for:

  1. agent‑in‑the‑loop reinforcement learning
  2. understanding coding agents in clear, minimal terms
  3. running neat little code fixes

What it is

Nano is a zero‑bloat wrapper that turns any tool-enabled LLM into a coding agent with two tools:


shell(cmd)  # ls, cat, grep …
apply_patch({...})  # search/replace on one file

Note: Nano uses rbash (restricted bash) to confine the agent to its designated workspace. This, along with Nano's requirement of starting in a clean git repository, helps ensure its operations remain contained and predictable.


Why it exists

Most coding agents (e.g. Aider, SWE-Agent, Devin) are designed to perform well. To achieve that, they bake in layers of effective but ad-hoc solutions:
repo maps, navigation memory, multi agent orchestration, adherence prompting, retry logic,...

These make agents more capable, but also more opaque. They're hard to analyze, and thus harder to adopt.

Nano takes the opposite stance:

Inspired by The Bitter Lesson, we believe that long-term performance comes not from encoding human intuitions, but from letting models learn their own strategies, even if they start out worse.

Effective reinforcement learning relies on a complete and unaltered log of agent interactions. Nano ensures this transparency by providing direct, non-obfuscated access to the raw reasoning, tool calls, and results, offering a precise record of what the model saw and did.

That's what Nano tries to provide.


Install

git clone git@github.com:ASSERT-KTH/nano-agent.git && cd nano-agent && pip install -e .
# or
pip install nano-agent

Then you just need an API key for your chosen provider or host them yourself with vLLM. See litellm documentation for more details.

For OpenRouter (the default provider), you'll need to set your API key as an environment variable:

export OPENROUTER_API_KEY="your-api-key-here"

You can get an API key from OpenRouter.

For other providers, set the appropriate environment variable according to LiteLLM's documentation:

  • OPENAI_API_KEY for OpenAI models
  • ANTHROPIC_API_KEY for Anthropic models
  • GEMINI_API_KEY for Google Gemini models
  • etc.

Command Line Usage

Once installed, you can use the nano_agent command to run the agent directly from the command line:

nano_agent "Fix the bug in this repository" --model openai/gpt-4o-mini

The command accepts several options:

  • task (required): Natural-language description of what the agent should do
  • --path: Repository root (defaults to current directory)
  • --model: Model identifier in LiteLLM format (default: "openrouter/qwen/qwen3-coder")
  • --api_base: Base URL for API endpoint, useful for local servers
  • --token_limit: Size of the context window in tokens (default: 32768)
  • --tool_limit: Maximum number of tool calls the agent can make (default: 50)
  • --time_limit: Maximum execution time in seconds (default: 120)
  • --response_limit: Maximum tokens per completion response (default: 4096)
  • --thinking: Emit reasoning blocks (requires compatible models)
  • --temperature: Sampling temperature (default: 0.7)
  • --top_p: Nucleus-sampling cutoff (default: 0.95)
  • --min_p: Relative floor for nucleus sampling
  • --top_k: Top-k sampling cutoff
  • --verbose: Stream tool calls as they happen
  • --no-log: Disable logging of agent activity to file

Example: rollout to Tensor

from nano import Agent
from transformers import AutoTokenizer

agent = Agent(model="openai/gpt-4.1-mini")
agent.run("There is a bug in this repo...")

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-8B")
tokens = tokenizer.apply_chat_template(
  agent.messages,
  tools=agent.tools,
  tokenize=True,
  return_format="pt"
)

Example: minimal SWE‑Gym rollout

import tempfile
from git import Repo  # git-python
from nano import Agent
from datasets import load_dataset

run = load_dataset("SWE-Gym/SWE-Gym", split="train[:1]")[0]

tempdir = tempfile.mkdtemp()
Repo.clone_from(f"https://github.com/{run['repo']}.git", tempdir)

agent = Agent(
    model="hosted_vllm/qwen/qwen3-8b",
    api_base="http://localhost:8000/v1",
)
diff = agent.run(run["problem_statement"], repo_root=tempdir)
print(diff)  # the unified diff produced by the agent
print(agent.messages, agent.tools)  # or access in `~/.nano/<timestamp>/

Use with HuggingFace TRL

Because Nano can communicate with any tool-enabled OpenAI compatible endpoint and produces token-level message logs, it works "cleanly" as a data generator inside TRL's GPROTrainer.

Note: "cleanly" refers to modifications made in our TRL fork to enable direct agent integration. These changes support the CodeRepairRL project but may not be merged into the main HuggingFace repository.

To use it:

  • Write a rollout client that wraps Agent.run()
  • Extract the diff and messages for each training example
  • Feed those into TRL's reward modeling or fine-tuning pipelines

This lets you train models that learn to use tools directly, grounded in interaction data — no custom env needed.

This approach acknowledges that the agent may initially fail in certain situations; however, these failures are valuable learning opportunities. We can then directly reinforce favorable behaviors and successful outcomes using outcome supervision, progressively refining the agent's strategies.


Citation

@misc{nano-agent2025,
  author       = {Bjarni Haukur},
  title        = {Nano: a minimalist coding agent for agent-in-the-loop training},
  howpublished = {\url{https://github.com/ASSERT-KTH/nano-agent}},
  year         = {2025}
}

🏆 Current Leaderboard

Performance on SWE-bench Lite subset, ranked by code similarity

# Ver Model Code Sim Test Sim Tokens Tools
1 v3.2.0 claude-sonnet-4-20250514 0.394 0.188 14,746 / 16,384 41.5 / 100
2 v3.2.0 gpt-4.1 0.387 0.092 9,777 / 16,384 35.7 / 100
3 v4.0.1 kimi-k2 0.382 0.009 5,508 / 16,384 19.7 / 100
4 v4.0.1 qwen3-coder 0.374 0.042 6,979 / 16,384 26.5 / 100
5 v3.2.0 gemini-2.5-pro-preview 0.370 0.034 6,008 / 16,384 13.6 / 100
6 v3.3.0 gemini-2.5-flash 0.363 0.022 4,337 / 16,384 13.2 / 100
7 v3.2.0 gemini-2.5-flash-preview-05-20 0.362 0.000 4,547 / 16,384 10.1 / 100
8 v3.2.0 gpt-4.1-mini 0.350 0.017 7,403 / 16,384 29.7 / 100
9 v3.2.0 deepseek-chat 0.336 0.011 3,297 / 16,384 7.5 / 100
10 v4.0.1 glm-4.5 0.323 0.107 12,477 / 16,384 28.7 / 100
11 v3.2.0 qwen-2.5-72b-instruct 0.272 0.000 5,873 / 16,384 35.1 / 100
12 v3.2.0 qwen3-32b 0.255 0.000 5,281 / 16,384 28.3 / 100
13 v3.2.0 llama-4-maverick 0.255 0.000 4,647 / 16,384 10.4 / 100
14 v3.2.0 qwen3-14b-thinking 0.253 0.000 8,549 / 16,384 16.3 / 100
15 v3.3.0 gemini-2.5-flash-lite-preview-06-17 0.243 0.005 6,294 / 16,384 21.6 / 100
16 v3.2.0 qwen3-32b-thinking 0.224 0.005 9,357 / 16,384 8.3 / 100
17 v3.2.0 qwen3-8b-thinking 0.210 0.000 8,688 / 16,384 15.0 / 100
18 v3.2.0 qwen3-8b 0.190 0.000 8,704 / 16,384 56.5 / 100
19 v3.2.0 gpt-4.1-nano 0.188 0.000 8,536 / 16,384 33.1 / 100
20 v3.2.0 qwen3-14b 0.176 0.000 10,800 / 16,384 82.6 / 100
21 v3.2.0 devstral-small 0.092 0.000 14,603 / 16,384 13.0 / 100

🏆 SWE-bench Verified Leaderboard

Performance on SWE-bench Verified subset, ranked by code similarity

# Ver Model Code Sim Test Sim Tokens Tools
1 v3.3.2 gemini-2.5-flash 0.353 0.000 5,956 / 16,384 12.5 / 100

How it works:

  • Input: A GitHub repository containing a bug with a known ground truth solution
  • Task: Nano provides models with tools to explore the codebase and generate a fix
  • Output: Nano produces a unified git diff containing all proposed code changes
  • Evaluation: We measure how closely the model's solution matches the ground truth using:
    • Code Similarity: How well the fix matches the actual bug fix (primary ranking metric)
    • Test Similarity: How well any test changes match the ground truth test updates

Note: Prone to a lot of noise, small test set with few repetitions.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

nano_agent-5.0.0.tar.gz (33.6 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

nano_agent-5.0.0-py3-none-any.whl (34.7 kB view details)

Uploaded Python 3

File details

Details for the file nano_agent-5.0.0.tar.gz.

File metadata

  • Download URL: nano_agent-5.0.0.tar.gz
  • Upload date:
  • Size: 33.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.10

File hashes

Hashes for nano_agent-5.0.0.tar.gz
Algorithm Hash digest
SHA256 821cd88ef72e8bacf8be25fe3e146f0bfc1b1eb696b1692b8f149148942a1238
MD5 05dcadf100b3d750654cf8cca49688aa
BLAKE2b-256 897e59724e5d301fbe6400bd6ce3283b59f11036cec57efc51e45a155c29a340

See more details on using hashes here.

File details

Details for the file nano_agent-5.0.0-py3-none-any.whl.

File metadata

  • Download URL: nano_agent-5.0.0-py3-none-any.whl
  • Upload date:
  • Size: 34.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.10

File hashes

Hashes for nano_agent-5.0.0-py3-none-any.whl
Algorithm Hash digest
SHA256 52fb379441dae9a818431c95b35879f9821458165ecbed328b1fcbf1a6f6a24d
MD5 6ec508c4d9f7986287e80d775b01870d
BLAKE2b-256 169505b2e1b0bcfb968d96db16bdf2e401d0dada2a92840d4f9e071b285c61ec

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page