Terminal-first control plane for NVIDIA Nemotron 3 — agentic coding, RAG, doc-ops, and multi-model formations.
Project description
NeMoCode
Agentic coding CLI for NVIDIA NIM. Reads your code, makes edits, runs commands — powered by any model on the NIM API or your own GPU via vLLM, SGLang, or TensorRT-LLM.
Community project — not affiliated with or endorsed by NVIDIA.
Install
From PyPI:
pip install nemocode
Recommended (isolated environment):
pipx install nemocode
Or from source (editable):
pip install -e .
Quick Start
Run the guided setup wizard:
nemo setup
The wizard defaults to hosted NVIDIA NIM, prompts for NVIDIA_API_KEY, and can also configure a local vllm, sglang, or trt-llm backend for you.
Hosted NVIDIA NIM (default)
Get a free API key from build.nvidia.com:
export NVIDIA_API_KEY="nvapi-..."
nemo code
Hosted Nemotron endpoints use NVIDIA_API_KEY by default. The setup wizard can store it in your system keyring.
Local vLLM, SGLang, or TensorRT-LLM
Serve a model locally on any NVIDIA GPU:
# vLLM
vllm serve nvidia/NVIDIA-Nemotron-Nano-9B-v2 \
--host 0.0.0.0 --port 8000
nemo code -e local-vllm-nano9b
# SGLang (best for Nemotron 3 Super long context on DGX Spark)
python -m sglang.launch_server \
--model nvidia/nemotron-3-super-120b-a12b \
--host 0.0.0.0 --port 8000
nemo code -e local-sglang-super
# TensorRT-LLM
docker run --rm -it --gpus all --ipc host --network host \
-e HF_TOKEN=$HF_TOKEN \
-v $HOME/.cache/huggingface/:/root/.cache/huggingface/ \
nvcr.io/nvidia/tensorrt-llm/release:1.3.0rc6 \
trtllm-serve nvidia/NVIDIA-Nemotron-3-Nano-4B-FP8 \
--trust_remote_code --port 8000
nemo code -e local-trt-llm-nano4b
TensorRT-LLM launch flags can vary by image release and whether a model needs a prebuilt
engine. The bundled NeMoCode presets assume the OpenAI-compatible trtllm-serve path with
nvidia/nemotron-3-super-120b-a12b and nvidia/NVIDIA-Nemotron-3-Nano-4B-FP8.
No GPU? Rent one via Brev:
nemo setup brev
Usage
nemo code # interactive REPL
nemo code "fix the bug in auth.py" -y # one-shot, auto-approve tools
nemo chat "explain this error" # chat, no tools
cat log.txt | nemo code "diagnose" # pipe input
nemo code -f super-nano "refactor" # multi-model formation
nemo code --tui # full-screen TUI
Plan Mode
Plan mode is a read-only planning phase with an approval gate before execution.
- Read-only: Plan mode only reads files, searches code, and explores — no writes, shell commands, or commits.
- Approval gate: The planner proposes a concrete plan. You review and approve, revise with feedback, or cancel.
- Execution: Once approved, a build agent executes the plan with full tool access.
Switch modes in the REPL with Tab or /mode:
| Mode | Behavior |
|---|---|
code |
Ask before tool calls (default) |
plan |
Read-only planning + approval gate |
auto |
Auto-approve everything |
Launch directly in plan mode:
nemo code --agent plan "implement user auth"
The plan agent can also spawn read-only research subagents to help with exploration.
Endpoints
Works with any OpenAI-compatible API. Pre-configured:
| Endpoint | Model | Access |
|---|---|---|
nim-super |
Nemotron 3 Super (12B/120B MoE, 1M ctx) | NIM API key |
nim-nano |
Nemotron 3 Nano (3B/30B MoE, 1M ctx) | NIM API key |
nim-nano-9b |
Nemotron Nano 9B v2 | NIM API key |
nim-nano-4b |
Nemotron Nano 4B v1.1 | NIM API key |
nim-vlm |
Nemotron Nano 12B VL (vision) | NIM API key |
nim-embed |
Nemotron Embed 1B v2 | NIM API key |
nim-rerank |
Nemotron Rerank 1B v2 | NIM API key |
openrouter-super |
Super via OpenRouter | OpenRouter key |
together-super |
Super via Together AI | Together key |
local-trt-llm-super |
Nemotron 3 Super 120B via TensorRT-LLM | GPU + Docker + TensorRT-LLM |
local-trt-llm-nano4b |
Nemotron 3 Nano 4B FP8 via TensorRT-LLM | GPU + Docker + TensorRT-LLM |
local-vllm-* |
Any model on local vLLM | GPU + vLLM |
local-sglang-* |
Any model on local SGLang | GPU + SGLang |
local-nim-* |
Local NIM container | GPU + Docker |
Formations
Multi-model pipelines — Super plans, Nano executes, Super reviews:
nemo code -f super-nano "implement caching"
| Formation | Pipeline |
|---|---|
solo |
Super does everything (default) |
super-nano |
Super plans + reviews, Nano executes |
spark |
All-local on DGX Spark (Super + Nano 9B) |
spark-sglang |
Super via SGLang on Spark (best long context) |
spark-trt-llm |
Super 120B + Nano 4B via TensorRT-LLM on Spark |
vision |
VLM reads screenshots, Super writes code |
local |
Nano on local GPU, no internet needed |
Agents & Sub-agent Orchestration
NeMoCode supports named agent profiles for top-level sessions and delegated sub-agents.
- Primary agents:
build(default full-access),plan(read-only planning) - Sub-agents:
general,explore,review,debug,test,doc,code-search,fast - Inspect them with
nemo agent lsandnemo agent show <name> - Switch primary agents with
nemo code --agent <name>or/agent <name>in the REPL/TUI
Sub-agent tools
In coding sessions, these orchestration tools are available:
| Tool | Purpose |
|---|---|
delegate |
Spawn a sub-agent and wait for the result |
spawn_agent |
Spawn a background sub-agent for parallel work |
wait_agent |
Wait for a spawned sub-agent to finish |
close_agent |
Close or cancel a sub-agent handle |
resume_agent |
Reopen a previously closed sub-agent handle |
Sub-agents inherit read-only mode when delegated from plan mode.
Custom agents
Define custom agents in .nemocode.yaml under agents: or as markdown files under .nemocode/agents/*.md:
---
description: Review code for bugs and regressions
mode: subagent
role: reviewer
prefer_tiers:
- super
tools:
- fs_read
- git_read
- rg
---
Review the requested changes. Focus on correctness, regressions, and missing tests.
Setup Commands
nemo setup # guided wizard
nemo setup --list # show all setup topics
nemo setup wizard # force the interactive wizard
nemo setup trt-llm # TensorRT-LLM serving guide
nemo setup vllm # vLLM serving guide
nemo setup sglang # SGLang serving guide
nemo setup nim # NIM container guide
nemo setup brev # rent a cloud GPU
More Commands
nemo endpoint ls / test # manage endpoints
nemo model ls / show # inspect model manifests
nemo formation ls / show # inspect formations
nemo agent ls / show # inspect agent profiles
nemo hardware recommend # GPU-based recommendations
nemo doctor # run diagnostics to check setup
nemo session ls # past conversations
nemo obs pricing # token pricing
nemo init # create .nemocode.yaml without overriding user defaults
Contributing
See CONTRIBUTING.md for setup, code style, and PR guidelines.
pip install -e ".[dev]"
ruff check src/ tests/ && ruff format --check src/ tests/
pytest tests/ -v
License
MIT. NVIDIA, Nemotron, and NIM are trademarks of NVIDIA Corporation.
Architecture
User Input (CLI / TUI)
|
CodeAgent (orchestrator)
| agent profiles, project context, git state, memories
v
Scheduler (formation pipeline driver)
| single-model or plan/execute/review formation
| stagnation detection, auto-compaction, permission engine
v
Registry (endpoint / manifest / formation resolver)
|
v
Providers (NIM Chat, Embeddings, Rerank)
| OpenAI-compatible API, SSE streaming, retry with backoff
v
ToolRegistry (18+ tools)
- fs: read, write, edit, multi_edit
- git: status, diff, log, commit, snapshot
- search: rg, glob
- bash: shell command execution
- agent: delegate, spawn, wait, close, resume
- memory: save, recall
- web, parse, test, ask_user, clarify, LSP, MCP
Comparison
| Feature | NeMoCode | Claude Code | Cursor | OpenCode |
|---|---|---|---|---|
| Open source | MIT | No | No | MIT |
| Terminal-first | Yes | Yes | IDE | Yes |
| Multi-model formations | Yes | No | No | No |
| Local GPU serving | vLLM, SGLang, TRT-LLM | No | No | No |
| Hardware detection | Yes (GPU/RAM/Spark) | No | No | No |
| 1M token context | Yes (Nemotron 3) | 200K | 128K | Varies |
| Sub-agent orchestration | Yes | No | No | Yes |
| LSP integration | Yes | No | Yes | Yes |
| Vision (screenshots) | Yes (VLM) | Yes | Yes | Yes |
| Plugin system | Yes | No | No | Yes |
| NVIDIA NIM native | Yes | No | No | No |
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file nemocode-0.1.25.tar.gz.
File metadata
- Download URL: nemocode-0.1.25.tar.gz
- Upload date:
- Size: 274.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
6f3aa0e975613f2e5702bc7f278904fe38550d378addbb2d05c681af676dfe02
|
|
| MD5 |
f06ac9f19c4207a0c6bf67937d82bce9
|
|
| BLAKE2b-256 |
aee32d7183c5161c02c8441e7cbd6eaffac54b8f45d2f19470b899be06550f9b
|
Provenance
The following attestation bundles were made for nemocode-0.1.25.tar.gz:
Publisher:
publish.yml on Hmbown/NeMoCode
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
nemocode-0.1.25.tar.gz -
Subject digest:
6f3aa0e975613f2e5702bc7f278904fe38550d378addbb2d05c681af676dfe02 - Sigstore transparency entry: 1155140054
- Sigstore integration time:
-
Permalink:
Hmbown/NeMoCode@c92623471a814cc209120a93462717cc13393d53 -
Branch / Tag:
refs/tags/v0.1.25 - Owner: https://github.com/Hmbown
-
Access:
public
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
publish.yml@c92623471a814cc209120a93462717cc13393d53 -
Trigger Event:
push
-
Statement type:
File details
Details for the file nemocode-0.1.25-py3-none-any.whl.
File metadata
- Download URL: nemocode-0.1.25-py3-none-any.whl
- Upload date:
- Size: 209.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
fc9374145f1da0869e3d49d12bb1e4dc274b1addd9c2c649703343a740e55397
|
|
| MD5 |
6509d72325480c022627814fb620906b
|
|
| BLAKE2b-256 |
f5cc5c9a7ee14e20e39dfd83d8ddb6e3ec258153936f7618a87d1ef70bc1425d
|
Provenance
The following attestation bundles were made for nemocode-0.1.25-py3-none-any.whl:
Publisher:
publish.yml on Hmbown/NeMoCode
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
nemocode-0.1.25-py3-none-any.whl -
Subject digest:
fc9374145f1da0869e3d49d12bb1e4dc274b1addd9c2c649703343a740e55397 - Sigstore transparency entry: 1155140058
- Sigstore integration time:
-
Permalink:
Hmbown/NeMoCode@c92623471a814cc209120a93462717cc13393d53 -
Branch / Tag:
refs/tags/v0.1.25 - Owner: https://github.com/Hmbown
-
Access:
public
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
publish.yml@c92623471a814cc209120a93462717cc13393d53 -
Trigger Event:
push
-
Statement type: