Multi-model coding agent powered by NVIDIA Llama (3B, 70B) with CLI, web search, and tool-use loop
Project description
LLaMACode
A multi-model coding agent powered by NVIDIA Llama that runs in your terminal. Reads, writes, and edits files, runs shell commands, searches code, browses the web, and more — all through a natural-language chat interface.
Features
- Two model tiers — Llama 3.2 3B (balanced, default) and Llama 3.3 70B (most powerful)
- Autonomous tool loop — the agent plans, searches, edits, and verifies code on its own
- File operations — read, write, edit files, run shell commands, glob, grep
- Web search — search the internet and fetch page content (no API key needed)
- Session persistence — save/load conversations, undo exchanges, compact context
- Multi-agent architecture — planner, searcher, coder, reviewer, summarizer roles
- Animated spinner — shows which phase/agent is working in real time
Installation
pip install llamacode
Quick Start
You need an NVIDIA API key to use Llama models. The easiest way to get one:
llamacode --generate-key
This opens build.nvidia.com in a browser. Select the model (3B or 70B), log in to your NVIDIA account, and the key is saved automatically.
Then just run:
llamacode
You'll see an interactive model picker:
═══════════════════════════════════════════════════════════════
LLaMACode - Choose a Model
═══════════════════════════════════════════════════════════════
[1] Llama 3.2 3B Balanced, default model Key: ✓
[2] Llama 3.3 70B Most powerful (needs your own key) Key: ✗
Select [1-2] (Enter to keep current):
API Keys
Every user must provide their own NVIDIA API key. There is no bundled key.
Option 1 — Generate via browser (easiest)
llamacode --generate-key
Prompts you to pick 3B or 70B, opens build.nvidia.com, detects and saves the key automatically. Requires Playwright (installed on first use).
Option 2 — .env file
Create a .env file in one of these locations:
| Location | Path |
|---|---|
| Home directory (recommended) | %USERPROFILE%\.coding_agent\.env |
| Current working directory | .env |
Contents:
NVIDIA_API_KEY_LLAMA_3_2_3B=nvapi-your-3b-key-here
NVIDIA_API_KEY_LLAMA_3_3_70B=nvapi-your-70b-key-here # optional
Option 3 — Environment variable
set NVIDIA_API_KEY_LLAMA_3_2_3B=nvapi-your-key # cmd
$env:NVIDIA_API_KEY_LLAMA_3_2_3B="nvapi-your-key" # PowerShell
export NVIDIA_API_KEY_LLAMA_3_2_3B=nvapi-your-key # bash
llamacode checks these env var names (in order):
| Model | Env vars checked |
|---|---|
| 3B | NVIDIA_API_KEY_LLAMA_3_2_3B, NVIDIA_LLAMA_3_2_3B_API_KEY, NVIDIA_API_KEY_3B |
| 70B | NVIDIA_API_KEY_LLAMA_3_3_70B, NVIDIA_LLAMA_3_3_70B_API_KEY, NVIDIA_API_KEY_70B, NVIDIA_API_KEY |
Usage
llamacode # interactive CLI with model picker
llamacode --model llama-3.2-3b # skip picker, use 3B
llamacode --model llama-3.3-70b # skip picker, use 70B
llamacode --generate-key # generate API key via browser
CLI Commands
| Command | Description |
|---|---|
/exit |
Exit the CLI |
/new |
Start a new session |
/clear |
Clear conversation history |
/status |
Show session info, model, workdir, key status |
/model |
Open interactive model picker |
/model <name> |
Switch model (llama-3.2-3b, llama-3.3-70b) |
/workdir |
Show working directory |
/workdir <path> |
Change working directory |
/index |
Build .agent/project_index.json |
/save |
Save conversation to timestamped file |
/save <file> |
Save conversation to a specific file |
/load <file> |
Load a saved conversation |
/undo |
Remove the last exchange |
/compact |
Summarize conversation to save context |
/help |
Show help |
Switching Models at Runtime
/model llama-3.2-3b → Llama 3.2 3B (balanced, default)
/model llama-3.3-70b → Llama 3.3 70B (most powerful)
The 70B model switches immediately only if you have an NVIDIA API key configured (see API Keys).
Available Models
| Alias | Model ID | Size | Key Required |
|---|---|---|---|
llama-3.2-3b |
meta/llama-3.2-3b-instruct |
3B | User-provided |
llama-3.3-70b |
meta/llama-3.3-70b-instruct |
70B | User-provided |
- 3B — handles coding, reviewing, and summarizing (balanced, default).
- 70B — most powerful; best for complex reasoning tasks. Requires your own NVIDIA API key.
Agent Capabilities
LLaMACode has a built-in tool loop that lets the agent autonomously work through tasks:
| Tool | Description |
|---|---|
read_file |
Read a file chunk by line range |
write_file |
Write content to a file (creates directories if needed) |
edit_file |
Replace specific text in an existing file |
bash |
Run a shell command with timeout |
glob |
Find files matching a pattern |
grep |
Search file contents and indexed symbols |
web_search |
Search the internet via DuckDuckGo |
web_fetch |
Fetch and extract text from a URL |
think |
Internal reasoning (invisible to the user) |
Project Structure
llamacode/
├── coding_agent.py Main CLI entry point
├── key_generator.py Browser-based API key generation
├── api_server.py Optional FastAPI server
├── pyproject.toml Package config
├── MANIFEST.in Build exclusions
├── agents/ Multi-agent wrappers
│ ├── planner_agent.py
│ ├── search_agent.py
│ ├── coder_agent.py
│ ├── reviewer_agent.py
│ └── summary_agent.py
└── core/ Core modules
├── model_manager.py Model & key resolution
├── context_manager.py Message trimming
├── file_manager.py Read/write/edit files
├── project_index.py Symbol indexing
└── summary_cache.py Summary caching
Development
# Clone the repo
git clone https://github.com/anomalyco/third_party_connect.git
cd third_party_connect
# Editable install
pip install -e .
# Set up API keys
echo NVIDIA_API_KEY_LLAMA_3_2_3B=nvapi-your-key > .env
# Run
llamacode
Publishing to PyPI
Push a version tag to trigger the automated workflow:
git tag v1.1.0
git push origin v1.1.0
The GitHub Actions workflow (.github/workflows/publish.yml) builds, checks, and publishes to PyPI using trusted publishing (OIDC).
Manual publish:
pip install build twine
python -m build
python -m twine upload dist/*
License
MIT
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file llamacode-1.1.0.tar.gz.
File metadata
- Download URL: llamacode-1.1.0.tar.gz
- Upload date:
- Size: 23.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
bc2b88892041f4067035d2769bcb9b07a5653b061fb43b22209dc26a90378e68
|
|
| MD5 |
afa1b80fe728ed9f1dab6e329772ce01
|
|
| BLAKE2b-256 |
ac5f558bb6556abec12db89b4502a816d96a0b9e6e23e072a7e5d08ac0ab52ad
|
Provenance
The following attestation bundles were made for llamacode-1.1.0.tar.gz:
Publisher:
publish.yml on NandanRavi/LLama-Coding-Agent
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
llamacode-1.1.0.tar.gz -
Subject digest:
bc2b88892041f4067035d2769bcb9b07a5653b061fb43b22209dc26a90378e68 - Sigstore transparency entry: 1764930550
- Sigstore integration time:
-
Permalink:
NandanRavi/LLama-Coding-Agent@848b1720e1c1756cc0408164f6c067ec5f6afa76 -
Branch / Tag:
refs/tags/v0.0.1 - Owner: https://github.com/NandanRavi
-
Access:
public
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
publish.yml@848b1720e1c1756cc0408164f6c067ec5f6afa76 -
Trigger Event:
push
-
Statement type:
File details
Details for the file llamacode-1.1.0-py3-none-any.whl.
File metadata
- Download URL: llamacode-1.1.0-py3-none-any.whl
- Upload date:
- Size: 25.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
e263b5582b28eb0939616f7fea0a5eb1a0b1d2d33a6a9102fcc7c4bcdc5db0b1
|
|
| MD5 |
014c0abfb4f2f67a57a3fd2b1ac2bd5f
|
|
| BLAKE2b-256 |
38c949098af0b7c0a71f7058f5d0084e27a4f99e7c00aa8552bbda383476f7a0
|
Provenance
The following attestation bundles were made for llamacode-1.1.0-py3-none-any.whl:
Publisher:
publish.yml on NandanRavi/LLama-Coding-Agent
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
llamacode-1.1.0-py3-none-any.whl -
Subject digest:
e263b5582b28eb0939616f7fea0a5eb1a0b1d2d33a6a9102fcc7c4bcdc5db0b1 - Sigstore transparency entry: 1764930693
- Sigstore integration time:
-
Permalink:
NandanRavi/LLama-Coding-Agent@848b1720e1c1756cc0408164f6c067ec5f6afa76 -
Branch / Tag:
refs/tags/v0.0.1 - Owner: https://github.com/NandanRavi
-
Access:
public
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
publish.yml@848b1720e1c1756cc0408164f6c067ec5f6afa76 -
Trigger Event:
push
-
Statement type: