

ollama-agentic

A beautiful, agentic terminal interface for Ollama — run local LLMs with auto tool-calling, long-term memory, git integration, concurrent subagents, and semantic code search.



⚠️ Requirement: Ollama must be installed first

This CLI is a frontend for Ollama. It will not work without Ollama installed and running on your machine.

  1. Download and install Ollama from ollama.com/download
  2. Start it: ollama serve (or open the Ollama desktop app)
  3. Pull a model: ollama pull mistral or ollama pull llama3.1:8b

Then install and launch this CLI:

pip install ollama-agentic
ollama-cli
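
If the CLI can't connect, first confirm the Ollama server is reachable; Ollama listens on localhost:11434 by default. A quick check from the shell:

curl http://localhost:11434/api/tags   # returns your installed models if the server is running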

Features

  • ⚡ Auto mode — model autonomously calls tools to complete tasks (/auto)
  • 🐝 Swarm agents — /swarm splits complex tasks across parallel background agents
  • 🔍 Semantic code search (RAG) — AST-aware local codebase indexing, no API needed
  • 🌿 Git integration — /git status, diff, log, commit (with AI messages), branch, stash
  • 🔁 Iterative debug loop — /run file.py auto-fixes errors until code passes
  • 📋 Plan executor — /plan <goal> breaks goals into typed steps and executes them
  • 🧠 Long-term memory — /remember stores facts that persist across sessions
  • ⬇️ Arrow-key model picker — /install lets you browse and download 25+ models
  • 🔧 Agent tools — /shell, /file, /fetch, /ls inject real context into chats
  • 💾 Conversation saving — /save and /load persist chats as JSON
  • 🎭 Personas — save and load system prompt presets
  • 🆚 Compare mode — run the same prompt through two models side by side

Usage

ollama-cli                       # start chatting
ollama-cli --model qwen2.5:7b    # start with a specific model
ollama-cli --auto                # start in autonomous agent mode
ollama-cli --compare             # compare two models side by side

Commands

Chat & Navigation

| Command | Description |
| --- | --- |
| /cls | Clear screen (keep context) |
| /clear | Clear conversation and screen |
| Ctrl+L | Clear screen |
| /retry | Regenerate last response |
| /tokens | Toggle token count display |
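
For example, a quick round trip after a weak answer:

you › summarize this repo's architecture
you › /retry              # regenerate the last response
you › /tokens             # show token counts from now on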

Models

| Command | Description |
| --- | --- |
| /model | Switch active model (arrow-key picker) |
| /current | Show currently active model |
| /install | Browse & install models from catalogue |
| /models | List all installed models |
| /compare | Compare two models side by side |
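
An illustrative session (model names depend on what you've pulled):

you › /models             # list installed models
you › /model              # arrow-key picker to switch
you › /current            # confirm the active model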

Agentic

| Command | Description |
| --- | --- |
| /auto | Toggle autonomous tool-calling mode |
| /plan <goal> | Break a goal into steps and execute |
| /run <file.py> | Run code, auto-fix errors in a loop |
| /swarm <task> | Decompose task across parallel background agents |
| /swarm-status | Check swarm progress |
| /swarm-status full | See full output from each agent |
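
For example (file names illustrative):

you › /plan add input validation to cli.py
you › /run scraper.py     # executes, then auto-fixes errors in a loop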

Git

| Command | Description |
| --- | --- |
| /git | Show git status |
| /git diff | Show unstaged diff, inject into context |
| /git diff staged | Show staged diff |
| /git log | Recent commits with timestamps |
| /git branch | List branches |
| /git branch <n> | Switch branch |
| /git commit | Stage and commit (AI message option) |
| /git stash | Stash changes |
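
A typical flow from inside a repo:

you › /git                # status
you › /git diff           # unstaged changes, injected into context
you › /git commit         # stage and commit, optionally with an AI-written message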

RAG — Semantic Code Search

| Command | Description |
| --- | --- |
| /rag | Show index status |
| /rag index | Incremental index of project |
| /rag index full | Wipe and rebuild index |
| /rag search <query> | Semantic search over codebase |
| /rag auto | Toggle auto-inject relevant chunks into every chat |
| /rag clear | Wipe the index |

Memory

| Command | Description |
| --- | --- |
| /remember <fact> | Store a fact in long-term memory |
| /memories | List all stored memories |
| /forget <id> | Delete a memory by ID |
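
For example (the ID comes from /memories):

you › /remember I prefer pytest over unittest
you › /memories           # list stored facts with their IDs
you › /forget 3           # delete one by ID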

Context Injection

| Command | Description |
| --- | --- |
| /file <path> | Load a file into context |
| /shell <cmd> | Run a shell command, inject output |
| /fetch <url> | Fetch a webpage into context |
| /ls <path> | Inject a directory listing |
| /context | View or clear active injections |
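
An illustrative session (paths and commands are examples):

you › /file src/app.py    # load a file into context
you › /shell pytest -q    # run the tests, inject the output
you › /context            # review what's currently injected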

Conversations & Personas

| Command | Description |
| --- | --- |
| /save <n> | Save conversation |
| /load <n> | Load conversation |
| /list | List saved conversations |
| /system <prompt> | Set a system prompt |
| /persona <n> | Load a saved persona |
| /personas | List saved personas |
| /save-persona <n> | Save current system prompt as persona |
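
For example, saving a reviewer persona and reusing it (names illustrative):

you › /system you are a strict senior code reviewer
you › /save-persona reviewer
you › /persona reviewer   # reload it in any later session
you › /save auth-refactor # persist the whole conversation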

Swarm Agents

/swarm decomposes a complex task into independent subtasks and runs them as parallel agents in the background. You keep using the CLI while they work.

you › /swarm research React Server Components vs traditional SSR
you › /swarm-status          # check mid-task
you › /swarm-status full     # read each agent's full output

RAG — Semantic Code Search

Run from inside any git repo. Uses AST-aware chunking for Python and sliding-window chunking for all other languages. Embeddings run fully offline via sentence-transformers.

RAG dependencies are optional — the CLI works fine without them:

pip install lancedb sentence-transformers tree-sitter tree-sitter-python

you › /rag index             # index your project (~seconds)
you › /rag search auth flow  # semantic search
you › /rag auto              # auto-inject relevant chunks into every chat

The index lives in .ollama_rag/ inside your project. Only changed files are re-indexed on subsequent runs.


Agent Mode

Toggle with /auto or launch with --auto. The model calls tools, reads results, and loops until the task is done.

⚡ you › look at main.py and find any bugs
⚡ you › write a web scraper for hacker news and run it
⚡ you › set up a basic Flask app in this folder

Config & Data

| Path | Description |
| --- | --- |
| ~/.ollama_cli_config.json | Settings (model, auto mode, etc.) |
| ~/.ollama_cli_history | Input history |
| ~/.ollama_cli_memory.json | Long-term memories |
| ~/.ollama_cli_saves/ | Saved conversations |
| ~/.ollama_cli_personas/ | Saved personas |
| .ollama_rag/ | RAG vector index (per project, inside project root) |
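
Everything is plain files, so backups are ordinary file operations. A sketch (assumes the CLI recreates a missing config with defaults on next launch):

cp ~/.ollama_cli_memory.json ~/backup/   # back up long-term memories
rm ~/.ollama_cli_config.json             # reset settings (assumption: regenerated on launch)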

Requirements

  • Ollama installed and running (see the warning above)
  • Python 3 (the published wheel is py3-none-any)
  • Optional, for semantic code search: lancedb, sentence-transformers, tree-sitter, tree-sitter-python

Roadmap

  • Project memory — /understand deep-reads your codebase and stores structured knowledge
  • MCP server support — connect to filesystem, GitHub, Postgres, browser tools
  • TUI dashboard — split-pane interface with live swarm agent view
  • API key integrations — Claude, OpenAI, Gemini, Groq as model backends

Contributing

PRs and issues welcome at github.com/Akhil123454321/ollama-cli. Keep changes focused and include tests where appropriate.

License

MIT — see LICENSE


Download files

Download the file for your platform.

Source Distribution

ollama_agentic-1.0.3.tar.gz (35.3 kB)

Built Distribution

ollama_agentic-1.0.3-py3-none-any.whl (36.0 kB)

File details

Details for the file ollama_agentic-1.0.3.tar.gz.

File metadata

  • Download URL: ollama_agentic-1.0.3.tar.gz
  • Size: 35.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.12

File hashes

Hashes for ollama_agentic-1.0.3.tar.gz
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | 73a7e6ccfea0dd110a29adfde45732ad9a97cfd06a488d0ed46f92861bf0076f |
| MD5 | 3ffb42e83d3588923b6354cb6c66ec01 |
| BLAKE2b-256 | ad7af8f84c43ad568b65dc23e28129622c73709210a5d4988405bd504322f233 |

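To verify a download against these hashes, for example the sdist (sha256sum is coreutils; on macOS use shasum -a 256):

pip download ollama-agentic==1.0.3 --no-deps --no-binary :all:   # fetch the sdist
sha256sum ollama_agentic-1.0.3.tar.gz                            # compare with the SHA256 above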

File details

Details for the file ollama_agentic-1.0.3-py3-none-any.whl.

File metadata

  • Download URL: ollama_agentic-1.0.3-py3-none-any.whl
  • Size: 36.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.12

File hashes

Hashes for ollama_agentic-1.0.3-py3-none-any.whl
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | ebaabf3966fd505c33af3028f20b7cad211e7699211cd58630b5e5014f4800f0 |
| MD5 | 16af2af6f78e86700452e540facc2ac4 |
| BLAKE2b-256 | 148bbc4b6626468237c2b0dc5601d48907170ddd0c68c3da02e2a9e89f53ca91 |

