anyrobo
A local-first voice AI assistant framework. Create your own JARVIS.
Build a voice AI assistant in 10 lines — local-first STT, LLM, TTS, and tool-calling included.
anyrobo is a batteries-included framework for building voice AI assistants that run entirely on your own hardware. It ties together speech-to-text (Whisper, Vosk), an LLM brain (Ollama by default via anyllm), and text-to-speech (pyttsx3, ElevenLabs) behind a single class. It supports multi-step tool-calling, built-in personalities (Jarvis, GLaDOS, assistant), a plugin system for custom skills, an event bus, a RAG knowledge base, and MCP client support for calling external tool servers.
Built by Viet-Anh Nguyen at NRL.ai.
Why anyrobo?
- One-liner API — anyrobo.Robo().listen() is a complete voice assistant
- Plugin architecture — Add custom skills, tools, personalities, and backends
- Local-first — Whisper + Ollama + pyttsx3 run 100% offline
- Minimal core deps — Base install is light; STT/TTS/LLM backends are extras
- Production-ready — Event system, memory persistence, MCP client, RAG
Installation
pip install anyrobo
For backends:
pip install anyrobo[whisper] # openai-whisper for local STT
pip install anyrobo[vosk] # Vosk for offline STT
pip install anyrobo[tts] # pyttsx3 for offline TTS
pip install anyrobo[elevenlabs] # ElevenLabs cloud TTS
pip install anyrobo[llm] # anyllm for LLM routing
pip install anyrobo[rag] # knowledge base with embeddings
pip install anyrobo[mcp] # MCP client for external tool servers
pip install anyrobo[all] # everything
Python 3.8+ supported (tested on 3.8, 3.9, 3.10, 3.11, 3.12, 3.13)
Quick Start
import anyrobo
# 1. Simplest possible voice assistant (Whisper + Ollama + pyttsx3)
bot = anyrobo.Robo(personality="jarvis")
bot.listen() # starts mic, transcribes, replies with voice, loops
# 2. Add a tool the assistant can call (schema auto-extracted from type hints)
def set_timer(minutes: int, label: str = "timer") -> str:
    """Set a countdown timer for the given number of minutes."""
    return f"Timer '{label}' set for {minutes} minutes."
bot.add_tool(set_timer)
bot.listen() # now: "Hey Jarvis, set a 5 minute pasta timer" -> tool call
# 3. Text-mode (no mic/speaker, useful for testing)
reply = bot.ask("What's the weather like in Tokyo?")
Models & Methods
Backends (all local-first)
| Component | Backend | Model | Install |
|---|---|---|---|
| STT | Whisper | openai-whisper tiny/base/small/medium/large | anyrobo[whisper] |
| STT | Vosk | Vosk offline models | anyrobo[vosk] |
| LLM | Ollama (default) | Any Ollama model (llama3.1:8b, qwen2.5, ...) | anyrobo[llm] |
| LLM | OpenAI / Anthropic | via anyllm | anyrobo[llm] |
| TTS | pyttsx3 | OS-native voices (SAPI / NSSpeechSynthesizer / espeak) | anyrobo[tts] |
| TTS | ElevenLabs | Cloud API | anyrobo[elevenlabs] |
Tool / function calling
Pass plain Python functions; anyrobo (via anyllm) auto-extracts parameter schemas from type hints and docstrings, then runs a multi-step agentic loop:
- LLM receives the user query + tool list
- LLM decides whether to call a tool (structured output)
- anyrobo dispatches the tool and feeds the result back
- Loop until the LLM emits a final natural-language response
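The "schema auto-extracted from type hints" step above can be pictured with a plain-Python sketch. This is illustrative only, not anyllm's actual extraction code; the `extract_schema` helper and its JSON-Schema-style output shape are assumptions for the example:

```python
import inspect
import typing

# Map common Python types to JSON-Schema type names (simplified for the sketch).
TYPE_MAP = {int: "integer", float: "number", str: "string", bool: "boolean"}

def extract_schema(fn):
    """Build a tool schema from a function's type hints and docstring."""
    sig = inspect.signature(fn)
    hints = typing.get_type_hints(fn)
    properties = {}
    required = []
    for name, param in sig.parameters.items():
        properties[name] = {"type": TYPE_MAP.get(hints.get(name, str), "string")}
        # Parameters without a default value are required.
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {
        "name": fn.__name__,
        "description": (fn.__doc__ or "").strip(),
        "parameters": {"type": "object", "properties": properties, "required": required},
    }

def set_timer(minutes: int, label: str = "timer") -> str:
    """Set a countdown timer for the given number of minutes."""
    return f"Timer '{label}' set for {minutes} minutes."

schema = extract_schema(set_timer)
# schema["parameters"]["required"] == ["minutes"]  (label has a default)
```

The LLM sees only the schema, decides on a structured call like `{"name": "set_timer", "arguments": {"minutes": 5}}`, and the framework dispatches it to the real function.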
Built-in personalities
| Name | Style |
|---|---|
| jarvis | Polite British butler, concise and proactive |
| glados | Dry, sarcastic, vaguely threatening (Portal-inspired) |
| assistant | Neutral, helpful default |
| custom | Pass your own system_prompt |
Conversation memory
SlidingWindowMemory keeps the last N turns in context, with optional disk persistence (JSON). Use Robo.save_memory(path) and Robo.load_memory(path) to persist conversations across sessions.
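The sliding-window idea can be sketched in a few lines of plain Python. This is an illustration of the concept, not anyrobo's actual implementation; the class shown here (its method names and JSON format) is an assumption for the example:

```python
import json
from collections import deque
from pathlib import Path

class SlidingWindowMemory:
    """Sketch: keep only the last `max_turns` conversation turns in context."""

    def __init__(self, max_turns=10):
        # A deque with maxlen silently drops the oldest turn when full.
        self.turns = deque(maxlen=max_turns)

    def add(self, role, text):
        self.turns.append({"role": role, "text": text})

    def context(self):
        return list(self.turns)

    def save(self, path):
        Path(path).write_text(json.dumps(list(self.turns)))

    def load(self, path):
        for turn in json.loads(Path(path).read_text()):
            self.turns.append(turn)

mem = SlidingWindowMemory(max_turns=2)
mem.add("user", "hello")
mem.add("assistant", "hi there")
mem.add("user", "set a timer")  # "hello" falls out of the window
# mem.context() now holds only the last two turns
```

Capping the window keeps the prompt within the LLM's context budget while disk persistence lets a session resume where it left off.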
Event system
Subscribe to any lifecycle event:
bot.on("user_message", lambda text: print("heard:", text))
bot.on("tool_call", lambda name, args: log_tool(name, args))
bot.on("response", lambda text: print("bot:", text))
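Under the hood this is a plain publish/subscribe pattern. A minimal sketch (not anyrobo's actual internals, just the shape of the idea):

```python
from collections import defaultdict

class EventBus:
    """Sketch of a pub/sub bus: handlers registered per event name."""

    def __init__(self):
        self.handlers = defaultdict(list)

    def on(self, event, handler):
        self.handlers[event].append(handler)

    def emit(self, event, *args):
        # Fan out to every handler subscribed to this event.
        for handler in self.handlers[event]:
            handler(*args)

bus = EventBus()
heard = []
bus.on("user_message", heard.append)
bus.emit("user_message", "set a 5 minute timer")
# heard == ["set a 5 minute timer"]
```

Because handlers are plain callables, the same mechanism serves logging, UI updates, and metrics without coupling them to the assistant loop.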
RAG Knowledge Base
anyrobo.KnowledgeBase() ingests text/PDF/markdown, chunks via anynlp, embeds via anyllm.embed, and performs similarity search to augment the LLM prompt.
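The retrieval step boils down to "embed the query, rank chunks by similarity." Here is a toy sketch with bag-of-words vectors standing in for the dense embeddings anyllm.embed would produce; the `embed` and `cosine` helpers are assumptions for illustration:

```python
import math
import re
from collections import Counter

def embed(text):
    # Toy "embedding": bag-of-words counts. The real pipeline would use
    # dense vectors from anyllm.embed; the ranking logic is the same idea.
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

chunks = [
    "Our refund policy: refunds are issued within 30 days.",
    "The assistant supports offline speech recognition.",
]
query = "What is the refund policy?"
best = max(chunks, key=lambda chunk: cosine(embed(query), embed(chunk)))
# `best` (the refund chunk) would be prepended to the LLM prompt as context
```

Swapping the toy `embed` for real embeddings changes the quality of the ranking, not its structure.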
MCP client
Robo.add_mcp_server(command, args) connects to any Model Context Protocol server (filesystem, GitHub, web search, a model exposed via anydeploy.mcp, ...) and exposes its tools to the assistant automatically.
Plugin system
Subclass anyrobo.Skill to package reusable behavior:
class WeatherSkill(anyrobo.Skill):
    name = "weather"

    def tools(self):
        return [self.get_weather]

    def get_weather(self, city: str) -> dict: ...

bot.add_skill(WeatherSkill())
API Reference
| Function / class | Purpose |
|---|---|
| anyrobo.Robo(personality, stt, tts, llm) | Main assistant class |
| Robo.listen(hotword=None) | Voice loop: STT -> LLM -> TTS |
| Robo.ask(text) | Text-mode interaction |
| Robo.add_tool(fn) | Register a Python function as a tool |
| Robo.add_skill(skill) | Register a plugin skill |
| Robo.add_mcp_server(cmd, args) | Connect to an MCP server |
| Robo.on(event, handler) | Subscribe to lifecycle events |
| anyrobo.KnowledgeBase() | RAG knowledge base |
| anyrobo.Skill | Base class for plugins |
CLI Usage
anyrobo listen --personality jarvis --model llama3.1:8b
anyrobo ask "What's on my calendar today?"
anyrobo list-personalities
anyrobo list-voices
Examples
Voice assistant with custom tools
import anyrobo
def turn_on_lights(room: str) -> str:
    """Turn on the smart lights in a specific room."""
    return f"Lights in {room} are now on."

def play_music(genre: str, volume: int = 50) -> str:
    """Play music of a given genre at the specified volume (0-100)."""
    return f"Playing {genre} music at volume {volume}."
bot = anyrobo.Robo(personality="jarvis", model="llama3.1:8b")
bot.add_tool(turn_on_lights)
bot.add_tool(play_music)
bot.listen()
RAG-powered Q&A over your docs
import anyrobo
kb = anyrobo.KnowledgeBase()
kb.ingest("docs/") # chunks + embeds every markdown/pdf
bot = anyrobo.Robo(knowledge_base=kb, personality="assistant")
print(bot.ask("What's our refund policy?"))
Connect to external MCP servers
import anyrobo
bot = anyrobo.Robo(personality="jarvis")
bot.add_mcp_server("npx", ["-y", "@modelcontextprotocol/server-filesystem", "/tmp"])
bot.listen() # now the LLM can read/write files via MCP
License
MIT (c) Viet-Anh Nguyen
File details
Details for the file anyrobo-0.2.3.tar.gz.
File metadata
- Download URL: anyrobo-0.2.3.tar.gz
- Upload date:
- Size: 27.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 45e13fac031c4648f26bb98e6b3737ce91d06290f27ce6d602711084f6e7593f |
| MD5 | bdbf7457730de49707abd01d1d96cc9c |
| BLAKE2b-256 | facbbc1f998225d912c8ca4d1a164867f8c58e7d98a151061e05c1bf57e710e2 |
File details
Details for the file anyrobo-0.2.3-py3-none-any.whl.
File metadata
- Download URL: anyrobo-0.2.3-py3-none-any.whl
- Upload date:
- Size: 29.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 1b7689d123eeae84fdcaf272a0e4f65d46514cd1efa4f7d728531e9246e71b9f |
| MD5 | 16b0c98c8571024d6289704855f443f8 |
| BLAKE2b-256 | 32adc1960edfbab678c178dcdaec34bb4c7494974bc97c1e9f004bc99c4bff7a |