
A local-first voice AI assistant framework. Create your own JARVIS.


anyrobo

Build a voice AI assistant in 10 lines — local-first STT, LLM, TTS, and tool-calling included.


anyrobo is a batteries-included framework for building voice AI assistants that run entirely on your own hardware. It ties together speech-to-text (Whisper, Vosk), an LLM brain (Ollama by default via anyllm), and text-to-speech (pyttsx3, ElevenLabs) behind a single class. It supports multi-step tool-calling, built-in personalities (Jarvis, GLaDOS, assistant), a plugin system for custom skills, an event bus, a RAG knowledge base, and MCP client support for calling external tool servers.

Built by Viet-Anh Nguyen at NRL.ai.

Why anyrobo?

  • One-liner API — anyrobo.Robo().listen() is a complete voice assistant
  • Plugin architecture — Add custom skills, tools, personalities, and backends
  • Local-first — Whisper + Ollama + pyttsx3 run 100% offline
  • Minimal core deps — Base install is light; STT/TTS/LLM backends are extras
  • Production-ready — Event system, memory persistence, MCP client, RAG

Installation

pip install anyrobo

For backends:

pip install anyrobo[whisper]      # openai-whisper for local STT
pip install anyrobo[vosk]         # Vosk for offline STT
pip install anyrobo[tts]          # pyttsx3 for offline TTS
pip install anyrobo[elevenlabs]   # ElevenLabs cloud TTS
pip install anyrobo[llm]          # anyllm for LLM routing
pip install anyrobo[rag]          # knowledge base with embeddings
pip install anyrobo[mcp]          # MCP client for external tool servers
pip install anyrobo[all]          # everything

Python 3.8+ supported (tested on 3.8, 3.9, 3.10, 3.11, 3.12, 3.13)

Quick Start

import anyrobo

# 1. Simplest possible voice assistant (Whisper + Ollama + pyttsx3)
bot = anyrobo.Robo(personality="jarvis")
bot.listen()    # starts mic, transcribes, replies with voice, loops

# 2. Add a tool the assistant can call (schema auto-extracted from type hints)
def set_timer(minutes: int, label: str = "timer") -> str:
    """Set a countdown timer for the given number of minutes."""
    return f"Timer '{label}' set for {minutes} minutes."

bot.add_tool(set_timer)
bot.listen()   # now: "Hey Jarvis, set a 5 minute pasta timer" -> tool call

# 3. Text-mode (no mic/speaker, useful for testing)
reply = bot.ask("What's the weather like in Tokyo?")
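The Quick Start notes that tool schemas are "auto-extracted from type hints." As a rough illustration of how a framework can do that with the standard library alone, here is a minimal sketch; `extract_tool_schema` and the type mapping are hypothetical stand-ins, not anyrobo's actual internals.

```python
import inspect
from typing import get_type_hints

# Map Python annotations to JSON-schema-style type names (illustrative subset).
PY_TO_JSON = {int: "integer", float: "number", str: "string", bool: "boolean"}

def extract_tool_schema(fn):
    """Build a tool-schema dict from a function's signature and docstring."""
    sig = inspect.signature(fn)
    hints = get_type_hints(fn)
    properties, required = {}, []
    for name, param in sig.parameters.items():
        properties[name] = {"type": PY_TO_JSON.get(hints.get(name, str), "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)   # no default value -> the LLM must supply it
    return {
        "name": fn.__name__,
        "description": (fn.__doc__ or "").strip(),
        "parameters": {"type": "object", "properties": properties, "required": required},
    }

def set_timer(minutes: int, label: str = "timer") -> str:
    """Set a countdown timer for the given number of minutes."""
    return f"Timer '{label}' set for {minutes} minutes."

schema = extract_tool_schema(set_timer)
```

Because the schema is recovered purely from the signature, plain functions like `set_timer` need no decorators or registration boilerplate.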

Models & Methods

Backends (all local-first)

| Component | Backend | Model | Install |
|---|---|---|---|
| STT | Whisper | openai-whisper tiny/base/small/medium/large | anyrobo[whisper] |
| STT | Vosk | Vosk offline models | anyrobo[vosk] |
| LLM | Ollama (default) | Any Ollama model (llama3.1:8b, qwen2.5, ...) | anyrobo[llm] |
| LLM | OpenAI / Anthropic | via anyllm | anyrobo[llm] |
| TTS | pyttsx3 | OS-native voices (SAPI / NSSpeechSynthesizer / espeak) | anyrobo[tts] |
| TTS | ElevenLabs | Cloud API | anyrobo[elevenlabs] |

Tool / function calling

Pass plain Python functions; anyrobo (via anyllm) auto-extracts parameter schemas from type hints and docstrings, then runs a multi-step agentic loop:

  1. LLM receives the user query + tool list
  2. LLM decides whether to call a tool (structured output)
  3. anyrobo dispatches the tool and feeds the result back
  4. Loop until the LLM emits a final natural-language response
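The four steps above can be sketched in plain Python. The "LLM" here is a stub (`decide`) so the loop runs offline; in anyrobo the decision comes from the real model via anyllm, and all names in this sketch are illustrative.

```python
def set_timer(minutes: int, label: str = "timer") -> str:
    """Set a countdown timer."""
    return f"Timer '{label}' set for {minutes} minutes."

TOOLS = {"set_timer": set_timer}   # step 1: the tool list the LLM sees

def decide(messages):
    """Stub LLM: request a tool call first, then answer once a result exists."""
    if not any(m["role"] == "tool" for m in messages):
        return {"tool": "set_timer", "args": {"minutes": 5, "label": "pasta"}}
    return {"final": "Your pasta timer is set for 5 minutes."}

def agent_loop(user_text, max_steps=5):
    messages = [{"role": "user", "content": user_text}]
    for _ in range(max_steps):
        action = decide(messages)                         # step 2: LLM decides
        if "final" in action:                             # step 4: done
            return action["final"]
        result = TOOLS[action["tool"]](**action["args"])  # step 3: dispatch
        messages.append({"role": "tool", "content": result})
    raise RuntimeError("agent did not produce a final answer")

reply = agent_loop("Set a 5 minute pasta timer")
```

The `max_steps` cap is the usual guard against a model that keeps requesting tools without ever converging on a final answer.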

Built-in personalities

| Name | Style |
|---|---|
| jarvis | Polite British butler, concise and proactive |
| glados | Dry, sarcastic, vaguely threatening (Portal-inspired) |
| assistant | Neutral, helpful default |
| custom | Pass your own system_prompt |

Conversation memory

SlidingWindowMemory keeps the last N turns in context, with optional disk persistence (JSON). Use Robo.save_memory(path) and Robo.load_memory(path) to persist memory across sessions.
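To make the sliding-window idea concrete, here is a minimal self-contained sketch of such a memory with JSON persistence. The class name mirrors the one the docs describe, but this implementation (and its `add`/`context` methods) is an illustration, not anyrobo's code.

```python
import json
from collections import deque
from pathlib import Path

class SlidingWindowMemory:
    """Keep only the last N conversation turns; older turns fall off."""

    def __init__(self, max_turns=10):
        self.turns = deque(maxlen=max_turns)   # deque evicts oldest automatically

    def add(self, role, content):
        self.turns.append({"role": role, "content": content})

    def context(self):
        """Return the turns to include in the next LLM prompt."""
        return list(self.turns)

    def save(self, path):
        Path(path).write_text(json.dumps(list(self.turns)))

    def load(self, path):
        for turn in json.loads(Path(path).read_text()):
            self.turns.append(turn)

mem = SlidingWindowMemory(max_turns=2)
mem.add("user", "hello")
mem.add("assistant", "hi there")
mem.add("user", "what's new?")   # window is full, so "hello" is evicted
```

A bounded window keeps prompt size (and therefore latency and cost) constant regardless of how long the conversation runs.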

Event system

Subscribe to any lifecycle event:

bot.on("user_message", lambda text: print("heard:", text))
bot.on("tool_call", lambda name, args: log_tool(name, args))
bot.on("response", lambda text: print("bot:", text))
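Under the hood this is an ordinary publish/subscribe pattern. A toy event bus like the following shows the shape of it; the docs only show `.on()`, so the `emit` side here is an assumption about how the framework fires events internally.

```python
from collections import defaultdict

class EventBus:
    """Tiny pub/sub: handlers register per event name, emit fans out to them."""

    def __init__(self):
        self.handlers = defaultdict(list)

    def on(self, event, handler):
        self.handlers[event].append(handler)

    def emit(self, event, *args):
        for handler in self.handlers[event]:
            handler(*args)

bus = EventBus()
heard = []
bus.on("user_message", heard.append)   # same shape as bot.on("user_message", ...)
bus.emit("user_message", "set a timer")
```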

RAG Knowledge Base

anyrobo.KnowledgeBase() ingests text/PDF/markdown, chunks via anynlp, embeds via anyllm.embed, and performs similarity search to augment the LLM prompt.
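The retrieve-then-augment flow can be sketched end to end in a few lines. This toy version swaps the real embedding model (anyllm.embed) for a trivial bag-of-words vector, so the "embeddings" here are purely illustrative; only the shape of the pipeline (embed chunks, rank by cosine similarity, prepend the best chunk to the prompt) matches the description above.

```python
import math

def embed(text, vocab):
    """Toy bag-of-words 'embedding' (stand-in for a real embedding model)."""
    words = text.lower().split()
    return [words.count(w) for w in vocab]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

# "Ingest": build vectors for each document chunk.
chunks = [
    "the refund policy allows returns within 30 days",
    "our office is closed on public holidays",
]
vocab = sorted({w for c in chunks for w in c.lower().split()})
index = [(c, embed(c, vocab)) for c in chunks]

# "Query": rank chunks by similarity and augment the prompt with the best one.
query = "what is the refund policy"
qv = embed(query, vocab)
best = max(index, key=lambda item: cosine(qv, item[1]))[0]
prompt = f"Context: {best}\n\nQuestion: {query}"
```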

MCP client

Robo.add_mcp_server(command, args) connects to any Model Context Protocol server (filesystem, GitHub, web search, a model exposed via anydeploy.mcp, ...) and exposes its tools to the assistant automatically.

Plugin system

Subclass anyrobo.Skill to package reusable behavior:

class WeatherSkill(anyrobo.Skill):
    name = "weather"

    def tools(self):
        return [self.get_weather]

    def get_weather(self, city: str) -> dict:
        """Get the current weather for a city."""
        return {"city": city, "conditions": "sunny"}  # stub implementation

bot.add_skill(WeatherSkill())

API Reference

| Function / class | Purpose |
|---|---|
| anyrobo.Robo(personality, stt, tts, llm) | Main assistant class |
| Robo.listen(hotword=None) | Voice loop: STT -> LLM -> TTS |
| Robo.ask(text) | Text-mode interaction |
| Robo.add_tool(fn) | Register a Python function as a tool |
| Robo.add_skill(skill) | Register a plugin skill |
| Robo.add_mcp_server(cmd, args) | Connect to an MCP server |
| Robo.on(event, handler) | Subscribe to lifecycle events |
| anyrobo.KnowledgeBase() | RAG knowledge base |
| anyrobo.Skill | Base class for plugins |

CLI Usage

anyrobo listen --personality jarvis --model llama3.1:8b
anyrobo ask "What's on my calendar today?"
anyrobo list-personalities
anyrobo list-voices

Examples

Voice assistant with custom tools

import anyrobo

def turn_on_lights(room: str) -> str:
    """Turn on the smart lights in a specific room."""
    return f"Lights in {room} are now on."

def play_music(genre: str, volume: int = 50) -> str:
    """Play music of a given genre at the specified volume (0-100)."""
    return f"Playing {genre} music at volume {volume}."

bot = anyrobo.Robo(personality="jarvis", model="llama3.1:8b")
bot.add_tool(turn_on_lights)
bot.add_tool(play_music)
bot.listen()

RAG-powered Q&A over your docs

import anyrobo

kb = anyrobo.KnowledgeBase()
kb.ingest("docs/")              # chunks + embeds every markdown/pdf
bot = anyrobo.Robo(knowledge_base=kb, personality="assistant")
print(bot.ask("What's our refund policy?"))

Connect to external MCP servers

import anyrobo

bot = anyrobo.Robo(personality="jarvis")
bot.add_mcp_server("npx", ["-y", "@modelcontextprotocol/server-filesystem", "/tmp"])
bot.listen()   # now the LLM can read/write files via MCP

License

MIT (c) Viet-Anh Nguyen
