
A local-first voice AI assistant framework. Create your own JARVIS.


anyrobo

Build a voice AI assistant in 10 lines — local-first STT, LLM, TTS, and tool-calling included.


anyrobo is a batteries-included framework for building voice AI assistants that run entirely on your own hardware. It ties together speech-to-text (Whisper, Vosk), an LLM brain (Ollama by default via anyllm), and text-to-speech (pyttsx3, ElevenLabs) behind a single class. It supports multi-step tool-calling, built-in personalities (Jarvis, GLaDOS, assistant), a plugin system for custom skills, an event bus, a RAG knowledge base, and MCP client support for calling external tool servers.

Built by Viet-Anh Nguyen at NRL.ai.

Why anyrobo?

  • One-liner API — anyrobo.Robo().listen() is a complete voice assistant
  • Plugin architecture — Add custom skills, tools, personalities, and backends
  • Local-first — Whisper + Ollama + pyttsx3 run 100% offline
  • Minimal core deps — Base install is light; STT/TTS/LLM backends are extras
  • Production-ready — Event system, memory persistence, MCP client, RAG

Installation

pip install anyrobo

Backends are installed as optional extras:

pip install anyrobo[whisper]      # openai-whisper for local STT
pip install anyrobo[vosk]         # Vosk for offline STT
pip install anyrobo[tts]          # pyttsx3 for offline TTS
pip install anyrobo[elevenlabs]   # ElevenLabs cloud TTS
pip install anyrobo[llm]          # anyllm for LLM routing
pip install anyrobo[rag]          # knowledge base with embeddings
pip install anyrobo[mcp]          # MCP client for external tool servers
pip install anyrobo[all]          # everything

Python 3.8+ supported (tested on 3.8, 3.9, 3.10, 3.11, 3.12, 3.13)

Quick Start

import anyrobo

# 1. Simplest possible voice assistant (Whisper + Ollama + pyttsx3)
bot = anyrobo.Robo(personality="jarvis")
bot.listen()    # starts mic, transcribes, replies with voice, loops

# 2. Add a tool the assistant can call (schema auto-extracted from type hints)
def set_timer(minutes: int, label: str = "timer") -> str:
    """Set a countdown timer for the given number of minutes."""
    return f"Timer '{label}' set for {minutes} minutes."

bot.add_tool(set_timer)
bot.listen()   # now: "Hey Jarvis, set a 5 minute pasta timer" -> tool call

# 3. Text-mode (no mic/speaker, useful for testing)
reply = bot.ask("What's the weather like in Tokyo?")

Models & Methods

Backends (local-first by default)

| Component | Backend | Model | Install |
|-----------|---------|-------|---------|
| STT | Whisper | openai-whisper (tiny/base/small/medium/large) | `anyrobo[whisper]` |
| STT | Vosk | Vosk offline models | `anyrobo[vosk]` |
| LLM | Ollama (default) | Any Ollama model (llama3.1:8b, qwen2.5, ...) | `anyrobo[llm]` |
| LLM | OpenAI / Anthropic | via anyllm | `anyrobo[llm]` |
| TTS | pyttsx3 | OS-native voices (SAPI / NSSpeechSynthesizer / espeak) | `anyrobo[tts]` |
| TTS | ElevenLabs | Cloud API | `anyrobo[elevenlabs]` |

Tool / function calling

Pass plain Python functions; anyrobo (via anyllm) auto-extracts parameter schemas from type hints and docstrings, then runs a multi-step agentic loop:

  1. LLM receives the user query + tool list
  2. LLM decides whether to call a tool (structured output)
  3. anyrobo dispatches the tool and feeds the result back
  4. Loop until the LLM emits a final natural-language response
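The schema-extraction step can be illustrated with plain `inspect`/`typing` machinery. This is a hedged sketch of the general technique (mapping type hints and docstrings into a JSON-schema-like tool description), not anyrobo's actual implementation:

```python
import inspect
from typing import get_type_hints

# Map common Python annotations to JSON-schema type names.
_TYPE_MAP = {int: "integer", float: "number", str: "string", bool: "boolean"}

def extract_schema(fn):
    """Build a JSON-schema-like tool description from a function's signature."""
    sig = inspect.signature(fn)
    hints = get_type_hints(fn)
    properties, required = {}, []
    for name, param in sig.parameters.items():
        properties[name] = {"type": _TYPE_MAP.get(hints.get(name), "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)  # parameters without defaults are required
    return {
        "name": fn.__name__,
        "description": inspect.getdoc(fn) or "",
        "parameters": {"type": "object", "properties": properties, "required": required},
    }

def set_timer(minutes: int, label: str = "timer") -> str:
    """Set a countdown timer for the given number of minutes."""
    return f"Timer '{label}' set for {minutes} minutes."

schema = extract_schema(set_timer)
# schema["parameters"]["required"] contains only "minutes", since "label" has a default
```

The resulting dict is the shape most tool-calling LLM APIs expect, which is why plain type-hinted functions are enough.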

Built-in personalities

| Name | Style |
|------|-------|
| `jarvis` | Polite British butler, concise and proactive |
| `glados` | Dry, sarcastic, vaguely threatening (Portal-inspired) |
| `assistant` | Neutral, helpful default |
| `custom` | Pass your own `system_prompt` |

Conversation memory

SlidingWindowMemory keeps the last N turns in context, with optional disk persistence (JSON). Use Robo.save_memory(path) / Robo.load_memory(path) to persist memory across sessions.
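The sliding-window idea is simple enough to show in a few lines. This is an illustrative stand-in using `collections.deque`, not anyrobo's SlidingWindowMemory class itself:

```python
import json
from collections import deque
from pathlib import Path

class WindowMemory:
    """Keep the last N conversation turns; oldest turns fall out of the window."""
    def __init__(self, max_turns: int = 10):
        self.turns = deque(maxlen=max_turns)

    def add(self, role: str, content: str):
        self.turns.append({"role": role, "content": content})

    def context(self):
        """Return the turns currently in the window, oldest first."""
        return list(self.turns)

    def save(self, path):
        Path(path).write_text(json.dumps(self.context()))

    def load(self, path):
        for turn in json.loads(Path(path).read_text()):
            self.turns.append(turn)

mem = WindowMemory(max_turns=2)
mem.add("user", "hello")
mem.add("assistant", "hi there")
mem.add("user", "set a timer")   # "hello" drops out of the 2-turn window
```

Bounding the window keeps the prompt within the LLM's context length while JSON persistence survives restarts.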

Event system

Subscribe to any lifecycle event:

bot.on("user_message", lambda text: print("heard:", text))
bot.on("tool_call", lambda name, args: log_tool(name, args))
bot.on("response", lambda text: print("bot:", text))
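Under the hood, an event bus like the one `bot.on` implies can be as small as a dict of handler lists. A minimal sketch (illustrative only, not anyrobo's internals):

```python
from collections import defaultdict

class EventBus:
    """Minimal publish/subscribe bus: handlers are keyed by event name."""
    def __init__(self):
        self._handlers = defaultdict(list)

    def on(self, event, handler):
        self._handlers[event].append(handler)

    def emit(self, event, *args):
        # Call every handler registered for this event, in subscription order.
        for handler in self._handlers[event]:
            handler(*args)

bus = EventBus()
heard = []
bus.on("user_message", heard.append)
bus.emit("user_message", "set a 5 minute timer")
```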

RAG Knowledge Base

anyrobo.KnowledgeBase() ingests text/PDF/markdown, chunks via anynlp, embeds via anyllm.embed, and performs similarity search to augment the LLM prompt.
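The retrieve-then-augment flow can be sketched with a toy bag-of-words similarity search. anyrobo uses real embeddings via anyllm.embed; this stand-in exists only to show the shape of the pipeline (embed chunks, embed the query, pick the most similar chunk, prepend it to the prompt):

```python
import re
from collections import Counter
from math import sqrt

def embed(text: str) -> Counter:
    """Toy 'embedding': a bag-of-words count vector."""
    return Counter(re.findall(r"\w+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

chunks = [
    "Our refund policy: refunds are issued within 14 days of purchase.",
    "Shipping takes 3-5 business days.",
]
query = "What is the refund policy?"

# Retrieve the most similar chunk and splice it into the LLM prompt.
best = max(chunks, key=lambda c: cosine(embed(query), embed(c)))
prompt = f"Context: {best}\n\nQuestion: {query}"
```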

MCP client

Robo.add_mcp_server(command, args) connects to any Model Context Protocol server (filesystem, GitHub, web search, a model exposed via anydeploy.mcp, ...) and exposes its tools to the assistant automatically.

Plugin system

Subclass anyrobo.Skill to package reusable behavior:

class WeatherSkill(anyrobo.Skill):
    name = "weather"
    def tools(self):
        return [self.get_weather]
    def get_weather(self, city: str) -> dict: ...

bot.add_skill(WeatherSkill())

API Reference

| Function / class | Purpose |
|------------------|---------|
| `anyrobo.Robo(personality, stt, tts, llm)` | Main assistant class |
| `Robo.listen(hotword=None)` | Voice loop: STT -> LLM -> TTS |
| `Robo.ask(text)` | Text-mode interaction |
| `Robo.add_tool(fn)` | Register a Python function as a tool |
| `Robo.add_skill(skill)` | Register a plugin skill |
| `Robo.add_mcp_server(cmd, args)` | Connect to an MCP server |
| `Robo.on(event, handler)` | Subscribe to lifecycle events |
| `anyrobo.KnowledgeBase()` | RAG knowledge base |
| `anyrobo.Skill` | Base class for plugins |

CLI Usage

anyrobo listen --personality jarvis --model llama3.1:8b
anyrobo ask "What's on my calendar today?"
anyrobo list-personalities
anyrobo list-voices

Examples

Voice assistant with custom tools

import anyrobo

def turn_on_lights(room: str) -> str:
    """Turn on the smart lights in a specific room."""
    return f"Lights in {room} are now on."

def play_music(genre: str, volume: int = 50) -> str:
    """Play music of a given genre at the specified volume (0-100)."""
    return f"Playing {genre} music at volume {volume}."

bot = anyrobo.Robo(personality="jarvis", model="llama3.1:8b")
bot.add_tool(turn_on_lights)
bot.add_tool(play_music)
bot.listen()

RAG-powered Q&A over your docs

import anyrobo

kb = anyrobo.KnowledgeBase()
kb.ingest("docs/")              # chunks + embeds every markdown/pdf
bot = anyrobo.Robo(knowledge_base=kb, personality="assistant")
print(bot.ask("What's our refund policy?"))

Connect to external MCP servers

import anyrobo

bot = anyrobo.Robo(personality="jarvis")
bot.add_mcp_server("npx", ["-y", "@modelcontextprotocol/server-filesystem", "/tmp"])
bot.listen()   # now the LLM can read/write files via MCP

License

MIT (c) Viet-Anh Nguyen

Download files

Download the file for your platform.

Source Distribution

anyrobo-0.2.3.tar.gz (27.3 kB)

Uploaded Source

Built Distribution


anyrobo-0.2.3-py3-none-any.whl (29.0 kB)

Uploaded Python 3

File details

Details for the file anyrobo-0.2.3.tar.gz.

File metadata

  • Download URL: anyrobo-0.2.3.tar.gz
  • Upload date:
  • Size: 27.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.9

File hashes

Hashes for anyrobo-0.2.3.tar.gz
| Algorithm | Hash digest |
|-----------|-------------|
| SHA256 | `45e13fac031c4648f26bb98e6b3737ce91d06290f27ce6d602711084f6e7593f` |
| MD5 | `bdbf7457730de49707abd01d1d96cc9c` |
| BLAKE2b-256 | `facbbc1f998225d912c8ca4d1a164867f8c58e7d98a151061e05c1bf57e710e2` |


File details

Details for the file anyrobo-0.2.3-py3-none-any.whl.

File metadata

  • Download URL: anyrobo-0.2.3-py3-none-any.whl
  • Upload date:
  • Size: 29.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.9

File hashes

Hashes for anyrobo-0.2.3-py3-none-any.whl
| Algorithm | Hash digest |
|-----------|-------------|
| SHA256 | `1b7689d123eeae84fdcaf272a0e4f65d46514cd1efa4f7d728531e9246e71b9f` |
| MD5 | `16b0c98c8571024d6289704855f443f8` |
| BLAKE2b-256 | `32adc1960edfbab678c178dcdaec34bb4c7494974bc97c1e9f004bc99c4bff7a` |

