VoxKage - OS-level Agentic AI Assistant. Autonomous, Persistent, Aware.

VoxKage

The Living Agentic OS Framework

Utilizing the Gemini CLI interface to power an untethered, system-wide AI brain.




VoxKage is a massive evolution beyond standard coding assistants. It is a living Agentic OS Framework designed to break the AI out of its IDE prison. By hijacking the Gemini CLI to use as its conversational frontend, VoxKage deploys a complex "honeycomb" of intertwined MCP capabilities to gain real-time, autonomous, and untethered access to the whole internet, your file system, and your operating system.

[View Architecture] • [Explore Capabilities] • [Get Started] • [Update / Upgrade]




🧠 The Innovation:

The Imprisonment Limitation:-

Modern AI coding assistants (like Claude Code, Cursor, or the base Gemini CLI) are powerful text generators, but they share a fundamental limitation: Imprisonment. They are confined to the directory they are launched in.

If you ask a normal CLI assistant to "Diagnose why my locally hosted web app isn't rendering properly, cross-reference the frontend CSS with the network tab, download the correct backend dependency from the official site, and install it", they fail. They don't have eyes, they can't orchestrate multi-domain research, and they can't interact with your operating system on a holistic level.

The Challenge: How do we transform a static text-generation tool into a proactive, self-healing, system-wide orchestrator without compromising security or relying on paid API credits?


The "VoxKage" OS Evolution:-

VoxKage solves this by treating the official Gemini CLI merely as a "mouthpiece" for its own highly complex brain. VoxKage is not a wrapper; it is an independent entity that mounts 18 specialized Model Context Protocol (MCP) servers into the runtime state, creating an interwoven web of tools.

VoxKage doesn't just "plug and play" a web search tool. It utilizes its honeycomb architecture to combine tools autonomously: it spins up a Playwright browser, takes a screenshot of a broken webpage, extracts the DOM computed styles, uses semantic web search to find a solution, writes a step-by-step repair plan, and executes it via the native OS shell, all in one fluid, self-correcting thought loop.


The Architectural Breakdown:-

graph TD
    subgraph "VoxKage Agentic Brain (Honeycomb Architecture)"
        A((VoxKage Core Directive)) --> B{Agentic Reasoning Loop}

        subgraph "Self-Healing & Orchestration"
            B <--> C(ACE: Dynamic Planning)
            B <--> D(DOM Verification & GUI Thinking)
            B <--> E(Autonomous Research & Fallback)
        end
    end

    subgraph "The Interconnected MCP Web"
        C <--> |AST Skeletons| F[Codebase Index & RAG]
        D <--> |Screenshot/Compute CSS| G[Playwright DOM Engine]
        E <--> |Cross-Reference| H[Web Search & Download Automation]
        C <--> |Native Commands| I[OS Shell & FileOps]
    end

    subgraph "External Control Layers"
        B <--> J[Telegram Remote Bridge]
        B <--> K[API Plugins: Gmail/Spotify/GitHub]
    end

    subgraph "Interface Hook"
        L[Official Gemini CLI] --> |Frontend IO| A
    end

    style A fill:#0ea5e9,stroke:#fff,stroke-width:2px,color:#fff
    style B fill:#8b5cf6,stroke:#fff,stroke-width:2px,color:#fff
    style C fill:#10b981,stroke:#fff,stroke-width:2px,color:#fff
    style D fill:#f59e0b,stroke:#fff,stroke-width:2px,color:#fff

VoxKage operates using a deeply interconnected web of capabilities. Here is how the brain actually works:

⚙️ 1. ACE Coding Engine & Autonomous Self-Correction

VoxKage forces a strict 5-phase developer pipeline (The Agentic Coding Engine). It does not guess.

  • RAG Awareness: Indexes the codebase into a vector store before typing.
  • Planning: Generates a persistent active_plan.md step-by-step checklist.
  • AST Skeletons: Extracts a roughly 40-line structural skeleton from a 2000-line file, cutting the context sent to the model by about 95%.
  • Self-Healing Verification: Runs compilation or DOM checks after editing. If a step fails, VoxKage automatically flags it as "failed", researches the error, fixes it, and updates the plan.
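The AST Skeleton idea can be illustrated with Python's standard `ast` module. This is a minimal sketch, not VoxKage's actual implementation: the function name `skeleton` and the choice to keep only imports, class names, and function signatures are assumptions made for illustration.

```python
import ast


def skeleton(source: str) -> str:
    """Return a compact outline: imports, class names, and function signatures."""
    tree = ast.parse(source)
    lines = []

    def visit(node, depth):
        pad = "    " * depth
        for child in ast.iter_child_nodes(node):
            if isinstance(child, (ast.Import, ast.ImportFrom)):
                lines.append(pad + ast.unparse(child))
            elif isinstance(child, ast.ClassDef):
                lines.append(f"{pad}class {child.name}:")
                visit(child, depth + 1)  # descend to pick up methods
            elif isinstance(child, (ast.FunctionDef, ast.AsyncFunctionDef)):
                prefix = "async def" if isinstance(child, ast.AsyncFunctionDef) else "def"
                # Function bodies are dropped; only the signature survives.
                lines.append(f"{pad}{prefix} {child.name}({ast.unparse(child.args)}): ...")

    visit(tree, 0)
    return "\n".join(lines)


SAMPLE = '''
import os

class Indexer:
    def __init__(self, root):
        self.root = root

    def scan(self, pattern="*.py"):
        return [p for p in os.listdir(self.root) if p.endswith(".py")]

def main():
    print(Indexer(".").scan())
'''

if __name__ == "__main__":
    print(skeleton(SAMPLE))
```

Because function bodies are discarded, the outline grows with the number of definitions rather than the number of lines, which is where the token savings would come from.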

๐ŸŒ 2. GUI Thinking & Deep Web Automation

VoxKage uses the entire internet as its playground. It spins up an invisible Playwright browser to:

  • Take visual screenshots and perform OCR verification.
  • Extract computed CSS to debug animations and layouts.
  • Automatically navigate official software pages, find the correct .exe for your OS, verify it, and execute the installation.
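A rough sketch of the screenshot and computed-CSS steps, using Playwright's synchronous Python API. The helper name `computed_styles`, the selector, and the property list are illustrative assumptions; how VoxKage wires this into its MCP servers is not shown in this document.

```python
# JavaScript evaluated inside the page: reads selected computed style properties.
COMPUTED_STYLE_JS = """
([selector, props]) => {
    const el = document.querySelector(selector);
    if (!el) return null;
    const cs = window.getComputedStyle(el);
    return Object.fromEntries(props.map(p => [p, cs.getPropertyValue(p)]));
}
"""


def computed_styles(page, selector, props):
    """Return {property: value} for the first element matching selector, or None."""
    return page.evaluate(COMPUTED_STYLE_JS, [selector, list(props)])


def main(url):
    # Needs the playwright package plus a Chromium binary to be installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        page = browser.new_page()
        page.goto(url)
        page.screenshot(path="page.png", full_page=True)
        print(computed_styles(page, "body", ["display", "font-size", "animation-name"]))
        browser.close()


if __name__ == "__main__":
    main("https://example.com")
```

Keeping `computed_styles` separate from the browser setup means it can be exercised with any object that exposes an `evaluate` method.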

🌉 3. The Omnipresent Bridges

You can walk away from your PC and text your VoxKage Telegram bot. Ask it: "Hey, my CI/CD pipeline failed on GitHub. Find the error log, write a patch locally on my PC, test it, and push the fix." VoxKage coordinates the Telegram API, GitHub API, local Git shell, and ACE engine to do it while you're grabbing coffee.
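Since a remote bridge like this executes commands on your PC, it matters which chats are allowed to issue them. The sketch below polls the Telegram Bot API `getUpdates` method and keeps only text messages from one chat ID. It assumes (this document does not confirm it) that the bridge filters on `TELEGRAM_CHAT_ID`; the function names are hypothetical.

```python
import json
import urllib.parse
import urllib.request

API = "https://api.telegram.org/bot{token}/{method}"


def fetch_updates(token, offset=None, timeout=30):
    """Long-poll the Bot API getUpdates method and return the list of updates."""
    params = {"timeout": timeout}
    if offset is not None:
        params["offset"] = offset
    url = API.format(token=token, method="getUpdates") + "?" + urllib.parse.urlencode(params)
    with urllib.request.urlopen(url, timeout=timeout + 10) as resp:
        return json.load(resp)["result"]


def authorized_commands(updates, allowed_chat_id):
    """Keep only text messages sent from the single allowed chat."""
    commands = []
    for update in updates:
        message = update.get("message") or {}
        chat_id = str(message.get("chat", {}).get("id"))
        if chat_id == str(allowed_chat_id) and message.get("text"):
            commands.append((update["update_id"], message["text"]))
    return commands
```

A polling loop would pass the highest `update_id` plus one as the next `offset`, so each message is handled once.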


✨ Core Capabilities & Engineering Specs:-

📈 The VoxKage Advantage vs Industry Standards

| Metric | Standard AI IDEs (Cursor/Cline) | VoxKage Framework |
|---|---|---|
| Execution Scope | Imprisoned (Single Project) | |
| Token Efficiency | Reads full files (High Burn Rate) | |
| Operating Cost (OPEX) | $20/mo + API Usage Costs | |
| Model Amplification | Depends on strictly paid models | |
| Web & GUI Logic | Text Scraping / No Visuals | |
| Remote Access | Requires physical PC access | |
| Installation | Complex repo cloning + setup | |

[!TIP] Model Amplification: Because VoxKage enforces structured "Agent Thinking Loops" and reduces context payloads using AST Skeletons, it allows free-tier models (like gemini-3-flash-preview or gemini-2.5-flash-lite) to execute tasks with the accuracy and reliability typically reserved for heavy, expensive Pro models.


🛠️ Getting Started: Install in 60 Seconds

VoxKage is a globally installable Python package. No cloning, no manually managed virtual environments, no setup scripts.

Prerequisites

Before installing VoxKage, ensure the following are on your system:

| Requirement | Minimum Version | Check command |
|---|---|---|
| Python | 3.10+ | python --version |
| pipx | any | pipx --version |
| Gemini CLI | any | gemini --version |
| Node.js | 18+ | node --version |
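If you would rather check all of these in one go, a small script along these lines works. It is a convenience sketch and not part of VoxKage: it confirms the Python version and that the `pipx`, `gemini`, and `node` executables are on your PATH, but it does not verify that Node.js is 18 or newer.

```python
import shutil
import sys

REQUIRED_TOOLS = ("pipx", "gemini", "node")


def check_prerequisites(min_python=(3, 10), which=shutil.which):
    """Return a dict mapping each requirement to True (found) or False (missing)."""
    report = {"python": sys.version_info[:2] >= min_python}
    for tool in REQUIRED_TOOLS:
        report[tool] = which(tool) is not None
    return report


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{'ok     ' if ok else 'MISSING'}  {name}")
```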

Install pipx if you don't have it:

pip install pipx
pipx ensurepath

Install Gemini CLI (the AI frontend VoxKage hijacks):

npm install -g @google/gemini-cli
gemini   # Run once to authenticate with your Google account

Step 1: Install VoxKage

pipx install voxkage

That's it. VoxKage is now globally available as the voxkage command from any directory on your machine. The core install is about 80 MB and takes under a minute on a decent connection.


Step 2: Run the Setup Wizard

voxkage init

The wizard will:

  • Create your ~/.voxkage data directory (stores memory, credentials, config)
  • Scaffold your .env secrets file for Telegram, Spotify, GitHub, Gmail
  • Inject the VoxKage personality directives into your Gemini CLI settings
  • Register all 18 MCP servers into Gemini CLI's settings.json
  • Prompt you to install optional capability packs

Expected output:

  ┌────────────────────────────────────────────────────────────┐
  │  ✦  VoxKage v1.1.0 - First-Time Setup                      │
  │  ────────────────────────────────────────────────────────  │
  │  VoxKage supercharges your Gemini CLI into a living OS AI. │
  │  This takes about 2 minutes.                               │
  └────────────────────────────────────────────────────────────┘

  ✓  Platform: Windows
  ✓  Data directory: C:\Users\YourName\.voxkage
  ✓  MCP servers registered: 18
  ✓  Gemini CLI settings patched
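To give a feel for what "MCP servers registered" means, here is a hedged sketch of merging a server entry into a Gemini CLI settings dictionary. The `mcpServers` key with `command` and `args` is the shape Gemini CLI uses for MCP servers; the server name `voxkage-fileops` and the module path are invented for illustration and are not taken from VoxKage.

```python
import json


def register_servers(settings: dict, servers: dict) -> dict:
    """Return a copy of settings with servers merged under "mcpServers"."""
    merged = dict(settings)
    existing = dict(merged.get("mcpServers", {}))
    existing.update(servers)  # new entries win on a name clash
    merged["mcpServers"] = existing
    return merged


# Hypothetical entry: the server name, module path, and arguments are invented.
EXAMPLE = {"voxkage-fileops": {"command": "python", "args": ["-m", "voxkage.mcp.fileops"]}}

current = {"theme": "Default", "mcpServers": {"other": {"command": "node", "args": ["server.js"]}}}

if __name__ == "__main__":
    print(json.dumps(register_servers(current, EXAMPLE), indent=2))
```

Merging rather than overwriting keeps any MCP servers you had already configured yourself.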

Step 3: Install Capability Packs (Optional but Recommended)

The core VoxKage is immediately powerful. Heavy ML packs are opt-in to keep the base install fast. Install them anytime using:

voxkage install <pack>

| Pack | What it unlocks | Size |
|---|---|---|
| browser | Playwright web automation, DOM inspection, screenshot analysis, PDF reading | ~80 MB pkg + ~150 MB Chromium |
| rag | ChromaDB semantic memory, full codebase indexing, document RAG | ~500 MB |
| vision | OpenCV + RapidOCR for screen reading and image analysis | ~250 MB |
| docs_plus | Word/PDF/Excel format conversion and document intelligence | ~80 MB |
| full | Everything above in one command | ~910 MB pkg + ~150 MB Chromium |

Install the browser engine (highly recommended; it powers web search and automation):

voxkage install browser

Install everything at once:

voxkage install full

Step 4: Configure Your Integrations

Edit your secrets file to connect VoxKage to external services:

# Open the secrets file (Windows)
notepad C:\Users\YourName\.voxkage\.env

# macOS / Linux
nano ~/.voxkage/.env

# ── Telegram Remote Control ──────────────────────────
TELEGRAM_BOT_TOKEN=your_token_from_@BotFather
TELEGRAM_CHAT_ID=your_personal_chat_id

# ── Spotify Music Control ────────────────────────────
SPOTIFY_CLIENT_ID=your_client_id
SPOTIFY_CLIENT_SECRET=your_client_secret
SPOTIFY_REDIRECT_URI=http://localhost:8888/callback

# ── GitHub Integration ───────────────────────────────
GITHUB_PAT=your_personal_access_token

# ── Gmail (uses OAuth: run voxkage plugins add gmail)
# No token needed here; it is handled by the OAuth flow

Check your connection status at any time:

voxkage status

  SYSTEM HEALTH
    ✓  VoxKage Core       v1.1.0
    ✓  MCP Servers        18 registered

  CAPABILITY PACKS
    ✓  Core AI + OS Control       (always on)
    ✓  RAG Memory                 installed
    ✓  Vision & OCR               installed
    ✗  Browser Engine             voxkage install browser
    ✓  PDF Conversion             installed

  INTEGRATIONS
    ✓  Telegram           Connected
    ✗  Spotify            Add SPOTIFY_CLIENT_ID to .env
    ✓  GitHub             Connected
    ✓  Gmail              Connected
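For intuition, the integration checks can be approximated by parsing the .env file and looking for empty keys. This is a sketch under assumptions: the real `voxkage status` logic is not documented here, it may also test live connections, and the helper names are hypothetical. Note that an unedited placeholder value would still count as "set" in this version.

```python
REQUIRED_KEYS = {
    "Telegram": ["TELEGRAM_BOT_TOKEN", "TELEGRAM_CHAT_ID"],
    "Spotify": ["SPOTIFY_CLIENT_ID", "SPOTIFY_CLIENT_SECRET", "SPOTIFY_REDIRECT_URI"],
    "GitHub": ["GITHUB_PAT"],
}


def parse_env(text):
    """Parse KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values):
    """Map each integration to the keys that are absent or empty."""
    return {
        name: [key for key in keys if not values.get(key)]
        for name, keys in REQUIRED_KEYS.items()
    }


if __name__ == "__main__":
    from pathlib import Path

    env_text = (Path.home() / ".voxkage" / ".env").read_text(encoding="utf-8")
    for name, missing in missing_keys(parse_env(env_text)).items():
        print(name, "ok" if not missing else f"add {', '.join(missing)} to .env")
```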

Step 5: Wake Up VoxKage

voxkage

You are now inside a fully agentic OS session. VoxKage is running with all 18 MCP tools mounted and ready.


Step 6 (Optional): System Tray + Telegram Remote Mode

Launch the persistent background daemon that puts VoxKage in your system tray and starts listening for Telegram messages:

voxkage tray

From this point, you can close the terminal. VoxKage is alive in the background. Text it from your phone via Telegram to command your PC remotely from anywhere in the world.


Directory Structure

After initialization, VoxKage creates this layout:

C:\Users\YourName\.voxkage\           # Core data directory
├── .gemini\
│   ├── GEMINI.md                     # VoxKage personality & tool awareness directives
│   └── settings.json                 # All 18 MCP server registrations
├── data\                             # Credentials, Gmail OAuth tokens
├── rag\                              # ChromaDB vector store (if RAG installed)
├── logs\                             # Session traces and health logs
├── .env                              # Your integration secrets
└── config.json                       # Model selection and agentic loop config
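A quick way to confirm the layout after running the wizard is to check for each entry with `pathlib`. This is an illustrative helper, not a VoxKage command; the entry list is copied from the tree above, and `rag` is expected to be missing when the RAG pack is not installed.

```python
from pathlib import Path

EXPECTED = [
    ".gemini/GEMINI.md",
    ".gemini/settings.json",
    "data",
    "rag",  # only present when the RAG pack is installed
    "logs",
    ".env",
    "config.json",
]


def missing_entries(root: Path):
    """Return the expected relative paths that do not exist under root."""
    return [rel for rel in EXPECTED if not (root / rel).exists()]


if __name__ == "__main__":
    root = Path.home() / ".voxkage"
    for rel in missing_entries(root):
        print("missing:", root / rel)
```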

🔄 Updating & Upgrading VoxKage

Standard Upgrade (Recommended)

To update VoxKage to the latest release from PyPI:

pipx upgrade voxkage

Check your installed version vs the latest:

voxkage --version
pip index versions voxkage   # Lists all available versions
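The same comparison can be scripted. The sketch below reads the latest release from PyPI's JSON API and compares it with the installed version. It assumes plain dotted version numbers such as 1.1.2, and `metadata.version("voxkage")` only succeeds when run with the Python interpreter inside VoxKage's pipx environment.

```python
import json
import urllib.request
from importlib import metadata


def parse_version(text):
    """Turn '1.1.2' into (1, 1, 2); assumes plain dotted integers."""
    return tuple(int(part) for part in text.split("."))


def is_outdated(installed, latest):
    """True when the installed version is older than the latest release."""
    return parse_version(installed) < parse_version(latest)


def latest_on_pypi(package):
    """Read the newest release number from PyPI's JSON API."""
    url = f"https://pypi.org/pypi/{package}/json"
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)["info"]["version"]


if __name__ == "__main__":
    installed = metadata.version("voxkage")  # raises PackageNotFoundError if absent
    latest = latest_on_pypi("voxkage")
    print(f"installed {installed}, latest {latest}, outdated: {is_outdated(installed, latest)}")
```

Comparing tuples of integers avoids the string-comparison trap where "1.10.0" would sort before "1.9.0".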

If pipx upgrade Fails or Gets Stuck

This can happen if a previous VoxKage process (tray, watcher) is still running and has locked the Python executable. Follow this sequence:

Step 1: Kill any running VoxKage processes:

# Windows PowerShell
Get-Process -Name "pythonw","python" -ErrorAction SilentlyContinue | `
  Where-Object { $_.Path -like "*pipx*voxkage*" } | `
  Stop-Process -Force
Start-Sleep -Seconds 2

Step 2: Force reinstall the latest version:

pipx install voxkage --force

Step 3: If permission errors still appear (e.g., [Errno 13] Permission denied):

# Remove the broken venv and reinstall cleanly
pipx uninstall voxkage
pipx install voxkage

Step 4: Verify the upgrade worked:

voxkage --version

Pinning to a Specific Version

If you need to test or roll back to a specific version:

pipx install voxkage==1.1.0 --force

Upgrading Optional Packs After a VoxKage Upgrade

Optional capability packs (RAG, Vision, Browser) are injected into VoxKage's isolated pipx venv. After upgrading VoxKage itself, re-inject them if any are missing:

# Re-inject individual packs (using exact packages from pyproject.toml)
pipx inject voxkage playwright PyMuPDF             # browser
pipx inject voxkage chromadb sentence-transformers numpy pyarrow  # rag
pipx inject voxkage opencv-python rapidocr-onnxruntime            # vision
pipx inject voxkage docx2pdf pdf2docx              # docs_plus

# After injecting the browser pack, also install the Chromium binary:
pipx run --spec voxkage playwright install chromium
# Or simply:
voxkage install browser   # the CLI handles the playwright install chromium step automatically

# Or install all packs in one shot via the VoxKage CLI
voxkage install full

Completely Uninstalling VoxKage

# Remove the package
pipx uninstall voxkage

# Optionally remove all stored data, memory, and configs
# Windows:
Remove-Item -Recurse -Force "$env:USERPROFILE\.voxkage"
# macOS / Linux:
rm -rf ~/.voxkage

🔌 Command Reference

| Command | Description |
|---|---|
| voxkage | Start a VoxKage agentic session |
| voxkage init | Run the first-time setup wizard (safe to re-run) |
| voxkage status | Check system health, pack status, and integration connections |
| voxkage tray | Launch the background system tray daemon + Telegram watcher |
| voxkage install <pack> | Install an optional capability pack (rag, browser, vision, docs_plus, full) |
| voxkage plugins | List all registered plugins and their connection state |
| voxkage plugins add <name> | Configure a plugin interactively (telegram, spotify, github, gmail) |
| voxkage --version | Print the installed version |
| voxkage --help | Show all available commands |

🗺️ Roadmap & Future Evolutions

  • Shipped: pipx install voxkage (single-command global installation)
  • Shipped: Native tkinter Settings Dashboard (zero extra deps, instant-open from tray)
  • Shipped: Core-First lean install (~80 MB) with optional heavy packs
  • Shipped: Telegram Remote Control (command your OS from your phone)
  • Shipped: voxkage init intelligence (detects already-installed packs, skips redundant prompts)
  • In Progress: Finalizing the [project.entry-points."voxkage.plugins"] API to allow the community to publish custom plugins (e.g., Jira, AWS, Docker orchestrators) via PyPI that VoxKage automatically detects and mounts into its honeycomb.
  • Planned: macOS and Linux System Tray parity.
  • Planned: VoxKage Cloud Sync (encrypted cross-device memory persistence).

🤝 Contributing

VoxKage is an open-source initiative designed to push the boundaries of local AI orchestration. If you want to contribute a new MCP server or refine the ACE logic:

  1. Fork the repository.
  2. Create your feature branch (git checkout -b feature/AdvancedRAG).
  3. Commit your changes (git commit -m 'Implement advanced semantic search').
  4. Push to the branch (git push origin feature/AdvancedRAG).
  5. Open a Pull Request.





"I am ready, sir."
- VoxKage

Source Distributions

No source distribution files available for this release.

Built Distribution

voxkage-1.1.2-py3-none-any.whl (354.3 kB)

Uploaded Python 3

File details

Details for the file voxkage-1.1.2-py3-none-any.whl.

File metadata

  • Download URL: voxkage-1.1.2-py3-none-any.whl
  • Upload date:
  • Size: 354.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.10.11

File hashes

Hashes for voxkage-1.1.2-py3-none-any.whl
| Algorithm | Hash digest |
|---|---|
| SHA256 | 97ade86a611b74a8acfebdc1ef222543fcde8dedc937325805cce99460082064 |
| MD5 | 78b53bad1dc6354ab867ae5b0c001e85 |
| BLAKE2b-256 | 2f5e7da56a6322f62498c78c9f421b73687590dbb8cfc8a2568219fe8cdbfeeb |
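If you download the wheel manually, you can check it against the SHA256 digest listed above before installing. This is a generic verification sketch with `hashlib`; the function names are illustrative, and the wheel is assumed to be in the current directory.

```python
import hashlib

EXPECTED_SHA256 = "97ade86a611b74a8acfebdc1ef222543fcde8dedc937325805cce99460082064"


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so the whole download is never held in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected=EXPECTED_SHA256):
    """True when the file's SHA256 matches the expected hex digest."""
    return sha256_of(path) == expected.lower()


if __name__ == "__main__":
    ok = verify("voxkage-1.1.2-py3-none-any.whl")
    print("hash matches" if ok else "HASH MISMATCH: do not install")
```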
