Skip to main content

NVHive — Multi-LLM orchestration platform with intelligent routing, hive consensus, and auto-agent generation

Project description

nvHive

One command. Every AI model you have. Automatically assembled into the best team for each task.

version python license ci

nvh "What is a binary search tree?"              # → answers (single best advisor)
nvh "Fix the timeout bug in council.py"          # → auto-detects coding task → agent mode
nvh "Should we use Redis or Postgres?"           # → auto-detects debate → council (3+ advisors)
nvh "take a screenshot and describe my desktop"  # → desktop agent (vision + tools)
nvh "setup comfyui"                              # → agent installs, configures, launches

nvHive CLI


Install

Three ways to get nvHive — pick the one that matches your setup. No Docker, no container runtime, no root required.

Option 1 — One-line installer (recommended for GPU VMs)

curl -sSL https://raw.githubusercontent.com/thatcooperguy/nvHive/main/install.sh | bash

Works on any Linux box with no root. Installs to ~/nvh/, auto-detects conda/mamba/venv, pulls Ollama if you have an NVIDIA GPU, and writes a sensible default config. Re-running heals the venv if the host Python moved (common on fresh cloud VMs).

Windows: iwr -useb https://raw.githubusercontent.com/thatcooperguy/nvHive/main/install.ps1 | iex macOS: curl -sSL https://raw.githubusercontent.com/thatcooperguy/nvHive/main/install-mac.sh | bash

Option 2 — Single-file binary (no Python needed)

Fully standalone. No Python install, no pip, no venv. Click your OS:

Download for Linux (x86_64)   Download for macOS (Apple Silicon)   Download for Windows (x86_64)

On Linux/macOS after download: chmod +x nvh-* && ./nvh-*. Full asset list (wheel, sdist, checksums) lives on the Releases page.

Option 3 — pip from PyPI (for existing Python environments)

pip install nvhive              # core
pip install "nvhive[vision]"    # + desktop agent (screenshot, click, type)
pip install "nvhive[browser]"   # + headless browser (playwright)
pip install "nvhive[all]"       # everything

First run

nvh                              # guided setup — GPU detect, provider keys, local model pulls
nvh workstation --all -y         # Linux GPU desktop: launcher + WebUI + ComfyUI + studio packs
nvh webui                        # Setup > Models lets you choose exact local downloads
nvh studio --install starter -y  # rootless LLMs + agents + ComfyUI nodes + game-dev tools
nvh "your question"              # just ask — nvHive figures out the rest

nvh workstation --all -y creates a desktop launcher, starts the WebUI, prepares rootless local model tooling, installs ComfyUI with nvHive starter workflow examples, and adds AI Studio packs for LLMs, agents, ComfyUI nodes, and Linux game projects.

Use packs directly when you want a specific no-root lab:

nvh studio --list
nvh studio --models
nvh studio --install-models recommended -y
nvh studio --install llms -y
nvh studio --install agents -y
nvh studio --install comfy -y
nvh studio --install game -y

The WebUI setup wizard includes a model picker with GPU-fit badges, disk estimates, installed status, and a download queue. The ComfyUI step lets students select workflow examples and save a model download plan with source links, because many image/video weights are large or require upstream terms.

On first run, nvh launches a guided 3-step setup — GPU detection, provider keys, local model pulls. Works immediately with local models (no signup needed). Every step is skippable. Run nvh setup anytime to reconfigure.

nvHive 3-Step Setup Flow

WebUI

nvh webui launches a full-screen dashboard at localhost:3000 — chat, council mode, advisor status, analytics, system stats, and setup flows. NVIDIA corporate theme, keyboard-first (Ctrl+K command palette, Ctrl+B collapse sidebar).

nvHive WebUI walkthrough

GPU tier → model recommendations:

VRAM Text Model Vision Model Behavior
0 GB (no GPU) Cloud only Cloud fallback Free tiers first (Groq, LLM7, GitHub)
4-8 GB nemotron-mini moondream Basic local + desktop agent
12-16 GB qwen2.5-coder:7b minicpm-v Coding + vision local
24 GB gemma2:27b llama3.2-vision Strong text + best vision
48 GB llama3.3:70b llama3.2-vision Full power local
96+ GB Multiple 70B models llama3.2-vision Full local council, $0

Setup auto-detects your VRAM and recommends models that fit concurrently. No root/sudo needed for nvHive packs: tools install under ~/.nvh/ and ~/.local/bin. Full GPU guide


Why nvHive

Council scored 68% higher than a single model — at $0 cost. Three free providers running in parallel outperformed a single model on accuracy, completeness, and coherence. Benchmark details below.

  • Smart team assembly. nvHive generates expert agents for your question and matches each to the best LLM for their specialty — a "Security Engineer" agent routes to a security-strong provider, a "Database Expert" to one suited for database queries.
  • Automatic orchestration. Coding tasks get a planner + coder + reviewer. Complex questions get a council. Simple questions get the fastest advisor. All automatic.
  • Scales with what you have. 1 provider → single-model answers. 3+ providers → council on complex questions. Local GPU → free inference alongside cloud. DGX Spark → three 70B models in parallel, fully local.
  • 4-layer safety guardrails. Command blocklist, filesystem boundary enforcement, secrets redaction, and resource limits.

nvHive Smart Router


Architecture

nvHive Full Stack Architecture

9 layers from pip install to GPU inference — install, setup, 4 user interfaces, intent detection, 5 execution modes, smart routing, tool registry, 23+ AI providers, and the hardware stack. Local-first with cloud fallback. Architecture docs


Features

Desktop Agent

AI that sees your screen, controls mouse/keyboard, installs software, and navigates browsers — powered by local vision models.

nvh "take a screenshot and describe my desktop"
nvh "setup comfyui"                    # agent: git clone → pip install → launch → verify
nvh "open firefox and go to github.com"

Vision pipeline: screenshot → local vision model (llama3.2-vision / minicpm-v) → coordinate estimation → action → verify. Falls back to cloud vision if no local model. Works on Linux (X11), macOS, and Windows. Desktop agent docs

Agentic Coding

Multi-model coding agent with dynamic expert referral, iterative QA, parallel execution, and vision/browser tools.

nvh agent "Fix the streaming timeout bug in council.py"
nvh agent "Add unit tests for auth" --dir ./myproject
nvh agent "Build the notification service" --sandbox     # Docker-isolated
nvh review                     # multi-model code review
nvh test-gen nvh/core/council.py     # AI test generation

Key capabilities: dynamic expert referral, iterative QA refinement, parallel pipeline, Docker sandbox, execution checkpoints with rollback, LLM drift detection, multi-repo workspaces, and VS Code extension. Scales from no-GPU (fully cloud) to DGX Spark (3 local 70B models). Agentic coding docs

Council Mode

Run the same query through multiple providers in parallel, then synthesize. Expert personas generated per query, each assigned to a different model. Responses analyzed for agreement, synthesized by a non-member provider with a confidence score.

nvh convene "Should we use Redis or Postgres for sessions?"   # 3 models → synthesis
nvh throwdown "Review this architecture for scalability"      # 3-pass deep analysis with critique

Different models have different blind spots — council surfaces all perspectives. Council with 3 free providers costs $0. Council docs

Smart Routing

Each request is scored across capability (40%), cost (30%), latency (20%), and health (10%), then routed to the highest-scoring provider. Routing improves over time — after 20 queries per provider, it's fully data-driven.

nvh ask --escalate "Design a distributed lock manager"    # try free first, upgrade if uncertain
nvh ask --verify "Is eval() safe in Python?"              # cross-model verification
nvh routing-stats    # see learned vs static scores
nvh health           # provider resilience dashboard

Local-first with NVIDIA GPUs: simple queries route to your GPU via Ollama — no cloud, no cost, no data leaving your machine. --prefer-nvidia gives a 1.3x routing bonus to NVIDIA hardware. Routing docs


Providers

23 providers. 63 models. 25 free — no credit card required.

Tier Providers Rate Limits
Free (no signup) Ollama (local), LLM7 Unlimited / 30 RPM
Free (email signup) Groq, GitHub Models, Cerebras, SambaNova, Cohere, AI21, SiliconFlow, HuggingFace 15-30 RPM
Free (account) Google Gemini, Mistral, NVIDIA NIM 15-1000 RPM
Paid OpenAI, Anthropic, DeepSeek, Fireworks, Together, OpenRouter, Grok Pay per token

Full provider guide


Integrations

nvHive exposes a CLI (nvh), web dashboard (nvh webui), Python SDK (import nvh), MCP server for Claude Code, and OpenAI/Anthropic-compatible API proxies.

import nvh

response = await nvh.complete([{"role": "user", "content": "Explain quicksort"}])
result = await nvh.convene("Architecture review", cabinet="engineering")
Integration Setup
Anthropic SDK ANTHROPIC_BASE_URL=http://localhost:8000/v1/anthropic
OpenAI SDK OPENAI_BASE_URL=http://localhost:8000/v1/proxy
Claude Code claude mcp add nvhive -- python -m nvh.mcp_server
NemoClaw nvh nemoclaw --startNemoClaw docs

SDK & API reference | Claude Code integration | OpenClaw migration


Benchmark Results

Real data from NVIDIA DGX Spark (GB10, 120GB). 16 prompts across code generation, debugging, reasoning, math, creative writing, and Q&A. Judged by OpenAI with ground truth verification.

Mode Accuracy Completeness Coherence Overall Cost
Single Model (Nemotron Super) 5.5 5.7 5.0 5.1 $0.00
Council (Ollama + Groq + Google) 9.0 8.0 9.0 8.6 $0.00
nvh bench              # GPU speed (tokens/sec)
nvh bench -q           # speed + quality comparison
nvh health             # provider resilience

Results vary by hardware and workload — run nvh bench to measure on your setup.


Core Commands

Command What It Does
nvh "question" Smart route to best available model
nvh convene "question" Council consensus (3+ models)
nvh throwdown "question" Three-pass deep analysis with critique
nvh agent "task" Agentic coding with expert referral + QA
nvh review Multi-model code review
nvh test-gen file.py AI test generation with verification
nvh safe "question" Local only — nothing leaves your machine
nvh serve Start API server (OpenAI + Anthropic proxy)
nvh webui Launch web dashboard
nvh health Provider resilience dashboard
nvh bench GPU speed test (tokens/sec)
nvh setup Interactive provider setup
nvh doctor Full diagnostic dump

Full command reference (50+ commands)


Documentation

Guide Description
Getting Started First-time setup
Commands Full CLI reference (50+ commands)
Providers 23 providers, rate limits, free tiers
Council System Multi-LLM consensus with confidence scoring
Architecture System design and adaptive routing
GPU Detection Auto-detection, model selection, OOM protection
SDK & API Python SDK, REST API, proxies
Agent Tools Agent tools and capabilities
Configuration Configuration reference
Web UI Web dashboard
Deploy Without Root No-root install on servers
Windows Troubleshooting Encoding, segfaults, port issues
Releasing Release runbook

Important Notes

  • Data Privacy: Cloud providers transmit queries to third-party APIs subject to each provider's privacy policy. Use nvh safe or --prefer-nvidia to keep inference local.
  • AI Accuracy: AI-generated outputs may contain errors. Review agent-modified files before committing to production.
  • Security: Safety guardrails use pattern-matching heuristics. For sensitive environments, use --sandbox with Docker isolation.
  • Benchmarks: Results measured on NVIDIA DGX Spark reference hardware. Results vary by hardware, provider, and workload.

License

MIT License. See LICENSE for details.

Project details


Release history Release notifications | RSS feed

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

nvhive-0.33.1.tar.gz (628.9 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

nvhive-0.33.1-py3-none-any.whl (546.8 kB view details)

Uploaded Python 3

File details

Details for the file nvhive-0.33.1.tar.gz.

File metadata

  • Download URL: nvhive-0.33.1.tar.gz
  • Upload date:
  • Size: 628.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for nvhive-0.33.1.tar.gz
Algorithm Hash digest
SHA256 18e2392fedd3cac9fad3b24344da5b5cfdbf9678f4c2ca2d167d5c1450179b64
MD5 130a8a16020caf318ab6a125a24403b6
BLAKE2b-256 34b82b99bb5f47b5ea39f076d333b6e0f600999ac63381aeeb670edaa6643be5

See more details on using hashes here.

Provenance

The following attestation bundles were made for nvhive-0.33.1.tar.gz:

Publisher: publish.yml on thatcooperguy/nvHive

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file nvhive-0.33.1-py3-none-any.whl.

File metadata

  • Download URL: nvhive-0.33.1-py3-none-any.whl
  • Upload date:
  • Size: 546.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for nvhive-0.33.1-py3-none-any.whl
Algorithm Hash digest
SHA256 336bd1bdd19f6ce4496d9141f7439adcf49ae224577a76fa2448619791662b30
MD5 1474d70fbaf1295fcce76b3a9acbaf89
BLAKE2b-256 20b454774ebf5c3cca5a1f60cb0238df81bc0756aa1e9bde66e94a9e6a23e4f0

See more details on using hashes here.

Provenance

The following attestation bundles were made for nvhive-0.33.1-py3-none-any.whl:

Publisher: publish.yml on thatcooperguy/nvHive

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page