UAI - Unified AI CLI: one tool for all AI providers with persistent context, intelligent routing and automatic fallback
UAI — Unified AI CLI
One tool. All AI providers. Persistent context. Intelligent routing. Zero lock-in.
uai is an installable Python CLI that integrates multiple AI providers (Claude, Gemini, Codex, Qwen, Ollama, DeepSeek, Groq) under a single interface. It manages credentials, routes requests intelligently, maintains its own persistent context (so you can switch providers mid-conversation), and never leaves you without a response thanks to automatic fallback.
Install
pip install uai-cli
Or from source:
git clone https://github.com/your-org/agent-skills
cd agent-skills
pip install -e .
Quick Start
# First-time setup (detects installed CLIs, creates ~/.uai/config.yaml)
uai setup
# Connect your providers
uai connect gemini # OAuth via Gemini CLI (free)
uai connect qwen # OAuth via qwen-code CLI (1000 req/day free)
uai connect claude # API key
uai connect codex # API key
# Ask anything
uai ask "explain this error: TypeError: NoneType is not subscriptable"
# Continue the conversation (context is persisted automatically)
uai ask "how would I fix it?"
# Switch providers mid-conversation — context is injected automatically
uai ask --provider gemini "now show me the corrected code"
# Interactive chat session
uai chat
# Code tasks (routes to best code-focused provider)
uai code "implement a binary search tree in Python"
# Multi-AI orchestration
uai orchestrate "review the architecture of src/"
Features
| Feature | Description |
|---|---|
| Multi-provider | Claude, Gemini, Codex, Qwen, Ollama, DeepSeek, Groq |
| Dual backend | Providers can expose an API (SDK) backend, a CLI backend, or both; the CLI is preferred when it is free |
| Persistent context | SQLite sessions at ~/.uai/sessions/ — independent of provider context |
| Provider switching | Change providers mid-conversation; history is reformatted and injected |
| Intelligent routing | 2-stage classification (keyword + LLM) → scoring → best available provider |
| Automatic fallback | 3-layer resilience: retry → cross-provider failover → API→CLI degradation |
| Quota tracking | Per-provider usage, cost (USD), alerts before hitting limits |
| Auto-install CLIs | uai setup --install installs missing provider CLIs via npm/curl |
| Multi-AI teams | 8 orchestration patterns: parallel analysis, cross-validation, etc. |
| Cost-zero default | Free providers and free CLI backends are always tried first |
| Debug mode | --debug / -d shows full trace: routing, attempts, errors, timings |
| File access control | Per-provider readonly/readwrite via /access — bulk with all |
Providers
| Provider | Free Tier | Backend | Best For |
|---|---|---|---|
| Gemini | CLI (unlimited) | CLI: gemini -m MODEL -p / API: google-genai | Architecture, long context (1M tokens) |
| Qwen | CLI (1000 req/day) | CLI: qwen -p / API: OpenRouter | Code review, batch processing |
| Ollama | Local (unlimited) | API: OpenAI-compatible local | Privacy, offline use |
| DeepSeek | Free tier | API: OpenAI-compatible | Cost-efficient general tasks |
| Groq | Free tier | API: OpenAI-compatible | Ultra-low latency |
| Claude | Paid | CLI: claude -p / API: anthropic | Consolidation, strategy |
| Codex | Paid | CLI: codex exec / API: openai | Debugging, implementation |
Commands
uai setup                                     First-time wizard: detect CLIs, create config
uai setup --install                           Auto-install missing provider CLIs
uai connect <provider>                        Connect a provider account (API key or CLI auth)
uai ask "prompt"                              Single query with session context
uai ask --provider gemini "..."               Force a specific provider
uai ask --free "..."                          Cost-zero providers only
uai ask --new "..."                           Ignore session context for this query
uai ask --session myproject "..."             Use a named session
uai ask --debug "..."                         Show full provider trace (routing, errors, timings)
uai chat                                      Interactive REPL with persistent context
uai chat --session myproject                  Named session
uai chat --provider claude                    Force provider for the session
uai chat --debug                              Show debug trace after every response
uai code "task"                               Code-focused task (routes to code providers)
uai orchestrate "task"                        Multi-AI team orchestration
uai sessions list                             List all sessions
uai sessions show [name]                      View session history
uai sessions delete [name]                    Delete a session
uai sessions export [name] --format markdown  Export conversation
uai status                                    Provider health dashboard
uai quota                                     Usage and cost report
uai config show                               Show current configuration
uai config set defaults.cost_mode balanced    Change a config value
uai providers                                 List providers with status
Chat REPL Commands
Inside uai chat, use slash commands:
/provider gemini          Switch provider (context is carried over)
/provider                 Return to automatic routing
/history                  Show conversation history
/clear                    Clear current session history
/export [file.md]         Export session to markdown
/status                   Show provider status
/session                  Show current session and list available ones
/access <prov> readonly   Block file writes for a provider
/access <prov> readwrite  Allow file writes for a provider
/access all readonly      Set readonly for ALL providers at once
/access all readwrite     Set readwrite for ALL providers at once
/providers                List providers with file_access column
/exit                     Exit chat
Context Management
UAI maintains its own conversation history in SQLite databases at ~/.uai/sessions/. This is independent of any provider's native context.
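To make the idea concrete, here is a minimal sketch of a provider-independent session store built on SQLite. The table layout and the helper names (`open_session`, `append_message`, `load_history`) are assumptions for illustration only, not UAI's actual schema or API.

```python
import sqlite3
from pathlib import Path


def open_session(name: str, root: Path) -> sqlite3.Connection:
    """Open (or create) one SQLite file per named session under `root`."""
    root.mkdir(parents=True, exist_ok=True)
    conn = sqlite3.connect(root / f"{name}.db")
    # Hypothetical schema: one row per message, in insertion order.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS messages ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " role TEXT NOT NULL,"
        " provider TEXT,"
        " content TEXT NOT NULL)"
    )
    return conn


def append_message(conn, role, content, provider=None):
    """Persist one message; `provider` records who answered, if anyone."""
    conn.execute(
        "INSERT INTO messages (role, provider, content) VALUES (?, ?, ?)",
        (role, provider, content),
    )
    conn.commit()


def load_history(conn):
    """Return the conversation in a provider-neutral role/content form."""
    rows = conn.execute("SELECT role, content FROM messages ORDER BY id").fetchall()
    return [{"role": role, "content": content} for role, content in rows]
```

Because the history lives in a neutral form, any provider can be handed the same conversation regardless of which one produced the earlier answers.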
Injection Strategies
When sending a request, UAI automatically selects the best strategy:
| Strategy | When Used | How |
|---|---|---|
| full | History fits in provider's context window | Inject all messages |
| windowed | History is long but recent turns are enough | Inject last N turns |
| summarized | History too long for windowed | Auto-summarize old turns (using free provider), inject summary + recent turns |
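The selection logic in the table can be sketched roughly as follows. This is an illustration under stated assumptions, not UAI's implementation: the 4-characters-per-token estimate, the `max_dropped_tokens` cutoff that separates windowed from summarized, and the `summarize` callable are all hypothetical. Only `max_history_tokens` and `keep_recent_turns` come from the config file.

```python
def estimate_tokens(messages):
    # Rough heuristic (assumption): about 4 characters per token.
    return sum(len(m["content"]) for m in messages) // 4


def select_context(messages, context_window, summarize,
                   max_history_tokens=50000, keep_recent_turns=10,
                   max_dropped_tokens=2000):
    """Return (strategy_name, messages_to_inject)."""
    budget = min(context_window, max_history_tokens)
    if estimate_tokens(messages) <= budget:
        return "full", list(messages)

    # One turn is assumed to be a user message plus an assistant reply.
    cut = max(len(messages) - 2 * keep_recent_turns, 0)
    older, recent = messages[:cut], messages[cut:]

    # Assumption: if little would be lost, just drop the older turns.
    if estimate_tokens(older) <= max_dropped_tokens:
        return "windowed", recent

    # Otherwise condense the older turns and keep the recent ones verbatim.
    summary = {
        "role": "user",
        "content": "Summary of earlier conversation: " + summarize(older),
    }
    return "summarized", [summary] + recent
```

In this sketch `summarize` would be backed by the provider named in `context.summarize_with`; here it is just any function that turns a list of messages into a string.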
Switching Providers Mid-Conversation
uai ask "explain this function" # uses Gemini (free)
uai ask "now refactor it" # still Gemini
uai ask --provider claude "write unit tests" # switches to Claude
# Claude receives the full conversation history, reformatted to its native API format
History is automatically adapted to each provider's format:
- Claude / OpenAI: [{"role": "user", "content": "..."}, {"role": "assistant", ...}]
- Gemini: [{"role": "user", "parts": [{"text": "..."}]}, {"role": "model", ...}]
- CLI providers: plain text, User: ...\nAssistant: ...
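The three target shapes listed above can be produced from a single neutral history. The sketch below assumes the stored history is a list of `{"role", "content"}` dicts with roles `user` and `assistant`; the function names are hypothetical and chosen only for this example.

```python
def to_chat_messages(history):
    """Claude / OpenAI style: a list of role/content dicts."""
    return [{"role": m["role"], "content": m["content"]} for m in history]


def to_gemini_contents(history):
    """Gemini style: the assistant role is called "model", text lives in parts."""
    return [
        {
            "role": "model" if m["role"] == "assistant" else "user",
            "parts": [{"text": m["content"]}],
        }
        for m in history
    ]


def to_plain_text(history):
    """CLI providers: one labelled line per message."""
    labels = {"user": "User", "assistant": "Assistant"}
    return "\n".join(
        f"{labels.get(m['role'], m['role'].title())}: {m['content']}"
        for m in history
    )
```

For example, `to_plain_text` turns a two-message exchange into `User: ...` followed by `Assistant: ...` on the next line, which can be prepended to a prompt passed to a CLI backend.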
Orchestration
UAI includes 8 multi-AI team patterns:
| Pattern | Execution | Providers |
|---|---|---|
| Full Analysis | Parallel → consolidate | Gemini + Codex + Qwen → Claude |
| Daily Dev | Sequential escalation | Qwen → Gemini → Claude |
| Critical Debug | Sequential | Codex → Qwen → Gemini → Claude |
| LGPD Audit | Parallel → consolidate | Gemini + Qwen → Claude |
| Batch Processing | Parallel workers | Qwen + Gemini |
| Brainstorm | Parallel → synthesize | All → Claude |
| Cross-Validation | Sequential | Producer → Validator → Arbiter |
| Specialist + Generalist | Parallel | Specialist + second opinion |
uai orchestrate "perform a full security audit of src/"
# Runs parallel analysis on Gemini, Codex, and Qwen
# Claude consolidates results into a unified report
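A "parallel → consolidate" pattern such as Full Analysis can be sketched with a thread pool. This is an illustrative outline only: `full_analysis`, the `analysts` mapping, and the `consolidator` callable are hypothetical stand-ins for real provider calls, and the prompt wording is invented for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def full_analysis(task, analysts, consolidator):
    """Run every analyst on the same task in parallel, then merge the reports.

    analysts:     dict of provider name -> callable(prompt) -> str
    consolidator: callable(prompt) -> str
    """
    reports = {}
    with ThreadPoolExecutor(max_workers=max(1, len(analysts))) as pool:
        futures = {name: pool.submit(fn, task) for name, fn in analysts.items()}
        for name, future in futures.items():
            try:
                reports[name] = future.result()
            except Exception as exc:
                # One failing analyst should not sink the whole run.
                reports[name] = f"[failed: {exc}]"

    merged = "\n\n".join(f"## {name}\n{text}" for name, text in reports.items())
    return consolidator(f"Task: {task}\n\nConsolidate these reports:\n\n{merged}")
```

Passing plain functions keeps the sketch testable; in practice each callable would wrap a provider backend.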
Configuration
Config file: ~/.uai/config.yaml (created by uai setup, see uai.yaml.example).
version: 1
defaults:
  session: default
  cost_mode: free_only        # free_only | balanced | performance
  context_strategy: auto      # auto | full | windowed | summarized
providers:
  gemini:
    enabled: true
    default_model: flash
    preferred_backend: cli    # CLI is free and preferred
    priority: 5
  qwen:
    enabled: true
    default_model: coder
    preferred_backend: cli    # qwen-code OAuth is free (1000 req/day)
    priority: 4
    daily_limit: 1000
  claude:
    enabled: true
    default_model: sonnet
    preferred_backend: api
    priority: 2               # Paid; use only when needed
routing:
  fallback_chain: [gemini, qwen, ollama, claude, codex]
context:
  summarize_with: gemini      # free provider used for auto-summarization
  max_history_tokens: 50000
  keep_recent_turns: 10
quota:
  alert_threshold_usd: 1.0
  alert_threshold_percent: 80
Routing
Requests are classified by task type and routed to the highest-scoring available provider:
| Task Type | Keywords | Default Providers |
|---|---|---|
| Debugging | bug, error, fix, debug, traceback | Codex, Claude, Gemini |
| Code Generation | implement, write, create, generate | Codex, Qwen, Claude |
| Code Review | review, audit, check, quality | Qwen, Gemini, Claude |
| Architecture | architect, design, pattern, solid | Gemini, Claude |
| Long Context | analyze, large file, codebase | Gemini (1M tokens) |
| General Chat | explain, what, how, describe | Gemini, Qwen, Ollama |
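The keyword stage of classification might look like the sketch below. The keyword lists are taken from the table above; the matching rule (count word-prefix hits, highest count wins, first task wins ties) and the `None` result for "no match" are assumptions made for illustration. In a two-stage design, a `None` here is where an LLM classifier would take over.

```python
import re

TASK_KEYWORDS = {
    "debugging": ["bug", "error", "fix", "debug", "traceback"],
    "code_generation": ["implement", "write", "create", "generate"],
    "code_review": ["review", "audit", "check", "quality"],
    "architecture": ["architect", "design", "pattern", "solid"],
    "long_context": ["analyze", "large file", "codebase"],
    "general_chat": ["explain", "what", "how", "describe"],
}


def classify(prompt):
    """Return the task type with the most keyword hits, or None if none match."""
    text = prompt.lower()
    hits = {
        task: sum(
            1 for kw in keywords
            # \b anchors the start of a word, so "architect" matches "architecture".
            if re.search(r"\b" + re.escape(kw), text)
        )
        for task, keywords in TASK_KEYWORDS.items()
    }
    best = max(hits, key=hits.get)
    return best if hits[best] > 0 else None
```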
Scoring (0-100): capability match (0-40) + cost bonus for free (0-30) + priority (0-20) + recent success rate (0-10).
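As a worked illustration of that formula, the sketch below adds up the four components. How each component is measured is not specified here, so the scales are assumptions: capability and success rate are taken as fractions between 0 and 1, the free bonus is treated as all-or-nothing, and priority is normalized against an assumed maximum of 5.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    capability: float    # 0.0-1.0 match for the task type (assumed scale)
    is_free: bool
    priority: int        # `priority` from config.yaml
    success_rate: float  # 0.0-1.0 over recent calls (assumed scale)


def _clamp(x):
    return max(0.0, min(1.0, x))


def score(c, max_priority=5):
    """Combine the four components into a 0-100 score."""
    capability_pts = 40 * _clamp(c.capability)
    cost_pts = 30 if c.is_free else 0          # assumption: binary bonus
    priority_pts = 20 * _clamp(c.priority / max_priority)
    success_pts = 10 * _clamp(c.success_rate)
    return round(capability_pts + cost_pts + priority_pts + success_pts, 1)


def pick_best(candidates):
    """Return the candidate with the highest score."""
    return max(candidates, key=score)
```

With these assumptions, a free provider with priority 5 and a perfect record scores 40 + 30 + 20 + 10 = 100, while an equally capable paid provider with priority 2 and a 50% success rate scores 40 + 0 + 8 + 5 = 53.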
Fallback
3-layer automatic fallback:
- Intra-provider retry: 3 attempts with backoff (5s / 15s / 45s)
- Cross-provider failover: rate limit or auth error → next in fallback chain
- Graceful degradation: API fails → tries CLI backend of same provider
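The three layers can be combined in one loop, as in the sketch below. It is one possible ordering under stated assumptions, not UAI's actual control flow: `ProviderError`, its `kind` values, and the `backends` mapping are hypothetical, and the sketch waits only between attempts (5s, then 15s), since how the third delay is applied is not specified above.

```python
import time


class ProviderError(Exception):
    """Hypothetical error type; kind is "transient", "rate_limit", or "auth"."""

    def __init__(self, message, kind="transient"):
        super().__init__(message)
        self.kind = kind


BACKOFF_SECONDS = (5, 15, 45)
MAX_ATTEMPTS = 3


def call_with_fallback(prompt, chain, backends, sleep=time.sleep):
    """backends: dict of provider -> {"api": callable, "cli": callable}."""
    errors = []
    for provider in chain:                      # layer 2: cross-provider failover
        skip_provider = False
        for backend in ("api", "cli"):          # layer 3: API -> CLI degradation
            call = backends.get(provider, {}).get(backend)
            if call is None:
                continue
            for attempt in range(MAX_ATTEMPTS):  # layer 1: retry with backoff
                try:
                    return provider, backend, call(prompt)
                except ProviderError as exc:
                    errors.append(f"{provider}/{backend} #{attempt + 1}: {exc}")
                    if exc.kind in ("rate_limit", "auth"):
                        skip_provider = True    # retrying will not help
                        break
                    if attempt < MAX_ATTEMPTS - 1:
                        sleep(BACKOFF_SECONDS[attempt])
            if skip_provider:
                break
    raise RuntimeError("all providers failed: " + "; ".join(errors))
```

The `sleep` parameter is injectable so the retry schedule can be checked without actually waiting.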
Debug Mode
Add --debug (or -d) to any ask or chat command to see a full execution trace:
uai ask "fix this bug" --debug
uai chat --debug
The debug panel shows every event with relative timestamps:
╭──────────────── uai debug trace ────────────────╮
│ +1.5s ROUTING qwen CLI · qwen3-coder routing=1.5s
│ alternatives: gemini, claude
│ +1.6s ATTEMPT qwen #1 via stream
│ +73.7s FALLBACK qwen failed
│ Qwen CLI timed out after 120s | stderr: ...
│ → trying gemini
│ +73.7s ATTEMPT gemini #1 via stream
│ +164s DONE OK total=164.58s
╰─────────────────────────────────────────────────╯
File Access Control
Control whether providers are allowed to write files:
# Inside uai chat:
/access all readonly # block writes for all providers
/access all readwrite # allow writes for all providers
/access gemini readonly # per-provider control
Or via config:
uai config set providers.qwen.file_access readonly
Packaging
The project uses Hatchling as its build backend.
Build a distribution
pip install build
python -m build
# produces dist/uai_cli-X.Y.Z.tar.gz and dist/uai_cli-X.Y.Z-py3-none-any.whl
Publish to PyPI
pip install twine
twine upload dist/*
Or with Hatch directly:
pip install hatch
hatch build
hatch publish # prompts for PyPI token
Bump version before publishing
Edit pyproject.toml and src/uai/__init__.py:
# pyproject.toml
version = "0.2.0"
# src/uai/__init__.py
__version__ = "0.2.0"
Development
pip install -e ".[dev]"
pytest tests/
Optional privacy features (PII anonymization via Presidio):
pip install -e ".[privacy]"
Adding a Provider Plugin
[project.entry-points."uai.providers"]
myprovider = "mypkg.provider:MyProvider"
Legacy Documentation
Original orchestration skill documentation preserved in legacy/:
| File | Contents |
|---|---|
| legacy/SKILL.md | Original Claude Code skill policy |
| legacy/ai-catalog.md | AI CLI catalog with calling conventions |
| legacy/calling-conventions.md | Exact CLI commands, timeouts, output parsing |
| legacy/team-patterns.md | 8 multi-AI team patterns (codified in orchestration/patterns.py) |
| legacy/examples.md | 6 practical orchestration examples |
| legacy/privacy-tools.md | LGPD compliance and PII anonymization |
Author
Diego Câmara — @diegocamara89
License
MIT