cloooooo — SGLang + RAG hybrid + tools + router + structured outputs + eval
clovis
SGLang · Web Search · Deep Thinking · RAG · Embeddings · Structured Outputs
clovis is a Python client and API server for local LLMs (SGLang / Ollama), with built-in web search, multi-step deep research, RAG, embeddings, reranking, structured outputs, and an agentic pipeline.
Install
pip install clovis
Quick start
from clovis import cloooooo
ai = cloooooo() # connects to SGLang on localhost:61005
# Simple call
print(ai("Explain black holes"))
# With options
print(ai(
    "Write a poem about the sea",
    negative_prompt="no rhymes",
    thinking=True,  # enables extended reasoning
    context="You are a 19th-century poet.",
))
# Streaming token by token
for token in ai.stream("Tell a short story"):
    print(token, end="", flush=True)
# Conversation with memory
conv = ai.conversation(context="You are a finance expert.")
conv("Explain the CAPM model")
conv("And its limitations?") # remembers previous turn
API server
Start the server:
clovis serve --port 8000 --key sk-mytoken
All endpoints accept JSON over HTTP. Base URL: http://localhost:8000
/ia — Universal endpoint
curl -X POST http://localhost:8000/ia \
-H "Content-Type: application/json" \
-d '{"prompt": "What is Python?", "mode": "simple"}'
| Parameter | Type | Default | Description |
|---|---|---|---|
| `prompt` | str | required | The question or instruction |
| `mode` | str | null | `simple` · `deep_thinking` · `ultra_deep_thinking` |
| `use_web` | bool | false | Inject live web search results |
| `thinking` | bool | false | Enable extended reasoning (chain-of-thought) |
| `stream` | bool | false | Stream response tokens via text/plain |
| `use_memory` | bool | false | Load/save conversation history |
| `conversation_id` | str | null | Conversation ID for memory |
| `context` | str | null | Extra context injected into the prompt |
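The same call can be made from Python without the clovis client. The sketch below uses only the standard library and assumes the server is running at the base URL given above; `build_ia_request` and `ask` are hypothetical helper names, not part of the package, and any API-key header the server may expect is left out because its format is not documented here.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # base URL from the "API server" section


def build_ia_request(prompt, base_url=BASE_URL, **options):
    """Build a POST request for /ia.

    `options` are passed through as JSON fields, e.g. mode="simple",
    use_web=True, thinking=True (names taken from the parameter table).
    """
    payload = {"prompt": prompt, **options}
    return urllib.request.Request(
        f"{base_url}/ia",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt, **options):
    """Send the request and decode the JSON body (non-streaming calls only)."""
    with urllib.request.urlopen(build_ia_request(prompt, **options)) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Building the request does not touch the network, so it is safe to inspect.
req = build_ia_request("What is Python?", mode="simple")
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Calling `ask("What is Python?", mode="simple")` would then perform the actual request once the server is up.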
Modes
simple — Direct LLM call, fastest.
deep_thinking — Agentic web research pipeline (MiroFlow). Performs multi-step search, reasoning, and synthesis. Returns answer, sources, model_used, fallback_used.
curl -X POST http://localhost:8000/ia \
-H "Content-Type: application/json" \
-d '{"prompt": "What are the geopolitical implications of AI?", "mode": "deep_thinking"}'
# → {"answer": "...", "sources": ["https://...", ...], "fallback_used": false}
ultra_deep_thinking — Multi-axis deep research with gap analysis. Decomposes the question into research axes, searches each independently, identifies knowledge gaps, fills them with additional searches, then synthesizes a comprehensive structured report with numbered sources.
curl -N -X POST http://localhost:8000/ia \
-H "Content-Type: application/json" \
-d '{"prompt": "How does reinforcement learning work?", "mode": "ultra_deep_thinking", "stream": true}'
# streams progress: [axe:Definition] → [axe:History] → ... → final JSON
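When `stream` is true the body arrives incrementally as text/plain, so a client should decode it chunk by chunk rather than wait for the whole response. A minimal sketch, assuming a binary file-like response object (such as the one returned by `urllib.request.urlopen`); `iter_text_chunks` is a hypothetical helper, and the exact shape of the progress markers is not assumed beyond what the comment above shows.

```python
import codecs
import io


def iter_text_chunks(resp, chunk_size=1024):
    """Yield decoded text from a binary file-like object as it arrives.

    An incremental decoder keeps multi-byte UTF-8 characters intact
    even when they are split across two chunks.
    """
    decoder = codecs.getincrementaldecoder("utf-8")()
    # read1 returns as soon as some data is available; fall back to read.
    read = getattr(resp, "read1", resp.read)
    while True:
        chunk = read(chunk_size)
        if not chunk:
            break
        text = decoder.decode(chunk)
        if text:
            yield text
    tail = decoder.decode(b"", final=True)
    if tail:
        yield tail


# Stand-in for a live response: an in-memory stream with made-up content.
fake_response = io.BytesIO("[axe:Definition] → [axe:History]".encode("utf-8"))
for text in iter_text_chunks(fake_response, chunk_size=8):
    print(text, end="", flush=True)
print()
```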
Other endpoints
| Endpoint | Method | Description |
|---|---|---|
| `/health` | GET | Server status and version |
| `/route` | POST | Route a prompt to the best mode automatically |
| `/embed` | POST | Text embeddings (768-dim, nomic-embed-text-v1.5) |
| `/rerank` | POST | Rerank documents by relevance to a query |
| `/vision` | POST | Image description (URL, file path, or base64) |
| `/structured` | POST | Structured JSON output from a JSON Schema |
| `/rag/ingest` | POST | Ingest a document (PDF, DOCX, TXT) into RAG |
| `/rag/ask` | POST | Ask a question over ingested documents |
| `/tools/exec` | POST | Execute a registered tool (calculator, etc.) |
| `/deep_think` | POST | Standalone deep research (streaming) |
| `/ultra_deep_thinking` | POST | Standalone ultra deep research |
| `/docs` | GET | Interactive API documentation |
/embed
curl -X POST http://localhost:8000/embed \
-H "Content-Type: application/json" \
-d '{"texts": ["Hello world", "Machine learning basics"]}'
# → {"embeddings": [[...], [...]], "dim": 768}
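A common next step with the returned vectors is comparing two texts by cosine similarity. The sketch below works on a response shaped like the one above; the 3-dimensional vectors are made-up stand-ins for the real 768-dimensional embeddings, and `cosine_similarity` is an illustrative helper, not a clovis function.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # similarity is undefined for a zero vector; treat as 0
    return dot / (norm_a * norm_b)


# Made-up response with tiny vectors, same shape as the /embed output.
response = {"embeddings": [[1.0, 0.0, 0.0], [1.0, 1.0, 0.0]], "dim": 3}
first, second = response["embeddings"]
print(round(cosine_similarity(first, second), 4))
```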
/structured
curl -X POST http://localhost:8000/structured \
-H "Content-Type: application/json" \
-d '{
"prompt": "Extract the person info: Alice is 30 years old",
"schema": {"type": "object", "properties": {"name": {"type": "string"}, "age": {"type": "integer"}}, "required": ["name", "age"]}
}'
# → {"result": {"name": "Alice", "age": 30}}
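It can be worth sanity-checking the returned `result` on the client side before using it. The sketch below is a deliberately small check, not a JSON Schema validator: it covers only top-level `required` keys and primitive `type` values, and `check_result` is a hypothetical helper name. For full validation a dedicated JSON Schema library would be the better choice.

```python
# Maps the JSON Schema primitive type names handled here to Python types.
TYPE_MAP = {
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
}


def check_result(result, schema):
    """Return a list of problems; an empty list means the checks passed."""
    problems = []
    for key in schema.get("required", []):
        if key not in result:
            problems.append(f"missing required field: {key}")
    for key, spec in schema.get("properties", {}).items():
        expected = TYPE_MAP.get(spec.get("type"))
        if key not in result or expected is None:
            continue
        value = result[key]
        # bool is a subclass of int in Python, so reject it for numeric types.
        is_stray_bool = isinstance(value, bool) and expected is not bool
        if is_stray_bool or not isinstance(value, expected):
            problems.append(
                f"{key}: expected {spec['type']}, got {type(value).__name__}"
            )
    return problems


schema = {
    "type": "object",
    "properties": {"name": {"type": "string"}, "age": {"type": "integer"}},
    "required": ["name", "age"],
}
print(check_result({"name": "Alice", "age": 30}, schema))
print(check_result({"name": "Alice", "age": "30"}, schema))
```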
/rerank
curl -X POST http://localhost:8000/rerank \
-H "Content-Type: application/json" \
-d '{
"query": "machine learning",
"documents": ["Deep learning is a subset of ML", "The weather is sunny", "Neural networks learn from data"]
}'
# → {"results": [{"document": "...", "score": 0.95}, ...]}
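In a RAG setup the reranked list is usually trimmed to the few best passages before they go into a prompt. A minimal sketch over a response shaped like the one above; the scores are made up for illustration, and `top_documents` is a hypothetical helper.

```python
def top_documents(response, k=2, min_score=0.0):
    """Return up to k document texts, best score first, dropping low scores."""
    ranked = sorted(response["results"], key=lambda r: r["score"], reverse=True)
    return [r["document"] for r in ranked if r["score"] >= min_score][:k]


# Made-up scores; only the response shape follows the example above.
response = {
    "results": [
        {"document": "The weather is sunny", "score": 0.05},
        {"document": "Deep learning is a subset of ML", "score": 0.95},
        {"document": "Neural networks learn from data", "score": 0.80},
    ]
}
for doc in top_documents(response, k=2, min_score=0.5):
    print(doc)
```

Sorting on the client side keeps the helper correct even if the server were to return results in input order rather than ranked order.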
CLI
clovis "Explain black holes" # direct question
clovis "Write a poem" --no "no rhymes" # with negative prompt
clovis "Solve this problem" --think # reasoning mode
clovis repl # interactive conversation
clovis serve --port 8000 # start API server
clovis serve --port 8000 --key sk-mytoken # with API key auth
Configuration
export CLOVIS_LOCAL_URL="http://localhost:61005" # SGLang endpoint
export CLOVIS_MODEL="Qwen/Qwen3-32B-AWQ" # model name
export CLOVIS_API_KEY="sk-..." # server auth key
export SEARXNG_URL="http://localhost:8888" # SearXNG for web search
export MIROFLOW_DIR="/path/to/MiroFlow" # agentic pipeline dir
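For scripts that sit next to clovis and need the same settings, the variables above can be read in one place. The sketch below is an assumption about how one might do that, not how clovis itself loads its configuration; `Settings` and `load_settings` are hypothetical names, and the only default used is the SGLang address mentioned in the quick start.

```python
import os
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Settings:
    local_url: str
    model: Optional[str]
    api_key: Optional[str]
    searxng_url: Optional[str]
    miroflow_dir: Optional[str]


def load_settings(env=None):
    """Read the documented variables from a mapping (os.environ by default)."""
    env = os.environ if env is None else env
    return Settings(
        # Assumed default: the quick start says the client connects to
        # SGLang on localhost:61005.
        local_url=env.get("CLOVIS_LOCAL_URL", "http://localhost:61005"),
        model=env.get("CLOVIS_MODEL"),
        api_key=env.get("CLOVIS_API_KEY"),
        searxng_url=env.get("SEARXNG_URL"),
        miroflow_dir=env.get("MIROFLOW_DIR"),
    )


settings = load_settings({"CLOVIS_MODEL": "Qwen/Qwen3-32B-AWQ"})
print(settings.local_url, settings.model)
```

Passing a plain dict instead of `os.environ` makes the loader easy to exercise in tests.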
Requirements
- Python 3.10+
- SGLang or Ollama running locally
- SearXNG (optional, for web search modes)
License
MIT — Clovis Sfeir