cloooooo — SGLang + RAG hybrid + tools + router + structured outputs + eval

clovis

SGLang · Web Search · Deep Thinking · RAG · Embeddings · Structured Outputs

clovis is a Python client and API server for local LLMs (SGLang / Ollama), with built-in web search, multi-step deep research, RAG, embeddings, reranking, structured outputs, and an agentic pipeline.


Install

pip install clovis

Quick start

from clovis import cloooooo

ai = cloooooo()                            # connects to SGLang on localhost:61005

# Simple call
print(ai("Explain black holes"))

# With options
print(ai(
    "Write a poem about the sea",
    negative_prompt="no rhymes",
    thinking=True,                         # enables extended reasoning
    context="You are a 19th-century poet.",
))

# Streaming token by token
for token in ai.stream("Tell a short story"):
    print(token, end="", flush=True)

# Conversation with memory
conv = ai.conversation(context="You are a finance expert.")
conv("Explain the CAPM model")
conv("And its limitations?")              # remembers previous turn

API server

Start the server:

clovis serve --port 8000 --key sk-mytoken

All endpoints accept JSON over HTTP. Base URL: http://localhost:8000

/ia — Universal endpoint

curl -X POST http://localhost:8000/ia \
  -H "Content-Type: application/json" \
  -d '{"prompt": "What is Python?", "mode": "simple"}'

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| prompt | str | required | The question or instruction |
| mode | str | null | simple · deep_thinking · ultra_deep_thinking |
| use_web | bool | false | Inject live web search results |
| thinking | bool | false | Enable extended reasoning (chain-of-thought) |
| stream | bool | false | Stream response tokens via text/plain |
| use_memory | bool | false | Load/save conversation history |
| conversation_id | str | null | Conversation ID for memory |
| context | str | null | Extra context injected into the prompt |
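
The same request can be sent from Python. This is a minimal sketch using only the standard library. The helpers `build_ia_request` and `ask` are hypothetical names, not part of clovis, and the fields are the ones listed in the parameter table. Authentication is left out because the header expected by `--key` is not documented here, and `ask` assumes a non-streaming reply that is valid JSON.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # base URL stated in the API server section


def build_ia_request(prompt, base_url=BASE_URL, **options):
    """Build a POST request for /ia (hypothetical helper, not part of clovis).

    Options set to None are dropped so the server applies its own defaults.
    """
    payload = {"prompt": prompt}
    payload.update({k: v for k, v in options.items() if v is not None})
    return urllib.request.Request(
        f"{base_url}/ia",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt, **options):
    """Send the request and decode the reply, assuming it is JSON."""
    with urllib.request.urlopen(build_ia_request(prompt, **options)) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the server running, `ask("What is Python?", mode="simple")` would send the same body as the curl call.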

Modes

simple — Direct LLM call, fastest.

deep_thinking — Agentic web research pipeline (MiroFlow). Performs multi-step search, reasoning, and synthesis. Returns answer, sources, model_used, fallback_used.

curl -X POST http://localhost:8000/ia \
  -H "Content-Type: application/json" \
  -d '{"prompt": "What are the geopolitical implications of AI?", "mode": "deep_thinking"}'
# → {"answer": "...", "sources": ["https://...", ...], "model_used": "...", "fallback_used": false}
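
A deep_thinking reply can be turned into readable text with numbered sources. The sketch below relies only on the `answer`, `sources`, and `fallback_used` fields named above. `format_research_result` is a hypothetical helper, and since the exact meaning of `fallback_used` is not documented here, it is only surfaced as a note.

```python
def format_research_result(result):
    """Render a deep_thinking reply as plain text with numbered sources."""
    lines = [result.get("answer", "")]
    if result.get("fallback_used"):
        # Flag the answer without guessing what the fallback actually was.
        lines.append("(note: fallback_used is true for this answer)")
    for i, url in enumerate(result.get("sources", []), start=1):
        lines.append(f"[{i}] {url}")
    return "\n".join(lines)


example = {
    "answer": "Short summary.",
    "sources": ["https://a.example", "https://b.example"],
    "fallback_used": False,
}
print(format_research_result(example))
```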

ultra_deep_thinking — Multi-axis deep research with gap analysis. Decomposes the question into research axes, searches each independently, identifies knowledge gaps, fills them with additional searches, then synthesizes a comprehensive structured report with numbered sources.

curl -X POST http://localhost:8000/ia \
  -H "Content-Type: application/json" \
  -d '{"prompt": "How does reinforcement learning work?", "mode": "ultra_deep_thinking", "stream": true}'
# streams progress: [axe:Definition] → [axe:History] → ... → final JSON
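
When `stream` is true, the body arrives incrementally as text/plain. The following sketch reads such a body piece by piece. `iter_text_chunks` is a hypothetical helper, and `resp` can be any object with a `.read(n)` method, such as the one returned by `urllib.request.urlopen`. An incremental decoder is used so that a multi-byte character (the arrow in the progress markers, for instance) split across two chunks is not corrupted. The demo uses an in-memory body, not a real server.

```python
import codecs
import io


def iter_text_chunks(resp, chunk_size=1024):
    """Yield decoded text from a byte stream as it arrives."""
    decoder = codecs.getincrementaldecoder("utf-8")()
    while True:
        chunk = resp.read(chunk_size)
        if not chunk:
            break
        text = decoder.decode(chunk)
        if text:  # may be empty while a multi-byte character is incomplete
            yield text
    tail = decoder.decode(b"", final=True)
    if tail:
        yield tail


# Offline demo: an in-memory body stands in for the HTTP response.
fake_body = io.BytesIO("[axe:Definition] → [axe:History]".encode("utf-8"))
for piece in iter_text_chunks(fake_body, chunk_size=5):
    print(piece, end="", flush=True)
print()
```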

Other endpoints

| Endpoint | Method | Description |
| --- | --- | --- |
| /health | GET | Server status and version |
| /route | POST | Route a prompt to the best mode automatically |
| /embed | POST | Text embeddings (768-dim, nomic-embed-text-v1.5) |
| /rerank | POST | Rerank documents by relevance to a query |
| /vision | POST | Image description (URL, file path, or base64) |
| /structured | POST | Structured JSON output from a JSON Schema |
| /rag/ingest | POST | Ingest a document (PDF, DOCX, TXT) into RAG |
| /rag/ask | POST | Ask a question over ingested documents |
| /tools/exec | POST | Execute a registered tool (calculator, etc.) |
| /deep_think | POST | Standalone deep research (streaming) |
| /ultra_deep_thinking | POST | Standalone ultra deep research |
| /docs | GET | Interactive API documentation |

/embed

curl -X POST http://localhost:8000/embed \
  -H "Content-Type: application/json" \
  -d '{"texts": ["Hello world", "Machine learning basics"]}'
# → {"embeddings": [[...], [...]], "dim": 768}
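
A common use of the returned vectors is comparing two texts by cosine similarity. This is a minimal sketch in pure Python, assuming the response shape shown above (`embeddings` as a list of equal-length float lists). `cosine_similarity` is a hypothetical helper, not a clovis function.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Toy 2-dimensional vectors standing in for the 768-dim embeddings.
response = {"embeddings": [[1.0, 0.0], [1.0, 1.0]], "dim": 2}
first, second = response["embeddings"]
print(round(cosine_similarity(first, second), 4))
```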

/structured

curl -X POST http://localhost:8000/structured \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "Extract the person info: Alice is 30 years old",
    "schema": {"type": "object", "properties": {"name": {"type": "string"}, "age": {"type": "integer"}}, "required": ["name", "age"]}
  }'
# → {"result": {"name": "Alice", "age": 30}}
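
On the client side it can be worth sanity-checking the returned `result` before using it. The sketch below performs a shallow check only (required keys and top-level primitive types). It is not a full JSON Schema validator, and `check_result` is a hypothetical helper. The schema is the one from the request above.

```python
SCHEMA = {
    "type": "object",
    "properties": {"name": {"type": "string"}, "age": {"type": "integer"}},
    "required": ["name", "age"],
}

# Mapping from JSON Schema primitive type names to Python types.
_JSON_TYPES = {
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
    "object": dict,
    "array": list,
}


def check_result(result, schema):
    """Return a list of problems; an empty list means the shallow checks passed."""
    problems = []
    for name in schema.get("required", []):
        if name not in result:
            problems.append(f"missing required field: {name}")
    for name, spec in schema.get("properties", {}).items():
        type_name = spec.get("type")
        expected = _JSON_TYPES.get(type_name)
        if name not in result or expected is None:
            continue
        value = result[name]
        wrong_type = not isinstance(value, expected)
        # bool is a subclass of int in Python, so exclude it from numeric types by hand.
        if isinstance(value, bool) and type_name in ("integer", "number"):
            wrong_type = True
        if wrong_type:
            problems.append(f"{name}: expected {type_name}, got {type(value).__name__}")
    return problems


print(check_result({"name": "Alice", "age": 30}, SCHEMA))
```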

/rerank

curl -X POST http://localhost:8000/rerank \
  -H "Content-Type: application/json" \
  -d '{
    "query": "machine learning",
    "documents": ["Deep learning is a subset of ML", "The weather is sunny", "Neural networks learn from data"]
  }'
# → {"results": [{"document": "...", "score": 0.95}, ...]}
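
A typical next step is keeping only the best few documents. This sketch assumes the response shape shown above (`results` as a list of objects with `document` and `score`). Whether the server already returns results sorted is not stated here, so the hypothetical helper `top_documents` sorts defensively.

```python
def top_documents(response, k=3, min_score=0.0):
    """Return up to k documents with score >= min_score, best first."""
    results = [r for r in response.get("results", []) if r["score"] >= min_score]
    results.sort(key=lambda r: r["score"], reverse=True)
    return [r["document"] for r in results[:k]]


# Hand-written sample in the documented shape; the scores are illustrative only.
sample = {
    "results": [
        {"document": "The weather is sunny", "score": 0.05},
        {"document": "Deep learning is a subset of ML", "score": 0.95},
        {"document": "Neural networks learn from data", "score": 0.80},
    ]
}
print(top_documents(sample, k=2, min_score=0.5))
```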

CLI

clovis "Explain black holes"                # direct question
clovis "Write a poem" --no "no rhymes"      # with negative prompt
clovis "Solve this problem" --think         # reasoning mode
clovis repl                                 # interactive conversation
clovis serve --port 8000                    # start API server
clovis serve --port 8000 --key sk-mytoken   # with API key auth

Configuration

export CLOVIS_LOCAL_URL="http://localhost:61005"   # SGLang endpoint
export CLOVIS_MODEL="Qwen/Qwen3-32B-AWQ"           # model name
export CLOVIS_API_KEY="sk-..."                     # server auth key
export SEARXNG_URL="http://localhost:8888"         # SearXNG for web search
export MIROFLOW_DIR="/path/to/MiroFlow"            # agentic pipeline dir
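
For scripts that need the same settings, the variables can be read in one place. This is a sketch of one way to do it and not how clovis itself loads configuration. `Settings` and `load_settings` are hypothetical names. The only default assumed is the SGLang URL on port 61005 mentioned in the quick start; every other variable is left as `None` when unset.

```python
import os
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Settings:
    local_url: str
    model: Optional[str]
    api_key: Optional[str]
    searxng_url: Optional[str]
    miroflow_dir: Optional[str]


def load_settings(env=None):
    """Read the documented variables from a mapping (os.environ by default)."""
    env = os.environ if env is None else env
    return Settings(
        local_url=env.get("CLOVIS_LOCAL_URL", "http://localhost:61005"),
        model=env.get("CLOVIS_MODEL"),
        api_key=env.get("CLOVIS_API_KEY"),
        searxng_url=env.get("SEARXNG_URL"),
        miroflow_dir=env.get("MIROFLOW_DIR"),
    )


# Passing a plain dict keeps the example independent of the real environment.
print(load_settings({"CLOVIS_MODEL": "Qwen/Qwen3-32B-AWQ"}))
```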

Requirements

  • Python 3.10+
  • SGLang or Ollama running locally
  • SearXNG (optional, for web search modes)

License

MIT — Clovis Sfeir


