Clovis's AI server, AKA cloclo

clovis

Web Search · Deep Research · RAG · Embeddings · Structured Outputs


clovis is a Python client for the cloooooo.com AI API, and optionally a self-hosted server for those running their own GPU stack. It ships with multi-step web research, a full RAG pipeline, vector embeddings, reranking, structured JSON outputs, vision, and an agentic deep-research mode. All inference modes go through a single HTTP endpoint (POST /ia); embeddings, reranking, structured output, vision, and RAG each have a dedicated endpoint.

Features

  • Simple inference — one-line calls with streaming, negative prompts, and extended reasoning
  • Web search — live results injected into context, always date-aware
  • Deep thinking — multi-step agentic research pipeline with source citations
  • Ultra deep thinking — multi-axis research with automated gap analysis, 280+ sources synthesized into a structured report
  • RAG — ingest PDF, DOCX, TXT, and Markdown documents; semantic search over your corpus
  • Embeddings — 768-dim dense vectors
  • Reranking — cross-encoder reranking of document candidates
  • Structured output — JSON Schema-constrained generation
  • Vision — image description from URL, file path, or base64
  • Auto-routing — automatic mode selection based on query type
  • Conversation memory — short-term history per conversation ID

Installation

pip install clovis

Requires Python 3.10+


Quick start

No local setup required — connect directly to the hosted server:

from clovis import cloooooo

# Option A — Hosted server (no GPU, no config)
ai = cloooooo(base_url="http://cloooooo.com")
print(ai("Explain transformer architecture"))

# Option B — Self-hosted (see "Self-hosting" section below)
ai = cloooooo()  # connects to localhost:61005

# With options
response = ai(
    "Write a sonnet about entropy",
    negative_prompt="no rhymes",
    thinking=True,
    context="You are a physicist who loves poetry.",
)

# Streaming
for token in ai.stream("Describe the Big Bang in detail"):
    print(token, end="", flush=True)

# Multi-turn conversation
conv = ai.conversation(context="You are a senior software engineer.")
conv("Explain dependency injection")
conv("Show me a Python example")
conv("How would you test it?")

API server

All endpoints accept Content-Type: application/json. Streaming responses use text/plain.

Base URL: http://cloooooo.com (hosted) or http://localhost:8000 (self-hosted).


POST /ia — Universal endpoint

The main endpoint. Handles all inference modes.

curl -X POST http://cloooooo.com/ia \
  -H "Content-Type: application/json" \
  -d '{"prompt": "What is quantum entanglement?", "use_web": true}'

Parameters

Parameter | Type | Default | Description
--- | --- | --- | ---
prompt | str | required | The question or instruction
mode | str | null | "simple" · "deep_thinking" · "ultra_deep_thinking"
use_web | bool | false | Inject live web search results with current date
thinking | bool | false | Enable extended reasoning (chain-of-thought)
stream | bool | false | Stream tokens via text/plain
use_memory | bool | false | Load and save conversation history
conversation_id | str | null | Key for conversation memory
context | str | null | System-level context injected before the prompt
negative_prompt | str | null | Instructions for what to avoid
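
As a minimal sketch of how the parameters above fit together, the helper below builds a request body for POST /ia and rejects mode or option names that are not in the table. `build_ia_payload`, `VALID_MODES`, and `ALLOWED_OPTIONS` are hypothetical names for illustration; they are not part of the clovis package.

```python
from typing import Any, Optional

# Mode and option names copied from the parameter table above.
VALID_MODES = {"simple", "deep_thinking", "ultra_deep_thinking"}
ALLOWED_OPTIONS = {
    "use_web", "thinking", "stream", "use_memory",
    "conversation_id", "context", "negative_prompt",
}


def build_ia_payload(prompt: str, mode: Optional[str] = None, **options: Any) -> dict[str, Any]:
    """Build a JSON body for POST /ia (hypothetical helper, not a clovis API)."""
    if not prompt:
        raise ValueError("prompt is required")
    if mode is not None and mode not in VALID_MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    unknown = set(options) - ALLOWED_OPTIONS
    if unknown:
        raise TypeError(f"unknown option(s): {sorted(unknown)}")

    payload: dict[str, Any] = {"prompt": prompt}
    if mode is not None:
        payload["mode"] = mode
    # Options left at None are omitted so the server applies its own defaults.
    payload.update({key: value for key, value in options.items() if value is not None})
    return payload
```

The result can be passed directly as the `json=` argument of `httpx.post("http://cloooooo.com/ia", ...)`.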

Response

{"response": "Quantum entanglement is a phenomenon where..."}

For deep_thinking:

{
  "answer": "...",
  "sources": ["https://...", "https://..."],
  "model_used": "miroflow:Qwen/Qwen3-32B-AWQ",
  "fallback_used": false
}

For ultra_deep_thinking (async job):

{"job_id": "b45d79...", "status": "pending", "poll": "/job/b45d79..."}
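
Because the three response shapes use different keys, client code may want to normalize them before use. The following is a sketch under the assumption that the keys shown above (`response`, `answer`, `sources`, `job_id`, `poll`) are the only ones that matter; `summarize_ia_response` and the `kind` labels are hypothetical.

```python
from typing import Any


def summarize_ia_response(data: dict[str, Any]) -> dict[str, Any]:
    """Map the three documented /ia response shapes onto one dict layout."""
    if "job_id" in data:
        # ultra_deep_thinking: no text yet, only a job to poll.
        return {"kind": "job", "job_id": data["job_id"], "poll": data.get("poll")}
    if "answer" in data:
        # deep_thinking: answer text plus source URLs.
        return {"kind": "research", "text": data["answer"], "sources": list(data.get("sources", []))}
    if "response" in data:
        # simple mode: plain text, no sources.
        return {"kind": "simple", "text": data["response"], "sources": []}
    raise ValueError(f"unrecognized response keys: {sorted(data)}")
```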

Modes

simple — Direct inference

Fast, direct LLM call. Optionally augmented with web search or reasoning.

import httpx

r = httpx.post("http://cloooooo.com/ia", json={
    "prompt": "Latest news on fusion energy",
    "use_web": True,
    "thinking": True,
}, timeout=60)  # httpx defaults to a 5 s timeout, which web search plus reasoning can exceed
print(r.json()["response"])

deep_thinking — Agentic web research

Multi-step research pipeline. Performs web searches, reasons over the results, and returns a structured answer with source citations.

r = httpx.post("http://cloooooo.com/ia", json={
    "prompt": "What are the geopolitical implications of AGI development?",
    "mode": "deep_thinking",
}, timeout=300)

data = r.json()
print(data["answer"])
print(data["sources"])

Streaming mode returns progress updates then the final JSON:

curl -N -X POST http://cloooooo.com/ia \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Impact of interest rates on tech stocks", "mode": "deep_thinking", "stream": true}'

# [deep_thinking... 5s]
# [deep_thinking... 10s]
# ...
# {"answer": "...", "sources": [...], "fallback_used": false}
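
A client reading this stream has to separate the progress lines from the final JSON. The sketch below assumes each progress update arrives on its own line and that the JSON object starts on the first line beginning with `{`; the exact framing is not specified here, and `split_stream` is a hypothetical helper.

```python
import json
from typing import Any


def split_stream(text: str) -> tuple[list[str], dict[str, Any]]:
    """Split a finished deep_thinking stream into (progress lines, final JSON)."""
    progress: list[str] = []
    lines = text.splitlines()
    for index, line in enumerate(lines):
        stripped = line.strip()
        if stripped.startswith("{"):
            # Everything from here to the end is treated as the final JSON object.
            return progress, json.loads("\n".join(lines[index:]))
        if stripped:
            progress.append(stripped)
    raise ValueError("stream ended without a final JSON object")
```

In practice, `text` would be the concatenation of the chunks yielded by `r.iter_text()` once the stream has closed.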

ultra_deep_thinking — Multi-axis deep research

The most thorough mode. Decomposes the question into research axes, searches each independently, identifies knowledge gaps, fills them, then synthesizes a structured report. Typically produces 10 000–15 000 character reports with 250–300 unique sources. Runs as an async job (15–25 min).

import httpx, time

# 1. Submit
r = httpx.post("http://cloooooo.com/ia", json={
    "prompt": "How does reinforcement learning from human feedback (RLHF) work?",
    "mode": "ultra_deep_thinking",
})
job_id = r.json()["job_id"]

# 2. Poll until done
while True:
    job = httpx.get(f"http://cloooooo.com/job/{job_id}").json()
    print(f"status: {job['status']} | {len(job['progress'])} steps logged")
    if job["status"] == "done":
        print(job["result"]["answer"][:500])
        break
    time.sleep(30)

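Progress visible at each poll (the server logs in French: décomposition = decomposition, axe = axis, recherche en cours = search in progress, lacunes = gaps, comblées = filled, synthèse = synthesis, terminé = done):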

[décomposition] 5 axes : ['Définition', 'Historique', ...]
[axe:Définition] recherche en cours...
[axe:Définition] OK — 4 127 chars, 36 sources
...
[lacunes 1/2] 5 lacunes identifiées → 5/5 comblées
[synthèse] 15 sections · 65 000 chars · 295 sources...
[terminé] 14 067 chars · 280 sources uniques
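
The polling loop above runs until the job reports "done", so a job that never finishes would block forever. A possible variant with a deadline is sketched below. It assumes `progress` is a list of log lines, as the `len(job['progress'])` call suggests, and it makes no assumption about failure status names, which are not documented here. `wait_for_job` is a hypothetical helper.

```python
import time
from typing import Any, Callable


def wait_for_job(
    fetch: Callable[[], dict[str, Any]],
    timeout_s: float = 45 * 60,
    interval_s: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> dict[str, Any]:
    """Poll fetch() until status is "done"; raise TimeoutError after timeout_s."""
    deadline = clock() + timeout_s
    seen = 0
    while True:
        job = fetch()
        progress = job.get("progress", [])
        # Print only the progress lines that appeared since the previous poll.
        for line in progress[seen:]:
            print(line)
        seen = len(progress)
        if job.get("status") == "done":
            return job
        if clock() >= deadline:
            raise TimeoutError(f"job still {job.get('status')!r} after {timeout_s} s")
        sleep(interval_s)
```

With httpx, `fetch` could be `lambda: httpx.get(f"http://cloooooo.com/job/{job_id}").json()`. The `sleep` and `clock` parameters exist only so the loop can be exercised without real waiting.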

Presets (via /ultra_deep_thinking endpoint):

Preset | Axes | Gap rounds | Duration
--- | --- | --- | ---
fast | 3 | 1 | ~5 min
deep (default) | 5 | 2 | ~15 min
ultra | 8 | 3 | ~30 min
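
One way to use the table is to pick the most thorough preset that fits a time budget. The sketch below hard-codes the approximate durations listed above; `PRESETS` and `pick_preset` are hypothetical names, and how the preset is passed to /ultra_deep_thinking is not documented here, so the request itself is left out.

```python
# (name, axes, gap rounds, approximate minutes), copied from the table above.
PRESETS = [
    ("fast", 3, 1, 5),
    ("deep", 5, 2, 15),
    ("ultra", 8, 3, 30),
]


def pick_preset(budget_min: float) -> str:
    """Return the most thorough preset whose approximate duration fits the budget."""
    fitting = [name for name, _axes, _gaps, minutes in PRESETS if minutes <= budget_min]
    if not fitting:
        raise ValueError(f"no preset fits a {budget_min} minute budget")
    # PRESETS is ordered from least to most thorough, so the last match wins.
    return fitting[-1]
```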

GET /health — Server status

curl http://cloooooo.com/health

{
  "status": "ok",
  "version": "0.5.11",
  "model": "Qwen/Qwen3-32B-AWQ",
  "modes": ["simple", "search", "thinking", "deep_thinking", "ultra_deep_thinking", "embed", "rerank", "vision"]
}

POST /embed — Text embeddings

r = httpx.post("http://cloooooo.com/embed", json={
    "texts": ["Hello world", "Machine learning basics"],
})
print(r.json()["dim"])          # 768
print(len(r.json()["embeddings"]))  # 2
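
Embeddings are usually compared with cosine similarity. The following is a dependency-free sketch that works on any two vectors of equal length, such as two entries of the `embeddings` list above; whether the server already normalizes its vectors is not stated here, so the function normalizes explicitly.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors, in [-1, 1]."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)
```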

POST /rerank — Document reranking

r = httpx.post("http://cloooooo.com/rerank", json={
    "query": "machine learning optimization",
    "documents": [
        "Gradient descent is an optimization algorithm for ML",
        "The weather in Paris is sunny today",
        "Adam optimizer adapts learning rates per parameter",
    ],
    "top_k": 2,
})
for item in r.json()["results"]:
    print(f"{item['score']:.3f}  {item['document'][:60]}")

POST /structured — JSON Schema output

r = httpx.post("http://cloooooo.com/structured", json={
    "prompt": "Describe the movie Inception",
    "schema": {
        "type": "object",
        "properties": {
            "title": {"type": "string"},
            "year": {"type": "integer"},
            "genres": {"type": "array", "items": {"type": "string"}},
        },
        "required": ["title", "year", "genres"],
    },
})
print(r.json()["result"])
# {"title": "Inception", "year": 2010, "genres": ["sci-fi", "thriller"]}
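
Even with schema-constrained generation, a defensive client may want to check the result before using it. The sketch below is a deliberately small checker covering only the keywords used in the example above (`type`, `properties`, `required`, `items`); it is not a full JSON Schema validator, and `check_against_schema` is a hypothetical helper.

```python
from typing import Any

# JSON Schema type names mapped to the Python types json.loads produces.
_JSON_TYPES = {
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
    "array": list,
    "object": dict,
}


def check_against_schema(value: Any, schema: dict[str, Any], path: str = "$") -> list[str]:
    """Return a list of problems; an empty list means the value passed."""
    expected = schema.get("type")
    if expected in _JSON_TYPES:
        python_type = _JSON_TYPES[expected]
        # bool is a subclass of int in Python, but JSON keeps them distinct.
        is_bool_as_number = expected in ("integer", "number") and isinstance(value, bool)
        if not isinstance(value, python_type) or is_bool_as_number:
            return [f"{path}: expected {expected}, got {type(value).__name__}"]

    errors: list[str] = []
    if isinstance(value, dict):
        for key in schema.get("required", []):
            if key not in value:
                errors.append(f"{path}: missing required key {key!r}")
        for key, subschema in schema.get("properties", {}).items():
            if key in value:
                errors.extend(check_against_schema(value[key], subschema, f"{path}.{key}"))
    elif isinstance(value, list) and "items" in schema:
        for index, item in enumerate(value):
            errors.extend(check_against_schema(item, schema["items"], f"{path}[{index}]"))
    return errors
```

For production use, a complete validator library is the safer choice; this sketch only catches missing keys and basic type mismatches.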

POST /vision — Image understanding

r = httpx.post("http://cloooooo.com/vision", json={
    "image": "https://example.com/photo.jpg",
    "prompt": "What objects do you see?",
})
print(r.json()["response"])

POST /rag/ingest + POST /rag/ask — RAG

httpx.post("http://cloooooo.com/rag/ingest", json={"path": "/path/to/doc.pdf"})

r = httpx.post("http://cloooooo.com/rag/ask", json={
    "question": "What are the main conclusions?",
    "top_k": 5,
})
print(r.json()["response"])

Supported formats: PDF, DOCX, TXT, Markdown.


Other endpoints

Endpoint | Method | Description
--- | --- | ---
/job/{id} | GET | Poll async job status + progress
/route | POST | Auto-select the best mode for a prompt
/deep_think | POST | Standalone iterative deep research (streaming)
/tools | GET | List available tools
/rag/sources | GET | List ingested documents
/openapi.json | GET | OpenAPI schema
/docs | GET | Interactive API documentation

Streaming

import httpx

with httpx.stream("POST", "http://cloooooo.com/ia", json={
    "prompt": "Write a detailed explanation of CRISPR-Cas9",
    "stream": True,
}) as r:
    for chunk in r.iter_text():
        print(chunk, end="", flush=True)

Self-hosting

On the hosted service, everything listed below (the model server, the search backend, and clovis itself) runs on cloooooo.com, so no local installation beyond the client is required.

To run your own instance (requires an NVIDIA GPU with 24 GB+ VRAM):

# 1. Start SGLang with Qwen3-32B-AWQ
python -m sglang.launch_server \
  --model-path Qwen/Qwen3-32B-AWQ \
  --port 61005 \
  --quantization awq

# 2. Start SearXNG (for web search modes)
docker run -p 8888:8080 searxng/searxng

# 3. Start clovis
clovis serve --port 8000

# Environment variables (export these before starting clovis)
export CLOVIS_LOCAL_URL="http://localhost:61005"
export CLOVIS_MODEL="Qwen/Qwen3-32B-AWQ"
export CLOVIS_API_KEY="sk-..."
export SEARXNG_URL="http://localhost:8888"

Then point the client to your server:

ai = cloooooo(base_url="http://your-server:8000")
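
Before launching a self-hosted instance, a small preflight check can confirm that the environment variables listed above are set. Which of them are strictly required is not documented here, so this sketch simply reports any of the four that are missing or empty; `EXPECTED_VARS` and `missing_settings` are hypothetical names.

```python
import os
from typing import Mapping

# Variable names copied from the self-hosting instructions above.
EXPECTED_VARS = ("CLOVIS_LOCAL_URL", "CLOVIS_MODEL", "CLOVIS_API_KEY", "SEARXNG_URL")


def missing_settings(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the expected variables that are unset or empty in env."""
    return [name for name in EXPECTED_VARS if not env.get(name)]
```

Calling `missing_settings()` with no argument inspects the current process environment; passing a dict makes the check easy to test.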

License

MIT — Clovis Sfeir
