Local-first embedding server: vector generation + index/search over HTTP (ONNX on-device or API providers). The reference /embed server for CPersona.
CEmbedding
Local-first embedding server
Vector embeddings over a tiny HTTP contract.
On-device ONNX or any OpenAI-compatible API. The reference /embed server for CPersona.
Standalone repository, extracted from the (now private) clotohub-servers monorepo so it can be used on its own. ClotoCore users get this through the in-app marketplace (ClotoHub); everyone else can run it directly as described below.
What it is
A small server that turns text into vectors. It speaks a minimal HTTP contract so anything can call it — its primary consumer is CPersona, whose hybrid search uses it for the vector-similarity layer. It can run a model on-device via ONNX (no API key, no network) or proxy an OpenAI-compatible API.
It also exposes an MCP (stdio) surface and an optional persistent vector index (/index, /search), but the HTTP /embed endpoint is all CPersona needs.
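To make the "vector-similarity layer" concrete, here is a minimal sketch of how a consumer might compare vectors returned by /embed. It uses cosine similarity, which is a common choice for embedding search; this README does not state which metric CPersona or the /search endpoint actually uses, so treat the metric and the helper names (`cosine_similarity`, `rank_by_similarity`) as assumptions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen for this sketch: zero vectors match nothing
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, doc_vecs):
    """Return (index, score) pairs sorted from most to least similar."""
    scores = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(doc_vecs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)
```

In practice `query_vec` and `doc_vecs` would come from the `embeddings` array of an /embed response.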
The /embed contract
POST /embed
Request: { "texts": ["string", ...] } # non-empty array, max 100 per batch
Response: { "embeddings": [[float, ...], ...], "dimensions": <int> }
Point any client (e.g. CPersona's CPERSONA_EMBEDDING_URL / generic EMBEDDING_HTTP_URL) at http://127.0.0.1:8401/embed.
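A client for this contract can be sketched with the Python standard library alone. The example below builds the request body described above and splits larger inputs into batches of at most 100 texts. The helper names (`batches`, `build_request`, `embed_all`) and the timeout are illustrative choices, not part of CEmbedding.

```python
import json
import urllib.request

EMBED_URL = "http://127.0.0.1:8401/embed"  # default address from this README
MAX_BATCH = 100  # per-request limit stated in the contract


def batches(texts, size=MAX_BATCH):
    """Split texts into consecutive chunks of at most `size` items."""
    if size < 1:
        raise ValueError("size must be >= 1")
    return [texts[i:i + size] for i in range(0, len(texts), size)]


def build_request(texts, url=EMBED_URL):
    """Build a POST request carrying {"texts": [...]} as JSON."""
    if not texts:
        raise ValueError("texts must be a non-empty array")
    if len(texts) > MAX_BATCH:
        raise ValueError(f"at most {MAX_BATCH} texts per batch")
    body = json.dumps({"texts": list(texts)}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def embed_all(texts, url=EMBED_URL, timeout=30.0):
    """Embed any number of texts, one HTTP call per batch, preserving order."""
    vectors = []
    for chunk in batches(list(texts)):
        with urllib.request.urlopen(build_request(chunk, url), timeout=timeout) as resp:
            payload = json.load(resp)
        vectors.extend(payload["embeddings"])
    return vectors
```

With the server running, `embed_all(["hello world"])` should return a list containing one vector.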
Quick Start (on-device ONNX)
Prerequisites: Python 3.10+. The uvx commands below also require uv to be installed.
# Download a model into ./data/models (jina-v5-nano is what CPersona is tuned for)
uvx --from "cembedding[onnx]" cembedding-download-model --model jina-v5-nano
# Run the server (reads ./data/models from the current directory)
EMBEDDING_PROVIDER=onnx_jina_v5_nano uvx --from "cembedding[onnx]" cembedding
Or install it onto your PATH with pip install "cembedding[onnx]", then run
cembedding-download-model --model jina-v5-nano and cembedding.
From source (development):
git clone https://github.com/Cloto-dev/CEmbedding.git
cd CEmbedding
python -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
pip install ".[onnx]"
python -m cembedding.download_model --model jina-v5-nano
EMBEDDING_PROVIDER=onnx_jina_v5_nano python -m cembedding # or: python server.py
You should see the log line "HTTP embedding endpoint started on http://127.0.0.1:8401/embed". Verify it:
curl -s http://127.0.0.1:8401/embed \
-H 'content-type: application/json' \
-d '{"texts":["hello world"]}' | head -c 200
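For a scripted smoke test, the decoded JSON body can be checked against the response shape from the /embed contract. This sketch assumes the server returns one vector per input text and that `dimensions` equals the length of every vector; the contract implies both, but this README does not spell them out. `check_embed_response` is a hypothetical helper name.

```python
def check_embed_response(payload, expected_count):
    """Validate a decoded /embed response; return its dimensions or raise ValueError."""
    embeddings = payload.get("embeddings")
    dimensions = payload.get("dimensions")
    if not isinstance(embeddings, list) or not isinstance(dimensions, int):
        raise ValueError("response must contain 'embeddings' (list) and 'dimensions' (int)")
    if len(embeddings) != expected_count:
        raise ValueError(f"expected {expected_count} vectors, got {len(embeddings)}")
    for i, vec in enumerate(embeddings):
        if len(vec) != dimensions:
            raise ValueError(f"vector {i} has length {len(vec)}, expected {dimensions}")
    return dimensions
```

Pass it the parsed output of the curl command above together with the number of texts you sent.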
Providers
Set EMBEDDING_PROVIDER:
| Value | Model | Notes |
|---|---|---|
| onnx_jina_v5_nano | jina-v5-nano (33M, 768d) | Local CPU, what CPersona is benchmarked against |
| onnx_bge_m3 | bge-m3 | Local CPU, larger / multilingual |
| onnx_miniml | all-MiniLM-L6-v2 (22M, 384d) | Local CPU, smallest |
| mlx_bge_m3 | bge-m3 (MLX) | Apple Silicon only; install with pip install ".[mlx]" |
| api_openai | provider's model | OpenAI-compatible API; needs EMBEDDING_API_KEY (+ optional EMBEDDING_API_URL, EMBEDDING_MODEL) |
Download a local model with cembedding-download-model --model {miniml,jina-v5-nano,bge-m3} (or python -m cembedding.download_model ... from a source checkout; fetched from HuggingFace into ./data/models, not committed to this repo).
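For the api_openai provider, the upstream request and response follow the OpenAI embeddings format: a JSON body with `model` and `input`, a bearer token header, and a `data` array of items carrying `index` and `embedding`. The sketch below shows one way to translate between that format and the /embed response shape. It illustrates the mapping only; how CEmbedding implements this internally is not shown in this README, and the function names are hypothetical.

```python
import json


def build_openai_request(texts, api_key, model):
    """Return (headers, body) for an OpenAI-compatible embeddings call."""
    headers = {
        "Authorization": f"Bearer {api_key}",  # value of EMBEDDING_API_KEY
        "Content-Type": "application/json",
    }
    body = json.dumps({"model": model, "input": list(texts)})
    return headers, body


def to_embed_response(openai_payload):
    """Convert an OpenAI-style embeddings response into the /embed shape."""
    # Sort by the item index so the output order matches the input order.
    items = sorted(openai_payload["data"], key=lambda item: item["index"])
    embeddings = [item["embedding"] for item in items]
    dimensions = len(embeddings[0]) if embeddings else 0
    return {"embeddings": embeddings, "dimensions": dimensions}
```

The headers and body would be sent to EMBEDDING_API_URL, with `model` taken from EMBEDDING_MODEL.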
Configuration
| Env var | Default | Description |
|---|---|---|
| EMBEDDING_PROVIDER | api_openai | Provider (see table above) |
| EMBEDDING_HTTP_PORT | 8401 | HTTP port for /embed |
| EMBEDDING_INDEX_ENABLED | true | Enable the persistent vector index endpoints (/index, /search, /remove, /purge) |
| ONNX_MODEL_DIR | (auto) | Override the model directory for ONNX providers |
| ONNX_EP_PREFERENCE | (auto) | ONNX execution providers, comma-separated. Empty = auto (CoreML on macOS, DirectML on Windows, else CPU; CPU always ensured) |
| ONNX_MAX_SEQ_LEN | 2048 | Max tokenization length (1–8192; MiniLM clamped to 512 internally) |
| EMBEDDING_API_KEY | (none) | Required for api_openai |
| EMBEDDING_API_URL | https://api.openai.com/v1/embeddings | API endpoint for api_openai |
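The defaults and constraints in this table can be read as the following sketch, which resolves a subset of the variables from an environment mapping. Several details are assumptions made for illustration: which strings count as "true", that an out-of-range ONNX_MAX_SEQ_LEN is rejected rather than clamped, and the `load_config` name itself. The real server may behave differently.

```python
import os


def load_config(env=None):
    """Resolve settings from env vars using the defaults in the table above."""
    env = os.environ if env is None else env
    provider = env.get("EMBEDDING_PROVIDER", "api_openai")
    port = int(env.get("EMBEDDING_HTTP_PORT", "8401"))

    # Assumption: these spellings enable the index; anything else disables it.
    raw_index = env.get("EMBEDDING_INDEX_ENABLED", "true")
    index_enabled = raw_index.strip().lower() in ("1", "true", "yes", "on")

    max_seq_len = int(env.get("ONNX_MAX_SEQ_LEN", "2048"))
    if not 1 <= max_seq_len <= 8192:
        # Assumption: reject out-of-range values instead of clamping them.
        raise ValueError("ONNX_MAX_SEQ_LEN must be between 1 and 8192")
    if provider == "onnx_miniml":
        max_seq_len = min(max_seq_len, 512)  # MiniLM is clamped to 512

    # Empty list means "auto": let the server pick execution providers.
    raw_eps = env.get("ONNX_EP_PREFERENCE", "")
    ep_preference = [p.strip() for p in raw_eps.split(",") if p.strip()]

    api_key = env.get("EMBEDDING_API_KEY")
    if provider == "api_openai" and not api_key:
        raise ValueError("EMBEDDING_API_KEY is required for api_openai")

    return {
        "provider": provider,
        "port": port,
        "index_enabled": index_enabled,
        "max_seq_len": max_seq_len,
        "ep_preference": ep_preference,
        "api_url": env.get("EMBEDDING_API_URL", "https://api.openai.com/v1/embeddings"),
    }
```

Passing a plain dict as `env` makes the function easy to exercise without touching the real environment.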
Use with CPersona
Run this server, then tell CPersona to use it:
# CPersona MCP config env
CPERSONA_EMBEDDING_MODE=http
CPERSONA_EMBEDDING_URL=http://127.0.0.1:8401/embed
Without an embedding server CPersona still works (FTS5 + keyword search); adding one enables the vector-similarity layer.
License
MIT — see LICENSE.
File details
Details for the file cembedding-0.5.0.tar.gz.
File metadata
- Download URL: cembedding-0.5.0.tar.gz
- Upload date:
- Size: 18.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.11.18
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | c5393dddfbf36d1be9cc5e762477cc68658076c329251e0efcad39ec5836284b |
| MD5 | 6584b5a9ba0bc042c0f9b8e2c3542547 |
| BLAKE2b-256 | 8c31e2b8d0df68013bf4abacfcea0190547cc4a2261907d0f93af26ecbc1c888 |
File details
Details for the file cembedding-0.5.0-py3-none-any.whl.
File metadata
- Download URL: cembedding-0.5.0-py3-none-any.whl
- Upload date:
- Size: 20.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.11.18
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | bef5fcb3efc196d437b7d56026439eb9add6acc4f8f22f85cf49d458c688ef3f |
| MD5 | 270a5491815503d59077325431f4d455 |
| BLAKE2b-256 | 25161187bed6a927fe87976c914031dc4fc93a0b13545e4b5ac2358081f21ef4 |