
cogito-ergo

Memory retrieval for AI agents with structural fidelity. Two purpose-built paths: the existing /recall scores 85% R@1 on cogito's internal 31-case atomic-fact eval; the new /recall_hybrid tier scores 93.4% R@1 on LongMemEval_S (session-retrieval benchmark, see Hybrid recall). Different workloads, different tools. The filter LLM outputs only integers — it structurally cannot corrupt, rephrase, or hallucinate into the content returned to your agent.

Python 3.10+ · MIT License · Made by Hermes Labs


The Problem

Every retrieval system that uses an LLM to select or rank memories has the same failure mode: the LLM rephrases on the way out. You store "auth tokens expire after 3600 seconds" and get back "authentication has a configurable timeout." The specific fact is gone.

  • Raw vector search returns candidates by similarity, but precision plateaus at 50–60% R@1 on real workloads
  • LLM-based re-rankers improve relevance but generate text — they summarize, merge, or hallucinate into the content your agent receives
  • Full RAG pipelines add latency and cost without solving the fidelity problem

cogito-ergo fixes this structurally. The filter LLM outputs only integer pointers ([3, 7, 12]). The server dereferences them to verbatim stored text. The LLM never sees, generates, or touches memory content. Fidelity is architectural, not a prompting convention.

| Mode | R@1 | hit@any | Latency |
|---|---|---|---|
| Combined (snapshot + recall) | 85% | 96% | 1303ms |
| recall only | 63% | 81% | 1197ms |
| recall_b (zero-LLM) | 56% | 96% | 127ms |

31 test cases, qwen3.5:2b filter model, fully local. $0/month.


Architecture

                    ┌─────────────────────────────────────────────┐
                    │              cogito-ergo server              │
                    │                 :19420                       │
                    └──────────────────┬──────────────────────────┘
                                       │
                    ┌──────────────────▼──────────────────────────┐
                    │             SNAPSHOT LAYER                   │
                    │   Compressed markdown index (~741 tokens)    │
                    │   Built once from corpus via `cogito snapshot`│
                    │   Returned with /recall — no vector search   │
                    │   Solves cross-reference queries (0%→50% R@1)│
                    └──────────────────┬──────────────────────────┘
                                       │
                    ┌──────────────────▼──────────────────────────┐
                    │         STAGE 1 — recall_b (zero-LLM)       │
                    │   Query decomposition → sub-queries          │
                    │   Stop-word stripping + bigrams + trigrams   │
                    │   Vocab expansion (cogito calibrate)         │
                    │   Up to 8 sub-queries, merged with RRF       │
                    │   Latency: ~127ms                            │
                    └──────────────────┬──────────────────────────┘
                                       │ up to 100 candidates
                    ┌──────────────────▼──────────────────────────┐
                    │       STAGE 2 — integer-pointer filter       │
                    │   Filter LLM sees: [1] text  [2] text ...   │
                    │   Filter LLM outputs: [3, 7, 12]  ← ONLY    │
                    │   Server fetches candidates[3], [7], [12]   │
                    │   Returns: verbatim stored text              │
                    │   Added latency: ~1176ms                     │
                    └─────────────────────────────────────────────┘

The filter LLM never generates memory text. Out-of-range integers are silently ignored. Fidelity is a structural property of the pipeline, not a prompting convention.
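
For readers unfamiliar with RRF, here is a minimal sketch of Reciprocal Rank Fusion as used to merge the Stage 1 sub-query runs. The k constant and data shapes are illustrative, not cogito's internals:

# Minimal Reciprocal Rank Fusion sketch. The k=60 constant and the
# list-of-runs input shape are illustrative assumptions.
def rrf_merge(runs, k=60):
    """runs: list of ranked lists of memory ids (best first)."""
    scores = {}
    for run in runs:
        for rank, mem_id in enumerate(run, start=1):
            scores[mem_id] = scores.get(mem_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Merging three hypothetical sub-query runs into one candidate ordering:
merged = rrf_merge([["a", "b", "c"], ["b", "d"], ["c", "b"]])
print(merged)  # ids that rank well across several runs float to the top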


Benchmarks

Measured 2026-03-28. 31 test cases, qwen3.5:2b as filter model (local Ollama).

| Mode | R@1 | hit@any | MRR | Latency |
|---|---|---|---|---|
| Combined (snapshot + recall) | 85% | 96% | 0.878 | 1303ms |
| recall only | 63% | 81% | | 1197ms |
| recall_b (zero-LLM) | 56% | 96% | | 127ms |
| snapshot only | 41% | | | |

Key results:

  • Snapshot layer contributes +15% hit@any vs recall-only
  • Cross-reference queries: recall alone gets 0% R@1; combined gets 50%
  • recall_b matches combined hit@any (96%) at 10x lower latency — use it when cost matters

Quick Start

1. Install

pip install cogito-ergo

2. Pull Ollama models

ollama pull mistral:7b
ollama pull nomic-embed-text

Requires a running Ollama instance.

3. Configure filter LLM

# Option A: direct Anthropic key
export ANTHROPIC_API_KEY=sk-ant-...

# Option B: any OpenAI-compatible gateway (local LM Studio, OpenClaw, etc.)
export COGITO_FILTER_ENDPOINT=http://your-gateway/
export COGITO_FILTER_TOKEN=your-token
export COGITO_FILTER_MODEL=anthropic/claude-haiku-4-5

Or via .cogito.json in your working directory:

{
  "filter_endpoint": "http://your-gateway/",
  "filter_token": "your-token",
  "filter_model": "anthropic/claude-haiku-4-5"
}

4. Start the server

cogito-server
# or: cogito server

5. First recall

cogito recall "what did we decide about the auth architecture"
curl -X POST http://127.0.0.1:19420/recall \
  -H "Content-Type: application/json" \
  -d '{"text": "auth architecture decisions"}'

Hybrid recall (93.4% R@1 on LongMemEval_S)

recall_hybrid is an opt-in retrieval path that ports the architecture benchmarked at 93.4% R@1 on LongMemEval_S (up from a 56% mem0 baseline). It honors the same integer-pointer fidelity contract as /recall: only indices cross the LLM boundary.

Query
  │
  ▼
┌───────────────────────────────────────────────────────────────┐
│ Stage 1 — hybrid retrieval (zero-LLM)                          │
│   • sub-query decomposition + vocab expansion (recall_b logic) │
│   • dense retrieval with nomic search_query: / search_document:│
│     prefixes                                                    │
│   • BM25 over the candidate pool (bm25s, optional extra)       │
│   • Reciprocal Rank Fusion across runs                         │
│   • cosine-blended rerank against the original query           │
└───────────────────────────────────────────────────────────────┘
  │
  ▼
┌───────────────────────────────────────────────────────────────┐
│ Router (regex classifier)                                      │
│   • "you told me" / "you said"  → skip (keep Stage 1)          │
│   • "how many" / "what date"    → call cheap filter            │
│   • everything else             → keep Stage 1 at filter tier  │
└───────────────────────────────────────────────────────────────┘
  │
  ▼
┌───────────────────────────────────────────────────────────────┐
│ Stage 2 — cheap filter  (tier="filter", default)               │
│   500-char snippets, integer-pointer output                    │
└───────────────────────────────────────────────────────────────┘
  │
  ▼
┌───────────────────────────────────────────────────────────────┐
│ Stage 3 — flagship rerank  (tier="flagship", opt-in)           │
│   2000-char snippets, stronger model, integer output           │
│   Called when Stage 1 confidence is low or filter failed       │
└───────────────────────────────────────────────────────────────┘
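
A minimal sketch of the router idea in Python. The exact patterns are assumptions; only the routing behavior follows the diagram above:

# Illustrative regex router. The patterns shown are assumptions, not cogito's exact ones.
import re

SKIP_FILTER = re.compile(r"\byou (told|said)\b", re.IGNORECASE)
NEEDS_FILTER = re.compile(r"\b(how many|what date)\b", re.IGNORECASE)

def route(query, default_tier="filter"):
    if SKIP_FILTER.search(query):
        return "zero_llm"   # keep the Stage 1 ordering, skip the filter call
    if NEEDS_FILTER.search(query):
        return "filter"     # counting/temporal queries benefit from the cheap filter
    return default_tier     # everything else stays at the requested tier

print(route("how many auth migrations have we done"))  # filter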

Tier tradeoffs

| Tier | Latency | R@1 (LongMemEval_S) | External calls | When to use |
|---|---|---|---|---|
| zero_llm | ~500ms | | none | latency-sensitive paths, cost-sensitive ops |
| filter | ~1300ms | 90%+ | cheap filter LLM | default; temporal/counting queries benefit |
| flagship | ~3500ms | 93.4% | filter + flagship model | hard queries, long sessions, benchmark setup |

Quick start

# Optional dependency for best BM25 fusion (zero deps fallback if absent)
pip install cogito-ergo[hybrid]

# Opt-in: set a filter endpoint (any OpenAI-compatible API)
export COGITO_FILTER_ENDPOINT=https://dashscope-intl.aliyuncs.com/compatible-mode/v1
export COGITO_FILTER_TOKEN=sk-your-key
export COGITO_FILTER_MODEL=qwen-turbo

# Optional: flagship tier (stronger model, 4x larger context window)
export COGITO_FLAGSHIP_ENDPOINT=https://dashscope-intl.aliyuncs.com/compatible-mode/v1
export COGITO_FLAGSHIP_TOKEN=sk-your-key
export COGITO_FLAGSHIP_MODEL=qwen-max
# Or simpler — if DASHSCOPE_API_KEY is set, the flagship tier auto-configures
export DASHSCOPE_API_KEY=sk-your-key

cogito recall-hybrid "auth architecture decisions" --tier filter

HTTP:

curl -X POST http://127.0.0.1:19420/recall_hybrid \
  -H "Content-Type: application/json" \
  -d '{"text": "how many auth migrations have we done", "tier": "flagship", "limit": 10}'

Python:

from cogito.recall_hybrid import recall_hybrid
from cogito.config import load, mem0_config
from mem0 import Memory

cfg = load()
mem = Memory.from_config(mem0_config(cfg))
hits, method = recall_hybrid(mem, "auth tokens", user_id="agent", cfg=cfg, tier="filter")

Graceful degradation: if the filter endpoint isn't configured, tier="filter" falls back to zero_llm. If the flagship endpoint isn't configured, tier="flagship" falls back to filter (which itself may degrade to zero_llm). Nothing raises on missing credentials.
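
Because the fallbacks are silent, the returned method string is the signal for which tier actually ran. Continuing the Python example above, a minimal check (it assumes the documented method naming, e.g. "hybrid_8_bm25|filter|flagship"):

# Inspect which path actually executed when credentials may be missing.
hits, method = recall_hybrid(mem, "auth tokens", user_id="agent", cfg=cfg, tier="flagship")
if "flagship" not in method:
    # flagship endpoint not configured or not reached; a lower tier handled the query
    print("degraded to:", method)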

Regression notice

The hybrid path was validated at 93.4% R@1 on LongMemEval_S (multi-turn dialog retrieval with turn-level chunking and session-date scaffolds). On the internal 31-case eval (which measures keyword recall over a mem0 store of short atomic memories), /recall_hybrid scores lower on R@1 than the existing /recall path — they solve slightly different problems. The hybrid path wins on hit@any, semantic-gap queries, and multi-memory aggregation; the existing path wins on prefix-style direct lookup. Default behavior is unchanged — /recall still drives cogito recall. Use /recall_hybrid or cogito recall-hybrid when you need the hybrid path's strengths.


Claude Code session memory

cogito-ergo v0.3.0 can ingest your Claude Code sessions and query them with the same turn-pair chunking used in the LongMemEval benchmark.

Why it matters: The 93.4% R@1 benchmark result requires role-structured session data (user+assistant turn pairs). Flat atomic memories don't have this structure. Claude Code stores every session as role-structured JSONL — wiring it to cogito uses the same retrieval architecture as the benchmark, though Claude Code sessions have not been independently evaluated at the same scale.

Install:

# No new dependencies — uses existing chromadb + Ollama (nomic-embed-text)

Ingest:

# Preview (dry run)
python3 -m cogito.ingest_claude_sessions --since 2026-04-11 --dry-run

# Ingest last 7 days
python3 -m cogito.ingest_claude_sessions --since 2026-04-11

# Ingest all sessions (may take a few minutes)
python3 -m cogito.ingest_claude_sessions

Query:

from cogito.recall_sessions import query_sessions, query_both

# Session history only
results = query_sessions("what did we discuss about LPCI last week?", top_k=3)

# Atomic facts + session history side-by-side (no auto-merge)
both = query_both("cogito architecture decisions")

MCP tools (new in v0.3.0):

  • cogito_recall_sessions — query Claude Code sessions
  • cogito_recall_both — 3 atomic + 3 session results side-by-side

Expected accuracy: ~80-93% depending on query specificity (same pipeline as the benchmark; accuracy drops when querying across very long sessions because the session embedding averages across all turn content).

Privacy: All data stays local. Embedding uses Ollama (localhost:11434, nomic-embed-text). ChromaDB at ~/.cogito/store. Nothing sent to any cloud API. Ingestion is explicit — nothing runs automatically. Re-running ingest is idempotent (dedup via content hash).
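
That dedup is why re-running ingest is safe. A minimal sketch of content-hash dedup (the hashing scheme shown is illustrative, not necessarily cogito's exact id format):

# Illustrative content-hash dedup: identical text always maps to the same id,
# so re-running ingestion cannot create duplicates.
import hashlib

def memory_id(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

store = {}
for chunk in ["user: fix auth\nassistant: done", "user: fix auth\nassistant: done"]:
    store[memory_id(chunk)] = chunk  # second pass overwrites, store size unchanged

print(len(store))  # 1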

Full demo: docs/claude-code-memory-demo.md

HTTP API

All endpoints return JSON. Server runs on port 19420 by default.

GET /health

{"status": "ok", "count": 1484, "version": "0.2.0", "calibrated": true, "snapshot": true}

Fields: count = total memories in store; calibrated = vocab_map present; snapshot = snapshot.md exists.


GET /snapshot

Returns the compressed index markdown built by cogito snapshot.

{"snapshot": "## Projects\n- **cogito-ergo** — ...", "path": "/home/user/.cogito/snapshot.md"}

Returns 404 if no snapshot has been built yet.


POST /query

Narrow vector search. L2 threshold filter only. No LLM call. Fast.

Request:

{"text": "query string", "limit": 5}

Response:

{"memories": [{"text": "...", "score": 93.4}]}

POST /recall

Broad search + integer-pointer filter. Two stages: zero-LLM RRF candidate pool, then cheap LLM selects by index.

Request:

{"text": "query string", "limit": 50, "threshold": 400}

Response:

{"memories": [{"text": "...", "score": 93.4}], "method": "filter"}

method field: "filter" = filter ran successfully; "fallback_*" = graceful degradation, all candidates returned instead. Possible fallbacks: fallback_no_endpoint, fallback_unreachable, fallback_parse_error, fallback_error.


POST /recall_b

Zero-LLM recall only. Sub-query decomposition + RRF. 127ms latency. Same hit@any as combined (96%) at lower cost.

Request:

{"text": "query string", "limit": 50}

Response:

{"memories": [{"text": "...", "score": 0.016}], "method": "decompose_4_v"}

method field: "decompose_N" = N sub-queries ran; "decompose_N_v" = vocab expansion applied.


POST /recall_hybrid

Hybrid BM25 + dense + RRF retrieval with tiered LLM escalation. Port of the architecture that reached 93.4% R@1 on LongMemEval_S. See Hybrid recall for the full diagram and tradeoffs.

Request:

{"text": "query string", "limit": 50, "tier": "filter", "top_k": 5}

tier is one of "zero_llm", "filter" (default), "flagship". top_k is how many candidates the reranker sees (default 5).

Response:

{"memories": [{"text": "...", "score": 0.72}], "method": "hybrid_12_bm25|filter"}

method field encodes the path taken (e.g. "hybrid_24_bm25|default_s1" = Stage 1 order kept; "hybrid_12_bm25|filter" = cheap filter reranked; "hybrid_8_bm25|filter|flagship" = both reranks ran; "hybrid_6_nobm25_v|…" = bm25s not installed, vocab expansion active).
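
Splitting the method string shows which stages ran; a small example built on the documented format:

# Decompose a recall_hybrid method string into its stages (format as documented above).
method = "hybrid_8_bm25|filter|flagship"
stage1, *reranks = method.split("|")
print(stage1)   # "hybrid_8_bm25": Stage 1 pool of 8 candidates, bm25s active
print(reranks)  # ["filter", "flagship"]: which LLM reranks ran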


POST /store

Write one memory verbatim. No extraction LLM. Agent decides the content.

This is the preferred write path for agent-curated content.

Request:

{"text": "Switched from JWT to session tokens on 2026-03-27 due to compliance requirement", "id": "<optional uuid>"}

Response:

{"id": "abc123...", "text": "Switched from JWT to session tokens..."}

POST /add

Feed text through mem0's extraction LLM before storing. Extracts multiple atomic facts from unstructured input.

Use when you have raw/unstructured text and want automatic fact extraction.

Request:

{"text": "free-form text to remember"}

Response:

{"count": 3, "memories": ["extracted fact 1", "extracted fact 2", "extracted fact 3"]}

CLI Reference

All CLI commands talk to the running HTTP server.

| Command | Description |
|---|---|
| `cogito recall "query"` | Two-stage recall via running server |
| `cogito recall "query" --limit 50 --raw` | Raw JSON output |
| `cogito recall-hybrid "query" --tier filter` | Hybrid BM25+dense+RRF recall (93.4% R@1 arch) |
| `cogito recall-hybrid "query" --tier flagship --top-k 5` | Same, with flagship escalation on hard queries |
| `cogito query "query"` | Simple vector query, no filter |
| `cogito add "text"` | Add a memory via /add (mem0 extraction) |
| `cogito seed ~/notes/` | Bulk-seed from markdown files via /store |
| `cogito seed ~/notes/ --add` | Bulk-seed using /add (extraction mode) |
| `cogito seed ~/notes/ --dry-run` | Preview without writing |
| `cogito seed ~/notes/ --glob "*.txt"` | Custom file pattern |
| `cogito snapshot` | Build compressed index layer |
| `cogito snapshot --rebuild` | Force rebuild of snapshot |
| `cogito snapshot --dry-run` | Preview snapshot without writing |
| `cogito calibrate` | Build vocab bridge from corpus (one-time) |
| `cogito calibrate --dry-run` | Preview vocab mappings |
| `cogito health` | Check server status |
| `cogito server` | Start the server (alias for cogito-server) |
| `cogito-server --port 19420` | Start server directly |
| `cogito-server --config /path/to.json` | Start with explicit config file |

Configuration

Priority: env vars > .cogito.json > defaults.

Config file is searched at ./.cogito.json (cwd) then ~/.cogito/config.json.

| Env var | Config key | Default | Description |
|---|---|---|---|
| COGITO_PORT | port | 19420 | Server port |
| COGITO_USER_ID | user_id | "agent" | Memory namespace (isolates stores) |
| COGITO_FILTER_ENDPOINT | filter_endpoint | | OpenAI-compatible base URL for filter LLM |
| COGITO_FILTER_TOKEN | filter_token | | Bearer token for filter endpoint |
| COGITO_FILTER_MODEL | filter_model | anthropic/claude-haiku-4-5 | Filter LLM model name |
| COGITO_FILTER_TIMEOUT_MS | filter_timeout_ms | 12000 | Filter LLM timeout in ms |
| ANTHROPIC_API_KEY | anthropic_api_key | | Direct Anthropic key (alternative to endpoint+token) |
| COGITO_STORE_PATH | store_path | ~/.cogito/store | ChromaDB persistence path |
| COGITO_COLLECTION | collection | cogito_memory | ChromaDB collection name |
| COGITO_OLLAMA_URL | ollama_url | http://localhost:11434 | Ollama base URL |
| COGITO_LLM_MODEL | llm_model | mistral:7b | LLM for fact extraction (/add) |
| COGITO_EMBED_MODEL | embed_model | nomic-embed-text | Embedding model |
| COGITO_RECALL_LIMIT | recall_limit | 50 | Candidate pool size for /recall and /recall_b |
| COGITO_RECALL_THRESHOLD | recall_threshold | 400.0 | L2 cutoff for /recall candidates |
| COGITO_QUERY_THRESHOLD | query_threshold | 250.0 | L2 cutoff for /query results |
| COGITO_FLAGSHIP_ENDPOINT | flagship_endpoint | | OpenAI-compatible base URL for flagship rerank (recall_hybrid tier="flagship") |
| COGITO_FLAGSHIP_TOKEN | flagship_token | | Bearer token for flagship endpoint |
| COGITO_FLAGSHIP_MODEL | flagship_model | | Flagship model name (e.g. qwen-max) |
| COGITO_FLAGSHIP_TIMEOUT_MS | flagship_timeout_ms | 30000 | Flagship LLM timeout in ms |
| DASHSCOPE_API_KEY | | | If set, recall_hybrid auto-configures flagship to DashScope qwen-max |
| COGITO_HYBRID_COSINE_WEIGHT | hybrid_cosine_weight | 0.7 | Cosine vs RRF blend weight for hybrid retrieval (0..1) |

filter_endpoint accepts any OpenAI-compatible API: Anthropic gateway, LM Studio, Ollama's /v1 compat layer, OpenClaw, etc.

For Ollama qwen3/qwen3.5 models used as filter, cogito automatically switches to the native Ollama /api/chat endpoint with think: false to suppress thinking mode.


Python API

from cogito.recall import recall
from cogito.recall_b import recall_b
from cogito.recall_hybrid import recall_hybrid
from cogito.config import load, mem0_config
from mem0 import Memory

cfg = load()  # reads .cogito.json + env vars
memory = Memory.from_config(mem0_config(cfg))

# Two-stage recall (recommended default)
memories, method = recall(memory, "auth architecture", user_id=cfg["user_id"], cfg=cfg)
for m in memories:
    print(m["text"])  # verbatim stored text, never rephrased

# Zero-LLM recall (fast path)
memories, method = recall_b(memory, "auth architecture", user_id=cfg["user_id"], cfg=cfg)

# Hybrid recall (BM25 + dense + RRF + tiered LLM; 93.4% R@1 on LongMemEval_S)
memories, method = recall_hybrid(
    memory, "auth architecture", user_id=cfg["user_id"], cfg=cfg,
    tier="filter",  # "zero_llm" | "filter" | "flagship"
)

print(method)  # e.g. "filter", "decompose_4_v", "hybrid_12_bm25|filter"

Why integers?

When the filter LLM outputs only [3, 7, 12]:

  • It cannot rephrase memory text — it never generates it
  • It cannot hallucinate new facts into the output
  • It cannot summarize two memories into one
  • An out-of-range integer is silently ignored; it cannot inject noise

Compare to asking the LLM to "return the relevant passages" — even with careful prompting, LLMs will reword, compress, or merge content. The integer-pointer pattern makes fidelity a structural property of the pipeline, not a prompt engineering goal.

The filter prompt:

Output ONLY a JSON array of integers, ordered from most to least relevant.
Examples: [1, 4, 7]   or   []

The server then picks candidates[i] — verbatim stored text — for each valid integer i. The path from storage to retrieval contains no text generation.
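
A minimal sketch of that dereference step (the parsing details are illustrative):

# Illustrative dereference of the filter LLM's integer output.
import json

candidates = ["auth tokens expire after 3600 seconds",
              "snapshot rebuilt nightly",
              "switched from JWT to session tokens"]

llm_output = "[3, 1]"                       # 1-based indices, most relevant first
indices = json.loads(llm_output)
selected = [candidates[i - 1] for i in indices
            if isinstance(i, int) and 1 <= i <= len(candidates)]  # out-of-range ignored
print(selected)  # verbatim stored text, in the LLM's relevance order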


Setup Recommendations

Before benchmarking or deploying — run zer0lint first.

Ingestion quality directly limits retrieval quality. A poorly formatted memory store will underperform regardless of retrieval method. The technical extraction prompt baked into cogito's default config was validated with zer0lint diagnostics, which took its ingestion-quality score from 0% to 100%.

Session start pattern (agent integration):

# 1. Load snapshot into context once at session start
import urllib.request, json
resp = urllib.request.urlopen("http://127.0.0.1:19420/snapshot")
snapshot = json.loads(resp.read())["snapshot"]
# inject snapshot into system prompt or first user message

# 2. Query per-message via /recall
# 3. Write new facts via /store (agent-curated) or /add (extraction)

Calibrate for domain-specific vocabulary:

cogito calibrate  # reads your corpus, writes vocab_map to .cogito.json
# then restart server to pick up new vocab_map

Calibration builds a plain-English → technical term bridge. Example: "how fast" → ["latency", "throughput", "ms"]. Improves recall_b on domain-specific queries without adding LLM calls.
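
To make the bridge concrete, here is an illustrative sketch of how a vocab_map entry expands a plain-English query into extra sub-queries. The mapping and the expansion logic are assumptions, not cogito's internals:

# Illustrative vocab expansion: the mapping shown is an example, not generated output.
vocab_map = {"how fast": ["latency", "throughput", "ms"]}

def expand(query: str) -> list[str]:
    subqueries = [query]
    for phrase, terms in vocab_map.items():
        if phrase in query.lower():
            subqueries += [query + " " + t for t in terms]
    return subqueries

print(expand("how fast is recall"))
# ['how fast is recall', 'how fast is recall latency',
#  'how fast is recall throughput', 'how fast is recall ms']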


Built by Hermes Labs

cogito-ergo is part of the Hermes Labs AI agent tooling suite:

  • zer0lint — Memory extraction diagnostics. Run before benchmarking to verify store quality. The technical extraction prompt in cogito's default config was validated against zer0lint.
  • zer0dex — Dual-layer memory architecture pattern that cogito-ergo implements.
  • lintlang — Static linter for AI agent tool descriptions and prompts
  • Little Canary — Prompt injection detection
  • Suy Sideguy — Runtime policy enforcement for agents
  • cogito-ergo — Two-stage memory retrieval ← you are here

Roadmap

  • Pluggable vector backends (pgvector, Qdrant, LlamaIndex)
  • Pluggable extraction backends (non-Ollama)
  • Session flush utility (end-of-session seeding)
  • Benchmark harness as public CLI (cogito bench)
  • Streaming /recall response

License

MIT
