Software Cognition Engine — MCP-native context capsules for LLM coding agents.

cognis

Software Cognition Engine for MCP clients and coding agents.

Status: v0.3.2 beta — see CHANGELOG.md and docs/release-notes-v0.3.0.md.

cognis is a local indexing and retrieval system for source code. It builds a workspace database from your repository and exposes structured queries such as symbol lookup, semantic search, dependency tracing, and task-oriented context retrieval to MCP-compatible tools.

What makes it different: CSAR

Most AI code tools (Cursor, Cody, and earlier versions of cognis) rank code by embedding KNN + BM25, scoring each symbol independently. That misses the flow of code: a function on the call path between two relevant symbols is invisible if it has no direct keyword or embedding match.

cognis is built around CSAR — Code Spreading-Activation Retrieval. CSAR seeds a relevance distribution from cheap lexical + semantic matches, then diffuses it across the code knowledge graph using Personalized PageRank (random walk with restart). Results:

  • Recovers full flow. On-path callers/callees surface even with zero direct match — solving the "missing structure" failure of pure embedding search.
  • Repo-size-independent cost. The forward-push solver has a provable work bound of 1/(α·ε), independent of repository size, so improving recall does not require more grepping or embedding as the codebase grows.
  • One tunable operator. A single parameter α provably interpolates between pure semantic (α→1) and pure structural (α→0) retrieval.
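The diffusion step described above can be sketched with the textbook forward-push approximation of Personalized PageRank. This is a minimal illustration under stated assumptions, not the cognis implementation: the function name, the adjacency-dict graph format, and the default values of alpha and eps are all hypothetical. Each push on a node with neighbours settles at least α·ε·deg of mass, and the total mass is 1, which is where the 1/(α·ε) work bound comes from.

```python
from collections import defaultdict, deque


def forward_push(graph, seeds, alpha=0.2, eps=1e-4):
    """Approximate Personalized PageRank by local forward push.

    graph: dict mapping node -> list of neighbour nodes
    seeds: dict mapping node -> initial relevance (should sum to 1)
    alpha: restart probability; eps: per-degree residual tolerance
    Returns (scores, residuals).
    """
    scores = defaultdict(float)           # settled relevance per node
    residual = defaultdict(float, seeds)  # mass still waiting to be pushed
    queue = deque(seeds)

    while queue:
        u = queue.popleft()
        neighbours = graph.get(u, [])
        mass = residual[u]
        # Skip nodes whose residual is already below the tolerance.
        if mass <= eps * max(len(neighbours), 1):
            continue
        residual[u] = 0.0
        if not neighbours:
            # Dangling node: nothing to diffuse to, so keep all the mass here.
            scores[u] += mass
            continue
        scores[u] += alpha * mass
        share = (1.0 - alpha) * mass / len(neighbours)
        for v in neighbours:
            residual[v] += share
            if residual[v] > eps * max(len(graph.get(v, [])), 1):
                queue.append(v)
    return dict(scores), dict(residual)


# Toy undirected call graph: handler <-> validate <-> save, plus an unrelated node.
graph = {
    "handler": ["validate"],
    "validate": ["handler", "save"],
    "save": ["validate"],
    "unrelated": [],
}
# Only the two endpoints match the query directly.
seeds = {"handler": 0.5, "save": 0.5}
scores, residual = forward_push(graph, seeds)
for name, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {score:.3f}")
```

In this toy graph validate has no seed mass of its own, yet it receives a score because it sits on the path between the two seeded symbols, while unrelated is never touched. Raising alpha keeps more mass on the seeds; lowering it lets the graph structure dominate.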

CSAR is grounded in five theorems (existence/uniqueness, geometric convergence, mass conservation, endpoint limits, and the forward-push cost bound), each verified in code by unit and property-based tests. See docs/csar.md for the math and proofs.
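For orientation, the standard formulation of Personalized PageRank behind several of these properties is summarized below. The notation is an assumption made for this summary (s is the seed distribution as a row vector, P a row-stochastic transition matrix over the code graph) and may differ from docs/csar.md, which remains the authoritative statement.

```latex
\begin{aligned}
\pi_\alpha &= \alpha\, s + (1-\alpha)\, \pi_\alpha P
  && \text{(fixed point, } 0 < \alpha \le 1\text{)} \\
\pi_\alpha &= \alpha\, s \,\bigl(I - (1-\alpha) P\bigr)^{-1}
  = \alpha \sum_{t \ge 0} (1-\alpha)^t\, s P^t
  && \text{(unique solution)} \\
\lVert \pi^{(t)} - \pi_\alpha \rVert_1 &\le (1-\alpha)^t\, \lVert \pi^{(0)} - \pi_\alpha \rVert_1
  && \text{(geometric convergence)} \\
\lVert \pi_\alpha \rVert_1 &= \lVert s \rVert_1 = 1
  && \text{(mass conservation)} \\
\lim_{\alpha \to 1} \pi_\alpha &= s
  && \text{(pure seed ranking)}
\end{aligned}
```

Here the iterates follow the update π^(t+1) = α·s + (1−α)·π^(t)·P. As α approaches 0 the seed's influence fades and the ranking is increasingly determined by graph structure alone.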

The flagship MCP tool is diffuse_context, which returns the unified ranked shortlist in a single round trip — replacing separate discover_symbols + dependency_trace calls.
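As a rough illustration of what a single round trip looks like on the wire, the sketch below builds an MCP tools/call request for diffuse_context. The JSON-RPC envelope follows the MCP specification; the argument names query and k are hypothetical placeholders, since the tool's input schema is not described here. A real client would also complete the MCP initialize handshake before sending this request.

```python
import json


def build_tool_call(request_id, tool_name, arguments):
    """Build an MCP JSON-RPC 2.0 `tools/call` request as a dict."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# "query" and "k" are hypothetical argument names used for illustration only;
# check the tool's published input schema for the real ones.
request = build_tool_call(
    1, "diffuse_context", {"query": "where are sessions refreshed?", "k": 10}
)
print(json.dumps(request, indent=2))
```

In practice an MCP client library or the editor extension produces this message for you; the sketch only shows the shape of the call.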

What it does

cognis is useful when file-level search is not enough. Instead of returning only raw files, it stores code structure and retrieval metadata so clients can request focused context about symbols, relationships, and likely problem areas.

Repo layout

cognis/
├── apps/
│   ├── cognis-cli/        # Click-based CLI: init, index, eval, health, up/down
│   ├── cognis-mcpd/       # FastMCP server (stdio at MVP, SSE in Phase 2)
│   ├── cognis-indexd/     # Indexer daemon: watcher → parser → enricher → embedder → writer
│   └── cognis-vscode/     # VS Code / Cursor extension
├── packages/
│   ├── core/              # data model, planner, capsule composer, schemas
│   ├── retrieval/         # CSAR diffusion + lexical/semantic/structural layers
│   ├── indexer/           # parsers, resolvers, enrichers, embedders, writer
│   ├── adapters/          # git, lsp, otel (phase 3)
│   └── eval/              # golden-set runner, metrics, reports
├── tests/
│   ├── unit/              # fast, in-process
│   ├── integration/       # cross-process, fixture-repo
│   ├── pbt/               # hypothesis property-based tests (CP-1..CP-12)
│   ├── eval/              # slow nightly eval
│   └── fixtures/repos/    # mini-ts-app, mini-py-svc, mini-go-svc
├── docs/
└── .cognis/               # gitignored runtime dir (created by `cognis-cli init`)

Current scope

| Area | Status |
| --- | --- |
| Indexer (TS / Python / Go) | Implemented |
| CSAR diffusion retrieval (Personalized PageRank) | Implemented (primary engine) |
| Retrieval (lexical, semantic, structural) | Implemented (CSAR seed/fallback layers) |
| MCP server (8 tools, stdio) | Implemented |
| CLI: init, bootstrap, paths, mcp-config, index, eval, health, mcp-conformance | Implemented |
| VS Code / Cursor extension | Implemented (apps/cognis-vscode) |
| CLI: up, down | Docker Compose wrappers (deploy/compose.yaml) |
| CLI: profile | Stub; use make bench for latency tests |
| LSP resolver | Detection only; heuristic fallback for edges |
| PyPI publish | Not yet; install from source |

Full release notes: docs/release-notes-v0.3.0.md.

Quick start

Requirements: Python ≥ 3.11 and Git. That's it.

Step 1 — Install (all platforms)

git clone https://github.com/buimanhtoan-it/cognis
cd cognis
python -m venv .venv

Activate the virtual environment:

| Platform | Command |
| --- | --- |
| macOS / Linux | source .venv/bin/activate |
| Windows PowerShell | .\.venv\Scripts\Activate.ps1 |

Install the backend (one command):

python -m pip install -e ".[indexer,embed-local,vector,tokenizers,mcp]"

Step 2 — Pick how you want to use it

Option A · Editor (VS Code / Cursor) — recommended

python scripts/setup_extension.py --package

Then in your editor: install the generated .vsix, select the same Python interpreter you used above, open your project, and run the command Cognis: Set Up for AI. The extension writes the MCP config and starts indexing for you. Reload the editor if the tools don't appear right away.

Option B · CLI / terminal

cd /path/to/your/project
cognis-cli bootstrap .      # init + index + health, in one command
cognis-mcpd                 # start the MCP server (stdio)

That's it — your repo is indexed and the MCP tools are live. Point any MCP-compatible client at cognis-mcpd (see docs/mcp-client-config.md).
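Many MCP clients register a stdio server through a small JSON file. The sketch below prints one such entry for cognis-mcpd. The mcpServers key and the entry name cognis are assumptions based on a common client convention, not something this README specifies, so treat docs/mcp-client-config.md as the authoritative reference for your client.

```python
import json

# "mcpServers" is the key used by several MCP clients; the entry name "cognis"
# is an arbitrary label chosen for this sketch.
config = {
    "mcpServers": {
        "cognis": {
            "command": "cognis-mcpd",  # stdio server started in Option B
            "args": [],
        }
    }
}
print(json.dumps(config, indent=2))
```

Some clients use a different top-level key or file location, so check your client's documentation before copying the output.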

If cognis-cli / cognis-mcpd aren't on your PATH, use the module form: python -m cognis.cli.main bootstrap . and python -m cognis_mcpd.main.

Re-index from scratch

Wiped state or stale index? Reset and rebuild while keeping your config:

cognis-cli index --clear .

In the editor, use the Clear & Re-index button in the Cognis panel.

Next steps

  • Getting started — fresh machine → working editor setup
  • Quickstart — CLI-focused walkthrough and first query
  • Contributor setup: make install-dev (or .\scripts\setup-dev.ps1 / ./scripts/setup-dev.sh / invoke install-dev)

Development workflow

| Command | What it runs |
| --- | --- |
| make lint | ruff format --check + ruff check |
| make typecheck | mypy (strict on packages/core) |
| make test | pytest unit + property tests |
| make bench | pytest --benchmark-only |
| make eval | golden-set runner (cognis-cli eval) |

tasks.py exposes the same recipes for environments without make (Windows in particular):

invoke lint typecheck test

Platform notes

  • Python >= 3.11.
  • Tree-sitter grammars are vendored or downloaded as part of the dev bootstrap; CI caches them.

sqlite-vec extension

cognis uses sqlite-vec for the semantic retrieval layer (KNN over symbol_vec). The Python wheel pulls in a prebuilt native extension; installation is one command on all three supported platforms:

pip install cognis-engine[vector]
# or directly:
pip install sqlite-vec

| Platform | Notes |
| --- | --- |
| Linux (x86_64, aarch64) | Prebuilt wheel ships with the extension .so. Requires glibc >= 2.17. No additional steps. |
| macOS (x86_64, arm64) | Prebuilt wheel ships with the extension .dylib. Requires macOS 11+. No additional steps. |
| Windows (x86_64) | Prebuilt wheel ships with vec0.dll. Python must be built with extension-loading enabled (the official python.org installer is). If you use a stripped Python build (some corporate distributions), install a stock CPython and retry. |

When the extension cannot be loaded for any reason, cognis falls back to a plain symbol_vec(symbol_id PK, embedding BLOB) table. The indexer still writes embeddings; only KNN queries are unavailable until the extension is restored. cognis-cli health reports the active backend.
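The fallback behaviour described above could look roughly like the following. This is a sketch under assumptions, not the cognis source: the function name open_vector_store, the dim parameter, and the exact vec0 column declaration are hypothetical. Only the symbol_vec table name and the plain (symbol_id, embedding BLOB) layout come from the text.

```python
import sqlite3


def open_vector_store(path=":memory:", dim=4):
    """Open a database and create symbol_vec, preferring the sqlite-vec backend.

    Returns (connection, backend) where backend is "sqlite-vec" or "plain".
    """
    conn = sqlite3.connect(path)
    try:
        import sqlite_vec  # optional dependency

        conn.enable_load_extension(True)
        sqlite_vec.load(conn)
        conn.enable_load_extension(False)
        conn.execute(
            "CREATE VIRTUAL TABLE IF NOT EXISTS symbol_vec "
            f"USING vec0(symbol_id INTEGER PRIMARY KEY, embedding float[{dim}])"
        )
        return conn, "sqlite-vec"
    except (ImportError, AttributeError, sqlite3.OperationalError):
        # Extension missing or extension loading disabled in this Python build:
        # keep storing embeddings as raw blobs so the indexer can still write.
        conn.execute(
            "CREATE TABLE IF NOT EXISTS symbol_vec "
            "(symbol_id INTEGER PRIMARY KEY, embedding BLOB)"
        )
        return conn, "plain"


conn, backend = open_vector_store()
print("vector backend:", backend)
```

Catching AttributeError covers Python builds compiled without extension loading, where enable_load_extension is absent; that is the stripped-build case mentioned in the Windows notes.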

To verify the extension is loaded:

python -c "import sqlite3, sqlite_vec; c=sqlite3.connect(':memory:'); c.enable_load_extension(True); sqlite_vec.load(c); print(c.execute('select vec_version()').fetchone())"

A successful run prints something like ('v0.1.6',).

Self-hosted deployment

For a Docker Compose deployment:

export WORKSPACE_HOST_PATH=/path/to/your/codebase
docker compose -f deploy/compose.yaml up -d

See docs/operations.md for init, indexing, health, and upgrades.

Security model in one screen

  • Every comment / docstring / PR body is treated as untrusted and tagged before reaching the LLM.
  • Secret-shaped strings (API keys, JWTs, PEM headers, password=) are scrubbed before indexing — originals are never persisted.
  • MCP tools have hard caps on depth, k, wall time, and concurrent requests; every call is logged to .cognis/audit.log with hashed args.
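To make the scrubbing step concrete, here is a minimal regex-based sketch. The patterns, the scrub function, and the [REDACTED] placeholder are illustrative assumptions; the actual rules cognis applies are documented in docs/security.md and are likely broader.

```python
import re

# Illustrative patterns only; a production scrubber would need a broader, tested set.
SECRET_PATTERNS = [
    # PEM block headers such as "-----BEGIN RSA PRIVATE KEY-----"
    re.compile(r"-----BEGIN [A-Z ]+-----"),
    # JWTs: three base64url segments, the first two starting with "eyJ"
    re.compile(r"\beyJ[A-Za-z0-9_-]+\.eyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+"),
    # key=value style credentials: password=..., api_key: ..., token=...
    re.compile(r"(?i)\b(password|passwd|api[_-]?key|token)\s*[=:]\s*\S+"),
]


def scrub(text, placeholder="[REDACTED]"):
    """Replace every secret-shaped substring with a placeholder."""
    for pattern in SECRET_PATTERNS:
        text = pattern.sub(placeholder, text)
    return text


sample = "connect(password=hunter2)  # jwt eyJhbGciOi.eyJzdWIiOi.c2ln"
print(scrub(sample))
```

Because the replacement happens before anything is written to the index, the original value never has to be stored, which matches the "originals are never persisted" guarantee above.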

Full threat model: docs/security.md.

License

Apache-2.0. See LICENSE.
