Hierarchical, scope-gated codebase indexing and persistent memory system for AI agents

These details have been verified by PyPI

Project links

GitHub Statistics

Maintainers

danieliser

These details have not been verified by PyPI

Development Status
- 4 - Beta
Intended Audience
- Developers
License
- OSI Approved :: MIT License
Programming Language
Topic
- Software Development :: Libraries

Project description

Tessera

Persistent codebase intelligence for autonomous AI agents. Tessera gives agents bottom-up file access and top-down code understanding — across every project they're authorized to touch, with security from the ground up.

The Problem

Persistent AI agents — orchestrators like AutoJack, task agents like OpenClaw — need to understand codebases the way a senior developer does. Not just "find this string in a file," but "what calls this function, across which projects, and what breaks if I change it?"

Today's agents burn context window and wall-clock time on repeated grep / find / cat cycles. They lose track of project structure between conversations. They can't safely delegate to sub-agents without leaking access to projects those agents shouldn't see. And they can't search documentation, config files, or assets alongside code.

What Tessera Does

Tessera indexes everything — code, documents, config files, media assets, binary files — into a structured, chunked, searchable database. It exposes that through 18 MCP tools that any agent can call. Responses come back in milliseconds, not seconds.

For orchestrator agents: Full system visibility. Register projects, group them into collections, search across all of them. Understand cross-project dependencies. Delegate scoped access to sub-agents via session tokens.

For task agents: Deep code intelligence within their authorized scope. Symbol lookup, reference tracing, impact analysis, document search — everything an IDE provides, but through tool calls.

For security: Deny-by-default scope gating. Sub-agents only see what the orchestrator explicitly grants. Credentials and secrets are blocked from indexing by locked security patterns that project config cannot negate. No ambient access, no scope creep.

Code Intelligence

Symbol search — Functions, classes, methods, hooks by name or pattern
Reference tracing — Call graphs, imports, inheritance chains
Impact analysis — "What breaks if I change this?" — traced N levels deep
File context — Complete structural overview of any file in one call
Cross-project references — Track where project A's exports are used in project B
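
As a rough illustration of what "traced N levels deep" can mean, the sketch below walks a reverse reference graph breadth-first up to a depth limit. The graph shape, the `impact` function signature, and the symbol names are assumptions made for this example; they are not Tessera's actual schema or tool API.

```python
from collections import deque


def impact(callers: dict[str, list[str]], symbol: str, max_depth: int = 2) -> dict[str, int]:
    """Return {affected_symbol: depth} for symbols that depend on `symbol`.

    `callers` maps a symbol to the symbols that reference it directly
    (a hypothetical in-memory stand-in for a reference table).
    """
    affected: dict[str, int] = {}
    queue = deque([(symbol, 0)])
    while queue:
        current, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand beyond the requested depth
        for caller in callers.get(current, []):
            if caller != symbol and caller not in affected:
                affected[caller] = depth + 1
                queue.append((caller, depth + 1))
    return affected


# Hypothetical reference graph: key is referenced by each value in its list.
graph = {
    "parse_price": ["render_cart", "export_csv"],
    "render_cart": ["checkout_page"],
    "checkout_page": ["router"],
}
result = impact(graph, "parse_price", max_depth=2)
```

With `max_depth=2`, `router` is left out because it sits three hops from `parse_price`.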

Document & Text Search

Chunked indexing — Files are split into focused, searchable chunks with metadata (by header, key path, or line group) — not stored as monolithic blobs
Code + docs unified — Query across everything, or filter by source type (code, asset, document)
Structural formats — PDF, Markdown (break-point scoring with distance decay), YAML/JSON (key-path chunking)
Markup — HTML/XML with tag stripping
Plaintext — .txt, .rst, .csv, .log, .ini, .cfg, .toml, config files, dotfiles
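
To make key-path chunking concrete, here is a minimal sketch that flattens parsed JSON into one chunk per leaf value, each labeled with its key path. The per-leaf granularity, the path notation, and the `key_path_chunks` name are assumptions for illustration; Tessera may group keys differently.

```python
import json


def key_path_chunks(data, prefix=""):
    """Yield (key_path, text) pairs, one per leaf value."""
    if isinstance(data, dict):
        for key, value in data.items():
            path = f"{prefix}.{key}" if prefix else str(key)
            yield from key_path_chunks(value, path)
    elif isinstance(data, list):
        for index, value in enumerate(data):
            yield from key_path_chunks(value, f"{prefix}[{index}]")
    else:
        # Leaf: keep the JSON rendering so strings, numbers, and booleans stay distinguishable.
        yield prefix, json.dumps(data)


config = json.loads('{"db": {"host": "localhost", "ports": [5432, 5433]}, "debug": false}')
chunks = list(key_path_chunks(config))
```

Each chunk carries enough context (`db.ports[1]`) to be matched by a search on either the key name or the value.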

Media & Binary File Indexing

Asset discovery — Images, videos, audio, fonts, and archives are automatically discovered and indexed
Metadata extraction — Filename, path, MIME type, file size, and image dimensions (PNG, JPEG, GIF, BMP) — zero external dependencies
FTS5 searchable — Search for assets by name, category, format, or path components
Source type filtering — Filter search results to asset, code, or document via the source_type parameter
SVG dual-indexing — SVGs indexed as both searchable XML documents and image assets

Multi-Project Federation

Project collections — Group related projects (e.g., a plugin ecosystem) and query across them
Scope-gated access — Session tokens control what each agent can see. Orchestrators create scoped tokens for sub-agents.
Search-time federation — Data stays at project level, merged at query time. No duplication.
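
One way to picture search-time federation combined with scope gating: each project keeps its own index, and a query fans out only to the projects a token allows before results are merged. Everything named below (`Hit`, `federated_search`, the callable-per-project shape, the raw score comparison) is a hypothetical simplification, not Tessera's implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    project: str
    path: str
    score: float


def federated_search(indexes, allowed_projects, query, limit=10):
    """Query each authorized project's index and merge hits by score.

    `indexes` maps project name -> callable(query) -> list[Hit].
    Projects outside `allowed_projects` are never queried (deny by default).
    """
    merged = []
    for project, search in indexes.items():
        if project not in allowed_projects:
            continue
        merged.extend(search(query))
    merged.sort(key=lambda hit: hit.score, reverse=True)
    return merged[:limit]


# Hypothetical per-project indexes returning canned hits.
indexes = {
    "core": lambda q: [Hit("core", "src/auth.py", 0.9)],
    "plugin-a": lambda q: [Hit("plugin-a", "hooks.php", 0.7)],
    "billing": lambda q: [Hit("billing", "keys.py", 0.95)],
}
hits = federated_search(indexes, allowed_projects={"core", "plugin-a"}, query="login")
```

The `billing` hit has the highest score but never appears, because the scope check happens before any index is touched.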

Security

Deny-by-default — No access without a valid session token
.tesseraignore — Per-project ignore config with .gitignore syntax
Two-tier ignore system — Security-critical patterns (.env*, *.pem, *credentials*) are locked and cannot be overridden by project config
trusted field — Search results from code are marked trusted; document content is marked untrusted so agents can handle prompt injection risk
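
The two-tier ignore idea can be sketched as follows: locked patterns are checked first and always win, and only then are project patterns (including `!` negations) applied. This is a simplified model that matches file names only and does not reproduce full .gitignore semantics. The function name and evaluation order are assumptions; only the three locked patterns come from the list above.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Tier 1: security-critical patterns that project config cannot override.
LOCKED_PATTERNS = (".env*", "*.pem", "*credentials*")


def is_ignored(path: str, project_patterns: list[str]) -> bool:
    """Return True if `path` must be skipped by the indexer."""
    name = PurePosixPath(path).name
    if any(fnmatchcase(name, pattern) for pattern in LOCKED_PATTERNS):
        return True  # no project rule, negated or not, is consulted
    # Tier 2: project patterns, last match wins, "!" re-includes a file.
    ignored = False
    for pattern in project_patterns:
        negated = pattern.startswith("!")
        body = pattern[1:] if negated else pattern
        if fnmatchcase(name, body):
            ignored = not negated
    return ignored
```

A project rule such as `!.env*` has no effect here, while `!keep.log` can still re-include a file that an ordinary project rule excluded.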

Infrastructure

Fully embedded — SQLite + FAISS. No Docker, no daemons, no external servers
Incremental indexing — Git-aware, only re-indexes changed files
Schema migration — Versioned database schema with automatic upgrades
Drift adapter — Switch embedding models without re-indexing (Orthogonal Procrustes)
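
For readers unfamiliar with the technique named in the drift adapter bullet, the standard Orthogonal Procrustes problem is stated below. The symbols are assumptions for illustration: the rows of $A$ and $B$ are embeddings of the same $n$ texts under two models, both of dimension $d$, and $W$ rotates vectors from the space of $A$ into the space of $B$. How Tessera picks training pairs or handles models with different dimensions is not described in this README.

```latex
W^{\star} = \arg\min_{W \in \mathbb{R}^{d \times d},\; W^{\top} W = I} \lVert A W - B \rVert_F,
\qquad
A^{\top} B = U \Sigma V^{\top}
\;\Longrightarrow\;
W^{\star} = U V^{\top}
```

Because $W^{\star}$ is orthogonal, it preserves norms and inner products, so similarity rankings within one space are unchanged by the mapping.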

Supported Languages

PHP, TypeScript, JavaScript, Python, Swift — via tree-sitter grammars.

MCP Tools (18)

Search & Navigation

| Tool | Purpose |
| --- | --- |
| `search` | Hybrid keyword + semantic search across code, documents, and assets (filterable by `source_type`) |
| `doc_search_tool` | Document-only search (filterable by format or `source_type`) |
| `symbols` | Look up functions, classes, methods by name/pattern/kind |
| `references` | Find all references to a symbol (calls, imports, extends) |
| `file_context` | Complete context for a file (symbols, refs, structure) |
| `impact` | Trace downstream impact of changing a symbol |
| `cross_refs` | Cross-project references to a symbol |
| `collection_map` | Overview of projects in a collection with stats |
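
For orientation, this is roughly what a client sends when it invokes one of these tools. MCP tool invocations are JSON-RPC 2.0 requests with method `tools/call`. The tool name `search` and the `source_type` filter come from the table above; the `query` argument name and its value are assumptions, since the tool's parameter schema is not listed here.

```python
import json


def tool_call(request_id: int, tool: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    })


# "query" is a hypothetical argument name; "source_type" is documented above.
message = tool_call(1, "search", {"query": "verify_token", "source_type": "code"})
```

In practice an MCP client library builds this message for you; the sketch only shows the shape on the wire.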

Administration

| Tool | Purpose |
| --- | --- |
| `register_project` | Register a project for indexing |
| `reindex` | Trigger full or incremental re-index |
| `status` | Project indexing status and health |
| `drift_train` | Train embedding drift adapter for model migration |

Access Control

| Tool | Purpose |
| --- | --- |
| `create_scope_tool` | Create scoped session tokens for sub-agents |
| `revoke_scope_tool` | Revoke agent session tokens |
| `create_collection_tool` | Create a project collection |
| `add_to_collection_tool` | Add a project to a collection |
| `list_collections_tool` | List all collections |
| `delete_collection_tool` | Delete a collection |

Quick Start

Requirements

Python 3.11+
uv (recommended) or pip

Install

git clone https://github.com/danieliser/tessera.git
cd tessera
uv sync

Run as MCP Server

Add to your .mcp.json:

{
  "mcpServers": {
    "tessera": {
      "command": "uv",
      "args": [
        "--directory", "/path/to/tessera",
        "run", "python", "-m", "tessera", "serve"
      ]
    }
  }
}

Lock to a specific project (single-project mode):

uv run python -m tessera serve --project /path/to/your/project

Embedding Setup (Optional)

Tessera works without embeddings (keyword search only via FTS5). For semantic search, point it at any local OpenAI-compatible embedding endpoint. The embedding dimension is auto-detected — no configuration needed.

Recommended: LM Studio with nomic-embed-text, or any other embedding model served at /v1/embeddings.
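
Auto-detecting the dimension can be as simple as embedding one probe string and measuring the vector that comes back. The sketch below uses the OpenAI-style request and response shape for `/v1/embeddings`; the function names, the probe text, and the base URL in the usage note are assumptions, and this is not necessarily how Tessera performs detection.

```python
import json
from typing import Callable
from urllib import request


def http_post_json(url: str, payload: dict) -> dict:
    """POST a JSON payload and decode the JSON response (standard library only)."""
    req = request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req, timeout=10) as resp:
        return json.load(resp)


def detect_dimension(
    base_url: str,
    model: str,
    post: Callable[[str, dict], dict] = http_post_json,
) -> int:
    """Embed a single probe string and return the length of its vector."""
    body = post(f"{base_url}/v1/embeddings", {"model": model, "input": ["dimension probe"]})
    return len(body["data"][0]["embedding"])
```

With a local server running, a call might look like `detect_dimension("http://localhost:1234", "nomic-embed-text")`, where the host and port are placeholders for wherever your endpoint listens. The `post` parameter exists so the function can be exercised without a network.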

Run Tests

uv run pytest tests/ -v

Architecture

MCP Server (stdio)
├── Scope Validator (session-based, deny-by-default)
├── Query Router (project / collection / global)
│   ├── Search (FTS5 keyword + FAISS semantic + RRF merge)
│   ├── Symbols / References / Impact (SQLite graph)
│   └── Document Search (source_type filtering)
├── Per-Project Indexes
│   ├── SQLite (symbols, references, edges, files, chunk_meta)
│   └── FAISS (vector embeddings)
├── Global SQLite (~/.tessera/global.db)
│   ├── projects, collections, sessions
│   └── indexing_jobs
└── Indexer Pipeline
    ├── Tree-sitter parser (PHP, TS, JS, Python, Swift)
    ├── AST-aware code chunking
    ├── Document extraction (PDF, MD, YAML, JSON, HTML, XML, plaintext)
    ├── Asset metadata extraction (images, video, audio, fonts, archives)
    └── Ignore filter (.tesseraignore, two-tier security)
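
The "RRF merge" step in the search branch presumably refers to Reciprocal Rank Fusion, which combines ranked lists using only rank positions. Below is a minimal sketch under that assumption. The constant `k = 60` is a commonly used default, not a value taken from Tessera, and the file names are invented.

```python
def rrf_merge(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse ranked result lists: each item scores sum(1 / (k + rank))."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for a stable order.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


keyword_hits = ["auth.py", "login.php", "README.md"]    # e.g. from keyword search
semantic_hits = ["session.ts", "auth.py", "login.php"]  # e.g. from vector search
merged = rrf_merge([keyword_hits, semantic_hits])
```

Items that appear in both lists rise to the top even when neither list ranks them first, and the raw keyword and vector scores never need to be put on a common scale.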

Design Principles

No external dependencies at runtime — SQLite + FAISS, fully embedded
Tree-sitter for deterministic parsing — no LLM-extracted graphs, no hallucinated edges
Chunked everything — every file is split into focused, searchable units with structural metadata
Security-first scope model — deny-by-default, session-scoped, un-negatable credential protection
Federation over duplication — data stays at project level, merged at query time

Project Status

v0.6.0 — Hybrid search with semantic snippet scoring, PPR graph ranking, collapsed ancestry context, and stale index detection.

| Phase | Status | What |
| --- | --- | --- |
| 1 | Done | Single-project indexer + scoped MCP server |
| 2 | Done | Incremental indexing + persistence |
| 3 | Done | Collection federation + cross-project refs |
| 4 | Done | Document indexing + drift adapter + ignore config + text formats |
| 4.5 | Done | Media/binary file metadata catalog |
| 5 | Done | PPR graph ranking + semantic snippet scoring |
| 6 | Planned | Always-on file watcher |
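
The status notes mention PPR graph ranking. Assuming PPR stands for Personalized PageRank (the README does not expand the acronym), the sketch below shows the general idea with plain power iteration: rank flows along graph edges, and the random restart always returns to a set of seed symbols. The edge direction, damping factor, iteration count, and symbol names are all assumptions for illustration.

```python
def personalized_pagerank(edges, seeds, damping=0.85, iterations=50):
    """Power-iteration Personalized PageRank over directed (src, dst) edges."""
    seeds = set(seeds)
    nodes = sorted({node for edge in edges for node in edge} | seeds)
    out_links = {node: [] for node in nodes}
    for src, dst in edges:
        out_links[src].append(dst)
    restart = {node: (1.0 / len(seeds) if node in seeds else 0.0) for node in nodes}
    rank = dict(restart)
    for _ in range(iterations):
        nxt = {node: (1.0 - damping) * restart[node] for node in nodes}
        for node in nodes:
            targets = out_links[node]
            if targets:
                share = damping * rank[node] / len(targets)
                for target in targets:
                    nxt[target] += share
            else:
                # Dangling node: hand its mass back to the seeds.
                for seed in seeds:
                    nxt[seed] += damping * rank[node] / len(seeds)
        rank = nxt
    return rank


# Hypothetical call graph: (caller, callee).
edges = [
    ("search", "tokenize"),
    ("search", "rank"),
    ("rank", "score"),
    ("unrelated", "helper"),
]
ranks = personalized_pagerank(edges, seeds={"search"})
```

Symbols reachable from the seed receive rank in proportion to how closely they are connected to it, while the disconnected `unrelated` and `helper` nodes stay at zero.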

License

MIT

Release history

0.12.1

Mar 18, 2026

0.10.1

Mar 8, 2026

0.10.0

Mar 8, 2026

0.9.0

Mar 7, 2026

0.8.0

Mar 3, 2026

0.7.2

Mar 3, 2026

This version

0.7.0

Mar 3, 2026

Download files

Source Distribution

tessera_idx-0.7.0.tar.gz (576.0 kB view details)

Uploaded Mar 3, 2026 Source

Built Distribution

tessera_idx-0.7.0-py3-none-any.whl (102.7 kB view details)

Uploaded Mar 3, 2026 Python 3

File details

Details for the file tessera_idx-0.7.0.tar.gz.

File metadata

Download URL: tessera_idx-0.7.0.tar.gz
Upload date: Mar 3, 2026
Size: 576.0 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for tessera_idx-0.7.0.tar.gz
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | `817eed5cc24a4db417da007fb2ea4bbbe8b1261f43ed39f5f1b903ac95d65606` |
| MD5 | `1be470564ab9a5d7a72f0c45daa8aa9e` |
| BLAKE2b-256 | `93f79dd1cc466a68d9764a0965893a7b5529cefb3a5bfe7dc4077a85ee9a2b7c` |
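
To check a downloaded file against a published SHA256 digest, something along these lines works with the standard library alone. The helper name is hypothetical, and the demo uses the well-known SHA-256 test vector for `abc` rather than a real release file.

```python
import hashlib
import hmac
import io
from typing import BinaryIO


def sha256_matches(stream: BinaryIO, expected_hex: str) -> bool:
    """Hash a binary stream in blocks and compare with the expected hex digest."""
    digest = hashlib.sha256()
    for block in iter(lambda: stream.read(65536), b""):
        digest.update(block)
    return hmac.compare_digest(digest.hexdigest(), expected_hex.strip().lower())


# Demo with the standard SHA-256 test vector for b"abc".
ABC_SHA256 = "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad"
ok = sha256_matches(io.BytesIO(b"abc"), ABC_SHA256)
```

For a real check, open the downloaded archive with `open(path, "rb")` and pass the SHA256 value from the table above as `expected_hex`.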

Provenance

The following attestation bundles were made for tessera_idx-0.7.0.tar.gz:

Publisher: publish.yml on danieliser/tessera

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: tessera_idx-0.7.0.tar.gz
- Subject digest: 817eed5cc24a4db417da007fb2ea4bbbe8b1261f43ed39f5f1b903ac95d65606
- Sigstore transparency entry: 1019569190
- Sigstore integration time: Mar 3, 2026
Source repository:
- Permalink: danieliser/tessera@10751c95f8beef05fb9fb34a905f6cc791987420
- Branch / Tag: refs/tags/v0.7.1
- Owner: https://github.com/danieliser
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@10751c95f8beef05fb9fb34a905f6cc791987420
- Trigger Event: push

File details

Details for the file tessera_idx-0.7.0-py3-none-any.whl.

File metadata

Download URL: tessera_idx-0.7.0-py3-none-any.whl
Upload date: Mar 3, 2026
Size: 102.7 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for tessera_idx-0.7.0-py3-none-any.whl
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | `e507b2fa72ad449d464b5477fcdc82b91bc024a0c3878db5245dbe8f1e7014bd` |
| MD5 | `7bb13141dd6cfc6ae1bae5a0d438f8af` |
| BLAKE2b-256 | `2f167513d8abf73bd3f50e926c465e74a2eac0440046b6820bcd02e01b0906ba` |

Provenance

The following attestation bundles were made for tessera_idx-0.7.0-py3-none-any.whl:

Publisher: publish.yml on danieliser/tessera

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: tessera_idx-0.7.0-py3-none-any.whl
- Subject digest: e507b2fa72ad449d464b5477fcdc82b91bc024a0c3878db5245dbe8f1e7014bd
- Sigstore transparency entry: 1019569227
- Sigstore integration time: Mar 3, 2026
Source repository:
- Permalink: danieliser/tessera@10751c95f8beef05fb9fb34a905f6cc791987420
- Branch / Tag: refs/tags/v0.7.1
- Owner: https://github.com/danieliser
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@10751c95f8beef05fb9fb34a905f6cc791987420
- Trigger Event: push

tessera-idx 0.7.0
