
codetex-mcp

A commit-aware code context manager for LLMs. Indexes Git repositories into a multi-tier knowledge hierarchy — repo overviews, file summaries, and symbol details — stored in SQLite with vector search. Serves context to LLM clients via the Model Context Protocol (MCP) or a local CLI.

What It Does

codetex builds a structured, searchable index of your codebase that LLMs can query on demand:

  • Tier 1 — Repo Overview: Purpose, architecture, directory structure, key technologies, entry points
  • Tier 2 — File Summaries: Per-file purpose, public interfaces, dependencies, roles
  • Tier 3 — Symbol Details: Function/class signatures, parameters, return types, call relationships

Summaries are generated by an LLM (Anthropic Claude). Embeddings are computed locally with sentence-transformers for semantic search. Everything is stored in a single SQLite database with sqlite-vec for vector queries.

Incremental sync means only changed files are re-analyzed when you update your code.
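Conceptually, the change detection behind incremental sync can be sketched with git diff (an illustrative sketch only; codetex's actual internals may differ):

```python
import subprocess

def changed_files(repo_path: str, last_indexed_commit: str) -> list[str]:
    """Return paths changed between the last indexed commit and HEAD.

    Illustrative sketch: only these paths would need re-analysis.
    """
    result = subprocess.run(
        ["git", "diff", "--name-only", last_indexed_commit, "HEAD"],
        cwd=repo_path,
        capture_output=True,
        text=True,
        check=True,
    )
    return [line for line in result.stdout.splitlines() if line]
```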

Requirements

Installation

# With pip
pip install codetex-mcp

# With uv (recommended)
uv tool install codetex-mcp

Quick Start

1. Set your Anthropic API key

# Via environment variable
export ANTHROPIC_API_KEY=sk-ant-...

# Or via config
codetex config set llm.api_key sk-ant-...

2. Add a repository

# Local repo
codetex add /path/to/your/project

# Remote repo (clones to ~/.codetex/repos/)
codetex add https://github.com/user/repo.git

3. Index it

# Preview what indexing will cost (no API calls)
codetex index my-project --dry-run

# Build the full index
codetex index my-project

4. Query your codebase

# Repo overview (Tier 1)
codetex context my-project

# File summary (Tier 2)
codetex context my-project --file src/auth/login.py

# Symbol detail (Tier 3)
codetex context my-project --symbol authenticate_user

# Semantic search
codetex context my-project --query "how is authentication implemented?"

5. Keep it up to date

# Incremental sync — only re-analyzes changed files
codetex sync my-project

MCP Server Setup

The MCP server lets LLM clients (such as Claude Code, Cursor, and Windsurf) query your indexed codebases directly.

Claude Code

Add to your Claude Code MCP settings (~/.claude/claude_desktop_config.json):

{
  "mcpServers": {
    "codetex": {
      "command": "codetex",
      "args": ["serve"],
      "env": {
        "ANTHROPIC_API_KEY": "sk-ant-..."
      }
    }
  }
}

If you installed with uv tool, use the full path:

{
  "mcpServers": {
    "codetex": {
      "command": "/path/to/codetex",
      "args": ["serve"],
      "env": {
        "ANTHROPIC_API_KEY": "sk-ant-..."
      }
    }
  }
}

Find the path by running which codetex or uv tool dir.

Other MCP Clients

Any client that supports MCP stdio transport can use codetex. The server command is:

codetex serve
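Most stdio-based MCP clients accept a server entry shaped like the Claude Code example above. The exact key names vary by client, so treat this as a template rather than exact syntax:

```json
{
  "command": "codetex",
  "args": ["serve"],
  "env": {
    "ANTHROPIC_API_KEY": "sk-ant-..."
  }
}
```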

Available MCP Tools

Once connected, the LLM has access to 7 tools:

Tool               Description
get_repo_overview  Tier 1 repo overview (architecture, technologies, entry points)
get_file_context   Tier 2 file summary with symbol list
get_symbol_detail  Tier 3 full symbol detail (signature, params, relationships)
search_context     Semantic search across all indexed context
get_repo_status    Index status (staleness, file/symbol counts, last indexed)
sync_repo          Trigger incremental sync from within the LLM session
list_repos         List all registered repositories
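As a rough illustration, a call to one of these tools looks like this on the wire (a JSON-RPC 2.0 tools/call request, per the MCP specification; the argument name shown here is an assumption, not codetex's documented schema):

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "get_repo_overview",
    "arguments": { "repo_name": "my-project" }
  }
}
```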

CLI Reference

codetex add <target>

Register a git repository. Accepts a local path or remote URL.

codetex add .                                    # Current directory
codetex add /path/to/repo                        # Local path
codetex add https://github.com/user/repo.git     # Remote (clones locally)
codetex add git@github.com:user/repo.git         # SSH remote

codetex index <repo-name>

Build a full index for a registered repository.

codetex index my-project                # Full index
codetex index my-project --dry-run      # Preview (files, symbols, estimated LLM calls/tokens)
codetex index my-project --path src/    # Index only files under src/

codetex sync <repo-name>

Incremental sync to the current HEAD. Only files changed since the last indexed commit are re-analyzed.

codetex sync my-project                 # Sync changes
codetex sync my-project --dry-run       # Preview what would change
codetex sync my-project --path src/     # Sync only changes under src/

codetex context <repo-name>

Query indexed context at any tier.

codetex context my-project                              # Tier 1: repo overview
codetex context my-project --file src/main.py           # Tier 2: file summary
codetex context my-project --symbol MyClass             # Tier 3: symbol detail
codetex context my-project --query "error handling"     # Semantic search

codetex status <repo-name>

Show index status: indexed commit, current HEAD, staleness, file/symbol counts, token usage.

codetex list

List all registered repositories with their index status.

codetex config show

Display the current configuration.

codetex config set <key> <value>

Update a configuration value.

codetex config set llm.api_key sk-ant-...
codetex config set llm.model claude-sonnet-4-5-20250929
codetex config set indexing.max_file_size_kb 1024
codetex config set indexing.max_concurrent_llm_calls 10

Configuration

Configuration is loaded in layers (last wins):

  1. Defaults — sensible out-of-the-box values
  2. TOML file — ~/.codetex/config.toml
  3. Environment variables — override everything

Config file

# ~/.codetex/config.toml

[storage]
data_dir = "~/.codetex"                  # Base directory for DB and cloned repos

[llm]
provider = "anthropic"                   # LLM provider (currently: anthropic)
model = "claude-sonnet-4-5-20250929"     # Model used for summarization
api_key = "sk-ant-..."                   # Anthropic API key

[indexing]
max_file_size_kb = 512                   # Skip files larger than this
max_concurrent_llm_calls = 5             # Parallel LLM requests during indexing
tier1_rebuild_threshold = 0.10           # Rebuild repo overview if >=10% of files changed on sync

[embedding]
model = "all-MiniLM-L6-v2"              # Sentence-transformers model for embeddings
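The tier1_rebuild_threshold setting amounts to a simple ratio check (an illustrative sketch, not codetex's actual code):

```python
def should_rebuild_tier1(changed_files: int, total_files: int,
                         threshold: float = 0.10) -> bool:
    """Rebuild the Tier 1 repo overview when the fraction of changed
    files meets or exceeds the threshold. Illustrative sketch only."""
    if total_files == 0:
        return False
    return changed_files / total_files >= threshold
```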

Environment variables

Variable                    Maps to                            Example
ANTHROPIC_API_KEY           llm.api_key                        sk-ant-...
CODETEX_DATA_DIR            storage.data_dir                   /custom/path
CODETEX_LLM_PROVIDER        llm.provider                       anthropic
CODETEX_LLM_MODEL           llm.model                          claude-sonnet-4-5-20250929
CODETEX_MAX_FILE_SIZE_KB    indexing.max_file_size_kb          1024
CODETEX_MAX_CONCURRENT_LLM  indexing.max_concurrent_llm_calls  10
CODETEX_TIER1_THRESHOLD     indexing.tier1_rebuild_threshold   0.15
CODETEX_EMBEDDING_MODEL     embedding.model                    all-MiniLM-L6-v2

File Exclusion

Files are filtered through multiple stages:

  1. Default excludes — node_modules/, __pycache__/, .git/, dist/, build/, .venv/, *.lock, *.min.js, *.pyc, *.so, etc.
  2. .gitignore — standard gitignore rules from your repo
  3. .codetexignore — same syntax as .gitignore, placed in your repo root. Use !pattern to un-ignore files
  4. File size — files exceeding max_file_size_kb are skipped
  5. Binary detection — files with null bytes in the first 8 KB are skipped
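Stages 4 and 5 are straightforward to sketch (illustrative only; codetex's actual filter may differ in details):

```python
from pathlib import Path

def is_excluded(path: Path, max_file_size_kb: int = 512) -> bool:
    """Skip files that are too large or that look binary.

    Illustrative sketch of the size and null-byte checks.
    """
    if path.stat().st_size > max_file_size_kb * 1024:
        return True  # stage 4: exceeds the configured size limit
    with open(path, "rb") as f:
        head = f.read(8192)  # stage 5: null byte in the first 8 KB
    return b"\x00" in head
```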

Language Support

Language    Tree-sitter (full AST)  Fallback (regex)
Python      Yes                     Yes
JavaScript  Yes                     Yes
TypeScript  Yes                     Yes
Go          Yes                     Yes
Rust        Yes                     Yes
Java        Yes                     Yes
Ruby        Yes                     Yes
C/C++       Yes                     Yes
All others  No                      Yes

Tree-sitter grammars for all 8 languages are installed automatically. For other languages, the fallback parser uses regex patterns to extract functions, classes, and imports.
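As a rough illustration of the regex fallback idea (these are not codetex's actual patterns, which are per-language and more thorough), extracting top-level definitions from Python-like source might look like:

```python
import re

# Illustrative patterns only: top-level functions, classes, and imports.
DEF_RE = re.compile(r"^(?:async\s+)?def\s+(\w+)", re.MULTILINE)
CLASS_RE = re.compile(r"^class\s+(\w+)", re.MULTILINE)
IMPORT_RE = re.compile(r"^(?:from\s+[\w.]+\s+)?import\s+([\w.]+)", re.MULTILINE)

def extract_symbols(source: str) -> dict[str, list[str]]:
    """Pull out symbol names without building a full AST."""
    return {
        "functions": DEF_RE.findall(source),
        "classes": CLASS_RE.findall(source),
        "imports": IMPORT_RE.findall(source),
    }
```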

Architecture

CLI (Typer) ──┐
              ├──▶ Core Services (Indexer, Syncer, ContextStore, SearchEngine)
MCP (FastMCP)─┘         │              │              │
                    Analysis        LLM Provider    Embeddings
                 (tree-sitter +    (Anthropic)    (sentence-transformers)
                  regex fallback)       │              │
                         └──────────────┴──────────────┘
                                        │
                                   SQLite + sqlite-vec
  • Two entry points (CLI and MCP server) share the same core service layer
  • No DI framework — services are wired via a create_app() factory
  • All core services are async — CLI bridges with asyncio.run()
  • Embeddings are local — no external API calls for vector search (model auto-downloads on first run, ~90 MB)
  • Single SQLite database — 6 main tables + 2 vector tables (384-dimensional embeddings)
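The vector search itself reduces to nearest-neighbor lookup over those 384-dimensional embeddings. sqlite-vec performs this inside SQLite; the underlying idea is just cosine similarity, sketched here in pure Python for illustration:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 1.0 means identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k(query_vec: list[float], rows: list[tuple[str, list[float]]], k: int = 3) -> list[str]:
    """rows: (id, embedding) pairs; return ids of the k most similar rows."""
    ranked = sorted(rows, key=lambda r: cosine_similarity(query_vec, r[1]), reverse=True)
    return [row_id for row_id, _ in ranked[:k]]
```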

Development

git clone https://github.com/mrosata/codetex-mcp.git
cd codetex-mcp

# Install dependencies (including dev)
uv sync

# Run tests
uv run pytest

# Run tests with coverage
uv run pytest --cov=codetex_mcp

# Lint and format
uv run ruff check src/ tests/
uv run ruff format src/ tests/

# Type check
uv run mypy src/

Releasing

Releases are automated via GitHub Actions and python-semantic-release. Version bumps are driven by conventional commit messages on main.

Commit message format

Prefix                                Effect                      Example
fix: ...                              Patch bump (0.1.0 → 0.1.1)  fix: handle missing gitignore
feat: ...                             Minor bump (0.1.0 → 0.2.0)  feat: add Ruby tree-sitter support
feat!: ...                            Major bump (0.1.0 → 1.0.0)  feat!: redesign context API
docs:, chore:, ci:, test:, refactor:  No release                  docs: update README

A BREAKING CHANGE: line in the commit body also triggers a major bump.
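The bump rules above amount to a small classifier over commit messages (an illustrative sketch; python-semantic-release implements the real logic):

```python
def bump_kind(message: str) -> str:
    """Classify a conventional commit as major/minor/patch/none.

    Illustrative sketch of the rules in the table above.
    """
    header, _, body = message.partition("\n")
    # A "!" after the type, or a BREAKING CHANGE footer, means major.
    if "BREAKING CHANGE:" in body or header.split(":")[0].endswith("!"):
        return "major"
    if header.startswith(("feat:", "feat(")):
        return "minor"
    if header.startswith(("fix:", "fix(")):
        return "patch"
    return "none"
```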

How it works

  1. Push or merge a PR to main
  2. CI runs lint, type check, and tests
  3. The release workflow analyzes commits since the last tag
  4. If a version bump is needed, it:
    • Updates the version in pyproject.toml
    • Creates a git tag (e.g., v0.2.0)
    • Publishes a GitHub Release with a changelog
    • Builds and publishes the package to PyPI

Manual release (not recommended)

If you need to release without the automation:

uv build
uv publish

License

MIT
