# deco-assaying

MCP server that performs tree-sitter-based source code analysis. Designed to feed structural information about a repo (symbols, imports, references, chunks, metrics) into a downstream consumer that maintains a knowledge base over many codebases.
## Run

Pick the deployment that matches your situation:

| Mode | Command | When to use |
|---|---|---|
| Daemon — pinned install | `uv tool install` | You'll run it across many sessions; want it on `$PATH`. |
| Daemon — ephemeral | `uvx` | One-off run; don't want anything left on disk. |
| Container | `docker run` from GHCR | Ops deployment, compose stack, or want filesystem isolation. |
| From source | `uv run` | Hacking on the server itself. |
### Prereqs

- uv-based modes need `uv` and `git`. uv ships a portable Python 3.13, so no system Python install is required.

  ```sh
  curl -LsSf https://astral.sh/uv/install.sh | sh
  ```

- Docker mode needs `docker` (or compatible). The image bundles Python 3.13 and git; nothing else is needed on the host.
### 1. Daemon — uv tool install (PyPI)

Installs the `deco-assaying` command on your `$PATH`, isolated in its own venv that uv manages.

```sh
uv tool install deco-assaying
deco-assaying   # starts the server
```

Update later with `uv tool upgrade deco-assaying`; remove with `uv tool uninstall deco-assaying`.
### 2. Daemon — uvx (no install)

uvx resolves the package into a temporary venv and runs the entry point in one shot. Nothing persists between runs.

```sh
uvx deco-assaying         # latest release
uvx deco-assaying@0.1.0   # pin a specific version
```

Good for kicking the tires or running on a CI box where you don't want to touch `~/.local/share/uv`.
### 3. Docker / GHCR

Pull and run the published multi-arch image (linux/amd64 + linux/arm64):

```sh
docker pull ghcr.io/garycoding/deco-assaying:latest
docker run --rm \
  -p 35832:35832 \
  -v deco-assaying-data:/data \
  ghcr.io/garycoding/deco-assaying:latest
```

Pin a specific version with a tag (`:0.1.0`, `:0.1`, or `:latest`); see the Releases page on GHCR for the available tags.

Or with compose (see `docker-compose.yml`, which pulls the image, mounts a named volume at `/data`, and restarts on failure):

```sh
docker compose up -d
```

The named volume `deco-assaying-data` persists job outputs across container restarts. To pass auth tokens for private repos:

```sh
docker run --rm \
  -e GITHUB_TOKEN=ghp_... \
  -e GITLAB_TOKEN=glpat-... \
  -p 35832:35832 \
  -v deco-assaying-data:/data \
  ghcr.io/garycoding/deco-assaying:latest
```
### 4. From source

```sh
git clone https://github.com/garycoding/deco-assaying.git
cd deco-assaying
uv sync
uv run python -m deco_assaying
```
## Endpoints

In every mode the server listens on `PORT` (default 35832) with:

- `POST /sse` — MCP Streamable HTTP transport.
- `GET /health` — liveness probe.
- `GET /admin/*` — read-only JSON ops endpoints.
- `GET /outputs/{job_id}/...` — read-only download API for job artifacts.
- `GET /docs` — OpenAPI / Swagger UI for the HTTP API.

Sanity-check it's up:

```sh
curl http://127.0.0.1:35832/health
```
## MCP tools

- `analyze_file(content, filename?, language?, options?)` — parse a single file passed inline; returns structural JSON.
- `index_repo(source, options?)` — start a job that indexes a whole repo and writes per-file artifacts plus a manifest. The server allocates a fresh output dir under `OUTPUT_ROOT` and returns `{ job_id, output_path }`. `source` can be a local directory, a GitHub URL (`https://github.com/owner/repo`), or a GitLab URL (`https://gitlab.com/owner/repo`, including nested groups like `https://gitlab.com/group/sub/repo`). Pass `git_ref` to pick a specific branch / tag / sha.
- `get_job_status(job_id)` — poll a running or completed job.
- `cancel_job(job_id)` — cooperative cancel.
- `list_supported_languages()` — capability discovery.
- `detect_language(path)` — extension/shebang detection helper.
## Output download API

Every job's artifacts land under `OUTPUT_ROOT/{job_id}/`. A consumer sharing the volume can read them off disk; one without a shared volume can pull them over HTTP:

| Endpoint | Returns |
|---|---|
| `GET /outputs/{job_id}` | `manifest.json` (convenience). |
| `GET /outputs/{job_id}/manifest.json` | Repo-level rollup. |
| `GET /outputs/{job_id}/tree.json` | Full path inventory (analyzed + skipped). |
| `GET /outputs/{job_id}/symbols.json` | Global qualified-name index. |
| `GET /outputs/{job_id}/languages.json` | Per-language counts. |
| `GET /outputs/{job_id}/errors.json` | Parse errors + skipped files. |
| `GET /outputs/{job_id}/log.jsonl?from_offset=N` | Tail the job's log. |
| `GET /outputs/{job_id}/ls?path=&recursive=` | Directory listing. |
| `GET /outputs/{job_id}/file/{path}` | Single file, or a streaming ZIP if any path segment contains `*?[`. E.g. `/file/files/**/*.py.json`. |
| `GET /outputs/{job_id}/zip?path=&match=` | Explicit bulk-zip alias. Default = whole job dir. |
| `DELETE /outputs/{job_id}` | Remove the dir + drop the table entry. 409 if still running. |
| `GET /admin/outputs` | List every job_id present on disk under `OUTPUT_ROOT`. |

Path traversal (`..`, absolute paths, escape via symlink) is rejected.
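A minimal sketch of the kind of containment check that rejects traversal. The function name and details are assumptions for illustration, not the server's implementation:

```python
from pathlib import Path

def resolve_safe(root: Path, user_path: str) -> Path:
    """Resolve user_path inside root, refusing '..', absolute paths,
    and symlink escapes. Raises ValueError on any attempt to leave root."""
    root = root.resolve()
    # resolve() normalizes '..' and follows symlinks, so a symlink that
    # points outside root fails the containment check below. Joining an
    # absolute user_path replaces root entirely, which also fails it.
    candidate = (root / user_path).resolve()
    if not candidate.is_relative_to(root):  # Python 3.9+
        raise ValueError(f"path escapes output root: {user_path!r}")
    return candidate
```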
## Resource requirements

When `index_repo` runs against a GitHub URL, the server uses a partial clone with bin-packed batched fetching. That gives a small, predictable disk footprint regardless of how large the source repo is:

- **Source-side scratch space:** ~100 MB peak in `output_path/.source/` during analysis. The server fetches each batch of source files (totaling ≤ `max_partial_clone_bytes`, default 100 MB), analyzes them, deletes them from the working tree, then fetches the next batch. Even on a multi-GB monorepo, peak local disk used for source content stays at ~100 MB. Tunable via the `max_partial_clone_bytes` option on `index_repo`.
- **Output artifacts:** roughly 1-2× the analyzed-source size. Each analyzed file produces a JSON artifact under `output_path/files/` containing symbols, imports, references, chunks, etc. These persist past the job — the consumer reads them incrementally — and are the largest durable footprint. The retention sweeper auto-purges job dirs older than `OUTPUT_EXPIRY_DAYS`.
- **Memory:** modest. A `ProcessPoolExecutor` runs roughly `2 × CPU count` workers, each holding one file's bytes plus its tree-sitter parse tree in memory. Source files are capped at `max_file_bytes` (default 2 MB), so the worst case is ~16-32 MB of resident source + parse trees on a typical 8-core box.
- **Network:** one provider-API pre-flight to plan the batches (GitHub Trees REST or GitLab REST tree + GraphQL; free for public repos, set `GITHUB_TOKEN` / `GITLAB_TOKEN` for higher quotas and private-repo access), plus one `git fetch-pack` round-trip per batch. For a typical sub-100 MB repo that's two HTTP hits total.

For local-path sources nothing is fetched and nothing is cloned — the only on-disk cost is the output artifacts.
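The batching described above can be pictured as a greedy first-fit-decreasing packer over the file sizes returned by the pre-flight. This is a simplified sketch; the server's actual planner may differ:

```python
def plan_batches(file_sizes: dict[str, int], max_bytes: int) -> list[list[str]]:
    """Pack file paths into batches whose total size stays <= max_bytes,
    largest files first (first-fit-decreasing). A file bigger than the cap
    gets a batch of its own here; the real server additionally enforces a
    per-file max_file_bytes limit."""
    batches: list[tuple[int, list[str]]] = []  # (bytes used, paths)
    for path, size in sorted(file_sizes.items(), key=lambda kv: -kv[1]):
        for i, (used, paths) in enumerate(batches):
            if used + size <= max_bytes:
                batches[i] = (used + size, paths + [path])
                break
        else:  # no existing batch has room
            batches.append((size, [path]))
    return [paths for _, paths in batches]
```

Each returned batch corresponds to one `git fetch-pack` round-trip; fewer batches means fewer network hits, which is why a sub-cap repo needs only the pre-flight plus a single fetch.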
## Configuration

| Env var | Default (daemon) | Default (container) | Purpose |
|---|---|---|---|
| `PORT` | `35832` | `35832` | HTTP listen port. |
| `HOST` | `0.0.0.0` | `0.0.0.0` | HTTP bind address. |
| `OUTPUT_ROOT` | `./output` | `/data` | Where the server writes job dirs. |
| `OUTPUT_EXPIRY_DAYS` | `7` | `7` | Auto-purge job dirs older than this. `0` disables. |
| `JOB_HISTORY_MAX` | `100` | `100` | In-memory job-table cap. |
| `DEFAULT_MAX_FILE_BYTES` | `2097152` | `2097152` | Default per-file size cap. |
| `DEFAULT_CHUNK_MAX_TOKENS` | `800` | `800` | Default chunk size for cAST chunking. |
| `GITHUB_TOKEN` | unset | unset | Optional; raises the GitHub Trees API quota from 60 to 5000 req/hr and unlocks private repos. |
| `GITLAB_TOKEN` | unset | unset | Optional; used for GitLab API auth and private-repo access. |
## Releasing

Tag-driven. Bump the version in `pyproject.toml`, then:

```sh
git tag vX.Y.Z && git push --tags
```

The Release workflow builds a multi-arch image (linux/amd64 + linux/arm64) and pushes it to GHCR with `vX.Y.Z`, `vX.Y`, and `latest` tags, in parallel with publishing the wheel + sdist to PyPI via trusted publishing. ~3-5 minutes end-to-end.