robinctx 🐦
Distill any code repo into a compact, secret-redacted LLM context pack — then fit it to a token budget.
Every LLM works better when it knows your repo's purpose, stack, conventions, and API surface. robinctx extracts exactly that — heuristically, locally, with zero dependencies — and packages it as a single Markdown document (plus a machine-readable JSON sidecar) sized for a prompt. The sidekick that preps the briefing so the hero can fight crime.
30-second quickstart
# no install needed:
uvx robinctx distill .
# or
pipx run robinctx distill .
# then build a task-focused prompt from the pack:
uvx robinctx pack myrepo_context.json --task "add rate limiting to the API" --budget 8000
# or do both at once:
uvx robinctx distill . --pack --task "add rate limiting to the API"
Runs anywhere Python 3.9+ runs — including locked-down environments — because the core uses only the standard library. Git and gitleaks are used when present, never required.
The disclaimer philosophy
Every artifact robinctx generates starts with this block, on purpose:
⚠️ AUTO-GENERATED — USE AT YOUR OWN RISK This context pack was produced by robinctx by heuristic analysis. It may contain errors, omissions, or — despite redaction — sensitive data. Review before sharing outside your trust boundary, and verify any claims (especially refactor suggestions) against the actual source.
Heuristics are honest about being heuristics. The disclaimer carries provenance (tool version, timestamp, repo commit SHA, scan settings) and redaction counts, so anyone downstream — human or LLM — knows exactly what they're holding and how much to trust it.
What gets captured
| Section | How | Notes |
|---|---|---|
| Overview & docs | README / ARCHITECTURE / CLAUDE.md excerpts | redacted, capped |
| Tech stack | manifests (package.json, pyproject, go.mod, Cargo.toml, Gemfile, …) | framework inference from deps |
| Conventions | statistical style inference (indentation, quotes, naming, semicolons, type-hint ratio) | from up to 60 source samples |
| Layout | rendered directory tree + entry points | depth/size capped |
| API surface | Python: ast (precise, incl. async/decorators/methods) • JS/TS/Go/Rust/Ruby/Java: regex (best effort) | public symbols only |
| Git intel | branch, recent commits, churn hotspots, contributor count | optional, if git present |
| TODO/FIXME markers | comment scan | redacted, capped at 300 |
| Refactor signals | large files, long functions, churn×size hotspots, TODO clusters, missing tests | heuristic — verify before acting |
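To make the "API surface" row concrete, here is a minimal sketch of how public symbols can be pulled from Python source with the standard-library ast module. The function name public_symbols and the output format are hypothetical, chosen for illustration; robinctx's own extractor may work differently and captures more detail (decorators, for instance).

```python
import ast


def public_symbols(source: str) -> list[str]:
    """List public top-level functions, classes, and methods in a module.

    Illustrative only: a name counts as public when it has no leading underscore.
    """
    found = []
    func_types = (ast.FunctionDef, ast.AsyncFunctionDef)
    for node in ast.parse(source).body:
        if node.name.startswith("_") if hasattr(node, "name") else True:
            continue  # skip private names and non-definition statements
        if isinstance(node, func_types):
            prefix = "async def" if isinstance(node, ast.AsyncFunctionDef) else "def"
            found.append(f"{prefix} {node.name}")
        elif isinstance(node, ast.ClassDef):
            found.append(f"class {node.name}")
            for item in node.body:
                if isinstance(item, func_types) and not item.name.startswith("_"):
                    found.append(f"method {node.name}.{item.name}")
    return found


SAMPLE = '''
async def fetch(url): ...
def _helper(): ...
class Client:
    def get(self): ...
    def _sign(self): ...
'''

print(public_symbols(SAMPLE))
# ['async def fetch', 'class Client', 'method Client.get']
```

Because the parser sees the real syntax tree, async functions and methods are handled without special cases, which is why the Python path can be precise where a regex pass is only best effort.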
Security model
The output of this tool is destined to be pasted into LLM prompts and shared. Three independent layers stand between your secrets and that output:
- File exclusion: credential-like files (.env*, *.pem, secrets.*, id_rsa, .ssh/, .aws/, …) are never read and never listed (names alone can leak). Inside a git repo, files are enumerated via git ls-files --exclude-standard, so anything .gitignore'd, where local secrets usually live, is never touched. A .robinctxignore file adds your own exclusions.
- Inline redaction: every embedded excerpt is scrubbed for known token formats (AWS, GitHub, Slack, Google, OpenAI/Anthropic-style, JWTs, private-key blocks, credentialed URLs), secret-keyed assignments (password = …), and high-entropy values (>4.5 bits/char, ≥20 chars, assigned to a variable; hex digests and shaNNN- SRI hashes are exempt).
- Output scanning: after generation, the artifacts themselves are scanned with gitleaks (or trufflehog) if installed, falling back to the built-in detectors with a notice. Findings print as file:line [rule], the output is quarantined (renamed *.quarantined), and the run exits 3, which keeps the gate CI-friendly.
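The high-entropy rule in the inline-redaction layer can be sketched as follows. This is an illustration under stated assumptions, not robinctx's actual detector: the function names are hypothetical, the thresholds are the ones quoted above, and only the hex-digest exemption is modeled (the SRI exemption and the "assigned to a variable" check are left out).

```python
import math
import re
from collections import Counter

HEX_ONLY = re.compile(r"[0-9a-fA-F]+")


def shannon_entropy(value: str) -> float:
    """Shannon entropy of the character distribution, in bits per character."""
    if not value:
        return 0.0
    total = len(value)
    counts = Counter(value)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def looks_like_secret(value: str, min_len: int = 20, threshold: float = 4.5) -> bool:
    """Flag long, random-looking values; plain hex digests are exempt."""
    if len(value) < min_len:
        return False
    if HEX_ONLY.fullmatch(value):
        return False  # e.g. an MD5 or SHA digest
    return shannon_entropy(value) > threshold


print(looks_like_secret("aB3dE6gH9jK2mN5pQ8sT1vW4yZ7"))       # True: 27 distinct characters
print(looks_like_secret("d41d8cd98f00b204e9800998ecf8427e"))  # False: hex digest
print(looks_like_secret("hunter2"))                           # False: too short
```

A detector like this is deliberately conservative: it trades missed short or word-like secrets for fewer false positives on ordinary identifiers, which is why it is only one of three layers.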
Flags: --no-secret-scan opts out entirely; --strict also fails on built-in-scanner findings
(recommended in CI). Exit codes: 0 ok • 1 error • 2 usage • 3 leaks found.
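If you drive the CLI from a Python script rather than directly from a CI step, the exit codes above map naturally onto an enum. The sketch below is one possible wrapper; ExitCode, describe, should_block_merge, and run_distill are hypothetical helper names, and run_distill assumes robinctx is on PATH.

```python
import subprocess
from enum import IntEnum


class ExitCode(IntEnum):
    """Exit codes as documented above."""
    OK = 0
    ERROR = 1
    USAGE = 2
    LEAKS_FOUND = 3


def describe(returncode: int) -> str:
    """Human-readable label for a return code."""
    try:
        return ExitCode(returncode).name.lower().replace("_", " ")
    except ValueError:
        return f"unexpected exit code {returncode}"


def should_block_merge(returncode: int) -> bool:
    """Treat anything other than a clean run as blocking."""
    return returncode != ExitCode.OK


def run_distill(repo: str) -> int:
    """Run the documented command with --strict and return its exit code."""
    return subprocess.run(["robinctx", "distill", repo, "--strict"]).returncode


print(describe(3))  # leaks found
```

Keeping exit 3 distinct from exit 1 lets a pipeline report "secrets detected" separately from "the tool crashed", which usually call for different responses.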
Found a leak that survived all three layers? That's a vulnerability — see SECURITY.md.
Library API
pip install robinctx and build on the same engine (fully typed, py.typed shipped):
from robinctx import distill, pack, to_markdown
context = distill("path/to/repo") # dict — the JSON-sidecar structure
print(context["style"], context["frameworks"])
markdown = to_markdown(context) # the .md artifact, disclaimer included
result = pack(context, task="refactor the auth module", budget=8000)
print(result.prompt) # budget-fitted prompt
print(result.sections, result.est_tokens)
distill() is a pure function over the filesystem (writes nothing); the CLI owns file output
and scanning. The sidecar dict carries schema_version with a
documented compatibility contract.
CLI reference
robinctx distill <repo> [-o NAME] [--max-file-kb N]
[--format md|json|both|claude-md|agents-md|cursorrules]
[--no-secret-scan] [--strict]
[--pack --task "..." [--mode M] [--budget N] [--sections ...]]
robinctx pack <context.json> [--task "..."] [--mode task|onboard|refactor]
[--budget N] [--sections overview,style,api,...]
[--since REF] [-o FILE]
robinctx update <context.json> [--strict] [--no-secret-scan]
robinctx serve <context.json> # requires robinctx[mcp]
Pack modes prioritize differently when trimming to budget: task leads with conventions and
relevant APIs, onboard with overview and layout, refactor with signals and git hotspots.
With a --task, API entries / TODOs / refactor signals are relevance-ranked so the most useful
detail survives trimming. --since <ref> prepends a redacted "Recent Changes" section (git
log + diff stat) — useful for LLMs working on actively evolving repos.
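The mode-dependent trimming described above can be pictured as a greedy pass over sections in priority order. The following is a sketch under assumptions, not the actual packing algorithm: the priority lists are guesses based on the prose (the section names layout, signals, and git are hypothetical; overview, style, and api appear in the CLI reference), and tokens are estimated with the len/4 rule.

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate: one token per four characters."""
    return len(text) // 4


# Hypothetical orderings inferred from the mode descriptions above.
MODE_PRIORITY = {
    "task": ["style", "api", "overview", "layout"],
    "onboard": ["overview", "layout", "style", "api"],
    "refactor": ["signals", "git", "api", "style"],
}


def fit_to_budget(sections: dict[str, str], mode: str, budget: int) -> list[str]:
    """Keep sections in priority order, skipping any that would exceed the budget."""
    kept, used = [], 0
    for name in MODE_PRIORITY[mode]:
        body = sections.get(name)
        if body is None:
            continue
        cost = estimate_tokens(body)
        if used + cost <= budget:
            kept.append(name)
            used += cost
    return kept


sections = {
    "overview": "o" * 400,  # about 100 tokens
    "style": "s" * 200,     # about 50 tokens
    "api": "a" * 800,       # about 200 tokens
    "layout": "l" * 400,    # about 100 tokens
}
print(fit_to_budget(sections, "task", 260))     # ['style', 'api']
print(fit_to_budget(sections, "onboard", 260))  # ['overview', 'layout', 'style']
```

The same input and budget yield different packs per mode, which is the point: the mode decides what is sacrificed first when not everything fits.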
Agent files: --format claude-md emits a ready-to-commit CLAUDE.md (likewise
agents-md → AGENTS.md, cursorrules → .cursorrules) — a condensed, imperative version of
the pack for coding agents that re-read it on every task. The secret-scan gate applies to these
too.
Staying fresh: robinctx update ctx.json is a no-op when the repo hasn't changed since the
recorded commit SHA, and re-distills when it has — cheap enough for a pre-commit hook or CI step.
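The freshness check behind update can be approximated by comparing a recorded SHA against the current HEAD. This is a sketch of the idea only: current_head and needs_redistill are hypothetical names, the real command's handling of non-git directories is not documented here, and re-distilling whenever either SHA is unknown is an assumption.

```python
import subprocess
from typing import Optional


def current_head(repo: str) -> Optional[str]:
    """Return the HEAD commit SHA, or None if git is missing or repo is not a git repo."""
    try:
        result = subprocess.run(
            ["git", "-C", repo, "rev-parse", "HEAD"],
            capture_output=True,
            text=True,
            check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        return None
    return result.stdout.strip()


def needs_redistill(recorded_sha: Optional[str], head_sha: Optional[str]) -> bool:
    """Re-distill unless both SHAs are known and identical (assumed policy)."""
    if recorded_sha is None or head_sha is None:
        return True
    return recorded_sha != head_sha


print(needs_redistill("888d441", "888d441"))  # False: nothing changed
print(needs_redistill("888d441", "0a1b2c3"))  # True: new commits
```

A single rev-parse call is cheap, which is what makes the check practical to run on every commit.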
See docs/recipes.md for ready-made GitHub Action and pre-commit configs.
.robinctxignore
Drop a .robinctxignore (or .repoctxignore) file at the repo root to exclude more files,
using a gitignore-flavored subset (fnmatch wildcards; dir/ for directories; leading /
anchors to root; ! negation and git-style ** are not supported — * matches across /).
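The matching rules above can be illustrated with the standard-library fnmatch module, whose * already matches across /. The sketch below is one plausible reading of those rules, not the tool's implementation: is_ignored is a hypothetical name, and letting unanchored patterns match either the full path or the basename is an assumption.

```python
from fnmatch import fnmatchcase


def is_ignored(path: str, patterns: list[str]) -> bool:
    """Check a repo-relative, forward-slash path against ignore patterns."""
    parts = path.split("/")
    for raw in patterns:
        pattern = raw.strip()
        if not pattern:
            continue
        anchored = pattern.startswith("/")
        body = pattern.lstrip("/")
        if body.endswith("/"):
            name = body.rstrip("/")
            if anchored:
                # /dir/ matches only a directory at the repo root
                if fnmatchcase(path, name + "/*"):
                    return True
            elif any(fnmatchcase(part, name) for part in parts[:-1]):
                # dir/ matches that directory name at any depth
                return True
        elif anchored:
            if fnmatchcase(path, body):
                return True
        elif fnmatchcase(path, body) or fnmatchcase(parts[-1], body):
            return True
    return False


rules = ["*.log", "build/", "/docs/*.tmp"]
print(is_ignored("app/server.log", rules))    # True
print(is_ignored("src/build/out.js", rules))  # True
print(is_ignored("docs/a/b.tmp", rules))      # True: * crosses the / in a/b.tmp
print(is_ignored("other/docs/x.tmp", rules))  # False: pattern is anchored to root
```

Note the third case: in real gitignore syntax, docs/*.tmp would not match docs/a/b.tmp, so patterns copied from a .gitignore may exclude more here than they do in git.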
Limitations (read this)
- Non-Python extraction is regex-based. It catches conventional declarations and misses clever ones; interfaces may be labeled class. Python uses ast and is precise.
- Refactor signals are heuristics: line counts, churn, TODO density. They're prompts for investigation, not findings. The output says so.
- Redaction is pattern-based. A password that looks like an English word in prose will not be caught. The entropy detector can't see secrets shorter than ~23 characters (Shannon entropy of a string is bounded by log2 of its length), and may rarely flag random-looking identifiers. Run with gitleaks installed; review output before sharing.
- Token counts are estimates (len/4) unless you install robinctx[tokens].
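The ~23-character figure follows directly from the 4.5 bits/char threshold: a string of n characters has at most n distinct symbols, so its per-character entropy cannot exceed log2(n). A quick check, using hypothetical names:

```python
import math

ENTROPY_THRESHOLD = 4.5  # bits per character, as stated in the security model


def max_entropy_bits(length: int) -> float:
    """Upper bound on per-character Shannon entropy for a string of this length."""
    return math.log2(length) if length > 0 else 0.0


# Smallest length whose best-case entropy clears the threshold.
shortest_detectable = next(
    n for n in range(1, 1000) if max_entropy_bits(n) > ENTROPY_THRESHOLD
)

print(round(max_entropy_bits(22), 3))  # 4.459
print(round(max_entropy_bits(23), 3))  # 4.524
print(shortest_detectable)             # 23
```

So even a string of 22 all-distinct characters tops out below 4.5 bits/char, and the documented 20-character minimum is effectively superseded by the entropy bound.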
Extras
| Install | Adds |
|---|---|
| pip install robinctx | everything above, stdlib-only |
| pip install robinctx[tokens] | exact token counts via tiktoken |
| pip install robinctx[mcp] | robinctx serve: MCP server exposing the pack as queryable tools |
Contributing
See CONTRIBUTING.md. Security-relevant changes require tests, no exceptions.
License