Analyze a local repository and generate AI-readable project instruction files from a single repo model.

Maintainers

nehharshah

Project description

RepoCanon

Generate repo-specific AI context for Codex, Claude Code, Copilot, and Cursor.

Turn any repository into canonical AI-readable project context.

RepoCanon is a Python CLI that analyzes a local codebase and generates project-specific instruction files for AI coding tools from a single internal repo model.

Instead of manually maintaining separate context for different tools, RepoCanon infers your repo’s structure, commands, conventions, and boundaries, then generates outputs such as:

AGENTS.md
CLAUDE.md
Copilot repository instructions
Cursor project rules

The goal is simple: make AI coding tools behave like they already understand your repo.

Why RepoCanon

AI coding tools are useful, but they usually guess:

where things live
how the repo is structured
which commands to run
what patterns are preferred
what boundaries should not be crossed

RepoCanon reduces that guesswork by turning repo-specific knowledge into maintainable instruction files.

What it does

RepoCanon:

analyzes a local repository
detects languages, frameworks, commands, and topology
infers conventions and architectural boundaries
builds a normalized project model
generates tool-specific AI context files from that model

RepoCanon is deterministic-first. It does not require an LLM to work.

Supported targets

Codex via AGENTS.md
Claude Code via CLAUDE.md
GitHub Copilot via .github/copilot-instructions.md (and optional path-scoped files)
Cursor via .cursor/rules/*.mdc

Installation

pip install repocanon

Requires Python 3.11+.

Quickstart

# 1. Analyze the current repo and persist a normalized model.
repocanon analyze .

# 2. Inspect what was inferred and how confident RepoCanon is.
repocanon audit .

# 3. Preview generated outputs without touching the filesystem.
repocanon preview .

# 4. Write the generated files into the repo (defaults to all targets).
repocanon generate .

You can also generate one or more specific targets:

repocanon generate . -t agents
repocanon generate . -t claude -t copilot
repocanon list-targets         # see what's available

Need machine-readable output? Append `--json` to `analyze`, `audit`, or `diff`.

Need to undo? Run `repocanon clean .` (or scope it with `-t copilot`) to remove only RepoCanon-authored files, meaning those that carry the generator header marker.

Example outputs

A real run produces files like:

AGENTS.md
CLAUDE.md
.cursor/rules/project-overview.mdc
.cursor/rules/commands-and-validation.mdc
.cursor/rules/code-style-and-conventions.mdc
.cursor/rules/architecture-boundaries.mdc
.github/copilot-instructions.md
.github/instructions/tests.instructions.md

See docs/samples/ for sample generated files from the bundled fixture repos.

How it works

RepoCanon has three layers:

1. Repo analysis

It scans the local repo and extracts:

languages
frameworks
package managers
commands
configs
directory structure
file patterns
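
As a rough illustration of the language-detection part of this scan, the sketch below counts files per language by extension. The extension table, the skipped directory names, and the function name `count_languages` are assumptions made for this example; RepoCanon's actual detectors are not shown in this README.

```python
from collections import Counter
from pathlib import Path

# Hypothetical extension table, for illustration only.
EXTENSION_LANGUAGES = {
    ".py": "Python",
    ".ts": "TypeScript",
    ".tsx": "TypeScript",
    ".js": "JavaScript",
    ".go": "Go",
    ".rs": "Rust",
}

# Hypothetical set of directories that a scan would usually ignore.
SKIP_DIRS = {".git", "node_modules", "dist", "build"}


def count_languages(root: Path) -> Counter[str]:
    """Count files per language under root, ignoring skipped directories."""
    counts: Counter[str] = Counter()
    # sorted() keeps the walk order stable, in the deterministic-first spirit.
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        if any(part in SKIP_DIRS for part in path.relative_to(root).parts):
            continue
        language = EXTENSION_LANGUAGES.get(path.suffix)
        if language is not None:
            counts[language] += 1
    return counts
```

Calling `count_languages(Path("."))` returns a `Counter` keyed by language name, which a later step could turn into a ranked language list.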

2. Convention inference

It infers patterns such as:

test layout (centralized vs colocated)
frontend/backend split
monorepo structure (apps/packages/libs/services)
architectural boundaries
naming conventions
preferred libraries
common anti-pattern risks (e.g. editing existing migrations)
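
To make the first item concrete, here is a minimal sketch of how a centralized-versus-colocated test layout might be inferred from file paths alone. The file-name patterns, the top-level directory names, and the confidence formula are assumptions for this example, not RepoCanon's documented heuristics.

```python
from pathlib import PurePosixPath

# Hypothetical names for a top-level, centralized test directory.
CENTRAL_TEST_DIRS = {"tests", "test", "__tests__"}


def looks_like_test(name: str) -> bool:
    """Recognize a few common test file naming patterns."""
    return (
        name.startswith("test_")
        or name.endswith("_test.py")
        or ".test." in name
        or ".spec." in name
    )


def classify_test_layout(paths: list[str]) -> tuple[str, float]:
    """Return (layout, confidence) for a list of repo-relative POSIX paths."""
    centralized = 0
    colocated = 0
    for raw in paths:
        path = PurePosixPath(raw)
        if not looks_like_test(path.name):
            continue
        if path.parts[0] in CENTRAL_TEST_DIRS:
            centralized += 1
        else:
            colocated += 1
    total = centralized + colocated
    if total == 0:
        return ("unknown", 0.0)
    if centralized >= colocated:
        return ("centralized", centralized / total)
    return ("colocated", colocated / total)
```

The confidence here is just the share of test files that agree with the chosen layout, so a mixed repo yields a lower score than a uniform one.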

3. Target generation

It maps one normalized project model into tool-specific outputs.

That means the same repo understanding can be reused across multiple AI coding tools.

Design principles

deterministic first
local-first (no telemetry, no network calls)
tool-agnostic core
small, readable outputs
no generic filler — every section is grounded in repo facts
explicit uncertainty when confidence is low
human-editable generated files (sections between the designated markers survive regeneration)
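
The last principle can be sketched as a merge step that carries hand-edited blocks from the existing file into freshly generated text. The marker strings `<!-- keep:start -->` and `<!-- keep:end -->` and the function `merge_preserved` are hypothetical; this README does not show the real markers or the real merge logic.

```python
import re

# Hypothetical marker strings, chosen only for this example.
KEEP_START = "<!-- keep:start -->"
KEEP_END = "<!-- keep:end -->"

_KEEP_BLOCK = re.compile(
    re.escape(KEEP_START) + r".*?" + re.escape(KEEP_END),
    re.DOTALL,
)


def merge_preserved(existing: str, regenerated: str) -> str:
    """Replace each marked block in regenerated with the matching block from existing.

    Blocks are matched by position. If the existing file has fewer blocks,
    the remaining regenerated blocks are left as they are.
    """
    kept_blocks = iter(_KEEP_BLOCK.findall(existing))

    def swap(match: re.Match[str]) -> str:
        return next(kept_blocks, match.group(0))

    return _KEEP_BLOCK.sub(swap, regenerated)
```

Matching by position is the simplest possible policy; a more careful version might give each block an identifier so that reordering sections does not mix up edits.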

Commands

`repocanon analyze [PATH]`

Analyze the repository and write a normalized model to:

.repocanon/project-model.json

`repocanon generate [PATH]`

Generate output for the selected targets (defaults to all).

Useful flags:

-t, --target — repeat to pick targets (agents, claude, copilot, cursor, or all)
--dry-run — render and report without writing or persisting the model
--output-dir — write into a sibling directory (path-traversal-safe)
--force — replace files even if RepoCanon would otherwise preserve manual edits
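
The "path-traversal-safe" note on `--output-dir` can be illustrated with a common pattern: resolve the target path and refuse anything that lands outside the output directory. This is a sketch of the general technique, with a hypothetical function name `resolve_inside`; it is not taken from RepoCanon's source.

```python
from pathlib import Path


def resolve_inside(base: Path, relative: str) -> Path:
    """Resolve relative under base, rejecting paths that escape base."""
    base = base.resolve()
    candidate = (base / relative).resolve()
    # resolve() collapses ".." segments and follows symlinks, so a path such
    # as "../secrets" ends up outside base and is rejected here.
    if not candidate.is_relative_to(base):
        raise ValueError(f"refusing to write outside {base}: {relative}")
    return candidate
```

For example, `resolve_inside(Path("out"), "AGENTS.md")` returns a path under `out`, while `resolve_inside(Path("out"), "../AGENTS.md")` raises `ValueError`.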

`repocanon preview [PATH]`

Print generated output to the terminal without writing files. Same -t flag as generate.

`repocanon list-targets`

List every target the current build supports.

`repocanon clean [PATH]`

Delete RepoCanon-authored files for the selected targets. Files without the RepoCanon header marker (i.e. anything you wrote yourself) are skipped. Pair with --dry-run to see what would happen.
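
The skip rule described above can be sketched as a check on the start of each candidate file. The marker text `Generated by RepoCanon`, the 512-character window, and the function name `clean_files` are assumptions for this example; the real header format is not shown in this README.

```python
from pathlib import Path

# Hypothetical header marker text, for illustration only.
HEADER_MARKER = "Generated by RepoCanon"


def clean_files(paths: list[Path], dry_run: bool = True) -> list[Path]:
    """Return the files that carry the marker, deleting them unless dry_run is set."""
    selected: list[Path] = []
    for path in paths:
        if not path.is_file():
            continue
        with path.open(encoding="utf-8", errors="replace") as handle:
            head = handle.read(512)  # the marker is assumed to sit near the top
        if HEADER_MARKER not in head:
            continue  # hand-written file: leave it alone
        if not dry_run:
            path.unlink()
        selected.append(path)
    return selected
```

Defaulting `dry_run` to `True` in this sketch means a caller has to opt in to deletion, which mirrors the advice to preview with `--dry-run` first.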

`repocanon audit [PATH]`

Show inferred conventions, rationale, and confidence levels. Pass --json to emit the audit as JSON for piping into other tools.

`repocanon diff [PATH]`

Compare the current repo scan with the saved model and report meaningful changes (specific commands added/removed, package list deltas, structural fingerprint changes). Pass --json for a machine-readable diff.
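
One way to picture the "commands added/removed" part of this report is a set comparison between the saved and current command maps. The shape of the data (command name mapped to command line) and the function name `diff_commands` are assumptions for this sketch; the real model schema is not documented here.

```python
def diff_commands(saved: dict[str, str], current: dict[str, str]) -> dict[str, list[str]]:
    """Report command names that were added, removed, or changed."""
    added = sorted(current.keys() - saved.keys())
    removed = sorted(saved.keys() - current.keys())
    changed = sorted(
        name for name in saved.keys() & current.keys() if saved[name] != current[name]
    )
    return {"added": added, "removed": removed, "changed": changed}
```

Sorting each list keeps the output stable between runs, which matters if the result is serialized for the `--json` form.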

`repocanon init [PATH]`

Create a local RepoCanon config file at .repocanon/config.toml.

Configuration

RepoCanon stores project config in:

.repocanon/config.toml

Example:

[project]
name = "my-repo"

[scan]
include = ["src/**", "app/**", "packages/**"]
exclude = ["node_modules/**", ".next/**", "dist/**", "build/**"]

[generate]
targets = ["agents", "claude", "copilot", "cursor"]
safe_overwrite = true

Architecture overview

repocanon/
├── analyzer/    # deterministic repo scanning + inference
├── models/      # Pydantic v2 project model
├── generators/  # one module per AI target
├── output/      # writers, preview, diff
├── report/      # audit + summary tables
└── cli.py       # Typer entry point

The analyzer is a straight pipeline: file inventory → manifest parsing → framework/package-manager detection → command extraction → topology + conventions → final ProjectModel. Generators only consume that model — they never touch the filesystem.
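
A straight pipeline of this kind can be sketched as a list of stage functions that each take and return the same state object. The sketch covers only three of the stages named above, and every name in it (`ScanState`, the stage functions, `run_pipeline`) is hypothetical. It uses a plain dataclass rather than the Pydantic v2 model the project itself uses, to stay within the standard library.

```python
import json
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ScanState:
    """Accumulates facts as the stages run. Hypothetical stand-in for the real model."""
    files: dict[str, str]  # relative path -> file text
    manifests: dict[str, dict] = field(default_factory=dict)
    package_managers: list[str] = field(default_factory=list)
    commands: dict[str, str] = field(default_factory=dict)


def parse_manifests(state: ScanState) -> ScanState:
    if "package.json" in state.files:
        state.manifests["package.json"] = json.loads(state.files["package.json"])
    return state


def detect_package_managers(state: ScanState) -> ScanState:
    if "package.json" in state.manifests:
        state.package_managers.append("npm")
    if "pyproject.toml" in state.files:
        state.package_managers.append("pip")
    return state


def extract_commands(state: ScanState) -> ScanState:
    scripts = state.manifests.get("package.json", {}).get("scripts", {})
    for name, command in scripts.items():
        state.commands[f"npm run {name}"] = command
    return state


# Order matters: each stage reads what earlier stages wrote.
STAGES: list[Callable[[ScanState], ScanState]] = [
    parse_manifests,
    detect_package_managers,
    extract_commands,
]


def run_pipeline(files: dict[str, str]) -> ScanState:
    state = ScanState(files=files)
    for stage in STAGES:
        state = stage(state)
    return state
```

Because the stages take file contents as input rather than reading from disk, a pipeline shaped like this is straightforward to test with in-memory fixtures.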

Limitations

RepoCanon is inference-based. It can detect a lot, but not everything.

It may be less accurate when:

the repo is highly unconventional
conventions are implicit rather than visible in files
commands live outside standard manifests
architecture is unclear from structure alone

When confidence is low, RepoCanon says so rather than inventing detail.
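
One way to make uncertainty explicit is to attach a confidence score to each inferred finding and change the wording when it falls below a threshold. The threshold value, the `Finding` fields, and the output phrasing below are assumptions for this sketch, not RepoCanon's actual format.

```python
from dataclasses import dataclass

# Hypothetical cut-off below which a finding is reported as uncertain.
LOW_CONFIDENCE = 0.6


@dataclass(frozen=True)
class Finding:
    claim: str
    rationale: str
    confidence: float  # 0.0 to 1.0


def render_finding(finding: Finding) -> str:
    """Render a finding as one bullet line, hedging it when confidence is low."""
    if finding.confidence < LOW_CONFIDENCE:
        return f"- Possibly: {finding.claim} (low confidence: {finding.rationale})"
    return f"- {finding.claim}"
```

Keeping the rationale next to a hedged claim gives a reader enough context to confirm or correct it by hand.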

Roadmap

more framework detectors (Django, Rails, .NET, Spring, etc.)
stronger monorepo inference (Bazel, Pants, Nx graph)
better path-scoped output generation
safer merge/update behavior for edited generated files
optional LLM-assisted summarization (off by default)
additional target formats

Why not just write these files manually?

You can. But in practice:

they drift out of date
they are inconsistent across tools
they are often generic
they rarely reflect the actual repo structure

RepoCanon keeps those files grounded in the codebase.

How RepoCanon maps one repo model to multiple AI coding tools

RepoCanon is intentionally a many-to-one-to-many pipeline:

repo files ─┐                              ┌─► AGENTS.md            (Codex)
            ├─► analyzer ─► ProjectModel ──┼─► CLAUDE.md            (Claude Code)
manifests  ─┘                              ├─► copilot-instructions (Copilot)
                                           └─► .cursor/rules/*.mdc  (Cursor)

The analyzer collapses everything it sees into a single normalized ProjectModel (Pydantic v2). That model is the only thing target generators read; they never touch the filesystem. This gives RepoCanon two important properties:

One source of truth. Languages, frameworks, commands, conventions, anti-patterns, and architecture boundaries all live in one place. Adding a new target means writing a new generator that consumes the same model — not re-implementing detection.
Idiomatic outputs per tool. Each generator picks the parts of the model that make sense for its target and renders them in that tool's idiom: a verbose AGENTS.md for Codex, a terse CLAUDE.md for Claude Code, a repo-wide instructions file (plus optional path-scoped ones) for Copilot, and a small set of focused .mdc rule files for Cursor.

The same model also powers audit, diff, and preview, so you can verify what RepoCanon inferred before any file is written.
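
The one-model, many-generators shape can be sketched as a registry of pure functions that map a model to a dictionary of output paths and contents. All names here (`ProjectModel` fields, `render_agents`, `render_claude`, `GENERATORS`, `generate`) are hypothetical, the rendered text is invented for the example, and a frozen dataclass stands in for the Pydantic v2 model.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass(frozen=True)
class ProjectModel:
    """Hypothetical, much-reduced stand-in for the real project model."""
    name: str
    languages: tuple[str, ...]
    commands: dict[str, str] = field(default_factory=dict)  # command -> purpose


def render_agents(model: ProjectModel) -> dict[str, str]:
    """Verbose output: one bullet per command."""
    lines = [f"# {model.name}", "", f"Languages: {', '.join(model.languages)}", "", "## Commands"]
    lines += [f"- `{command}`: {purpose}" for command, purpose in model.commands.items()]
    return {"AGENTS.md": "\n".join(lines) + "\n"}


def render_claude(model: ProjectModel) -> dict[str, str]:
    """Terse output: a single summary line."""
    commands = "; ".join(model.commands)
    return {"CLAUDE.md": f"{model.name} ({', '.join(model.languages)}). Commands: {commands}\n"}


# Adding a target means adding one entry here; detection code is untouched.
GENERATORS: dict[str, Callable[[ProjectModel], dict[str, str]]] = {
    "agents": render_agents,
    "claude": render_claude,
}


def generate(model: ProjectModel, targets: list[str]) -> dict[str, str]:
    """Return path -> content for the requested targets without writing anything."""
    outputs: dict[str, str] = {}
    for target in targets:
        outputs.update(GENERATORS[target](model))
    return outputs
```

Since `generate` only returns strings, a separate writer can decide whether to print them (as in a preview) or save them to disk.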

Contributing

Contributions are welcome. See CONTRIBUTING.md for local setup, tests, and development workflow.

License

MIT

Project details

Release history

0.2.1 (this version): May 9, 2026
0.2.0: Apr 19, 2026
0.1.3: Apr 19, 2026
0.1.2: Apr 19, 2026
0.1.1: Apr 19, 2026
0.1.0: Apr 19, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

repocanon-0.2.1.tar.gz (57.3 kB)

Uploaded May 9, 2026 Source

Built Distribution

repocanon-0.2.1-py3-none-any.whl (66.6 kB)

Uploaded May 9, 2026 Python 3

File details

Details for the file repocanon-0.2.1.tar.gz.

File metadata

Download URL: repocanon-0.2.1.tar.gz
Upload date: May 9, 2026
Size: 57.3 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for repocanon-0.2.1.tar.gz
Algorithm	Hash digest
SHA256	`8b84f91a381f882c34d731a796d4339badf32bf17ee7ffb65118ee5ad2e77eba`
MD5	`f57524201a443d88f4fd81b7b1c2a8c2`
BLAKE2b-256	`a495bd37b7e7776fd26f28af3207ae05d380f1a0a936020a27135ec5fffd3193`

Provenance

The following attestation bundles were made for repocanon-0.2.1.tar.gz:

Publisher: release.yml on NehharShah/repocanon

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: repocanon-0.2.1.tar.gz
- Subject digest: 8b84f91a381f882c34d731a796d4339badf32bf17ee7ffb65118ee5ad2e77eba
- Sigstore transparency entry: 1485961477
- Sigstore integration time: May 9, 2026
Source repository:
- Permalink: NehharShah/repocanon@34cbfae8d5a4575261ad4b0b5567ebcd028fdfdd
- Branch / Tag: refs/tags/v0.2.1
- Owner: https://github.com/NehharShah
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@34cbfae8d5a4575261ad4b0b5567ebcd028fdfdd
- Trigger Event: push

File details

Details for the file repocanon-0.2.1-py3-none-any.whl.

File metadata

Download URL: repocanon-0.2.1-py3-none-any.whl
Upload date: May 9, 2026
Size: 66.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for repocanon-0.2.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`9fe53c9f4c8b50fbe03f4fe46401e179038016b45e03c6ec2fb3a2b4c12e9dc1`
MD5	`0ed324217a766929439bd6d9553b7573`
BLAKE2b-256	`ccf6bcc44a345f18b20e86a97cfa375eea31f64f3467dab17b0d080821b4ab69`

Provenance

The following attestation bundles were made for repocanon-0.2.1-py3-none-any.whl:

Publisher: release.yml on NehharShah/repocanon

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: repocanon-0.2.1-py3-none-any.whl
- Subject digest: 9fe53c9f4c8b50fbe03f4fe46401e179038016b45e03c6ec2fb3a2b4c12e9dc1
- Sigstore transparency entry: 1485961506
- Sigstore integration time: May 9, 2026
Source repository:
- Permalink: NehharShah/repocanon@34cbfae8d5a4575261ad4b0b5567ebcd028fdfdd
- Branch / Tag: refs/tags/v0.2.1
- Owner: https://github.com/NehharShah
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@34cbfae8d5a4575261ad4b0b5567ebcd028fdfdd
- Trigger Event: push
