A Rust-inspired, Typer-powered CLI for reliably managing project documentation.
croc
ids are owners, links are borrows,
croc check is the borrow checker.
croc treats a markdown doc tree the way Rust treats memory: ids are owners, links are borrows, croc check is the borrow checker, and croc rename is an atomic refactor. Move a file and every reference keeps working. Introduce a dangling link and the commit is refused.
The problem
A thoughts/ tree grows branching paths, nested directories with self.md files, and files that reference other files by path. When a file moves, every referrer has to be updated. The usual options all fail:
- By hand: grep for every reference and hope you didn't miss one.
- With sed: risk rewriting substring matches inside prose or code blocks.
- With Obsidian/Notion: silent link rot whenever someone edits outside the editor.
None of these prevent the broken intermediate state — the window where main has dangling refs and no one has noticed. croc check closes that window by refusing the commit; croc rename makes the refactor an atomic transaction.
The idea
Replace path-based references with stable ids. A reference like [[id:registry-pattern]] resolves through a derived index of every id in the tree. When a file moves, the id travels with it; every link still works. When an id changes, one command rewrites every referrer atomically.
Tree-as-memory
| Rust concept | croc concept |
|---|---|
| Ownership | Each .md has a unique id in frontmatter |
| Move semantics | mv relocates bytes; id travels with the file |
| &T (borrow) | [[id:X]] (strong link) |
| Weak<T> | [[see:X]] (soft citation, may dangle) |
| Lifetimes | Strong links may not outlive their target |
| Newtype pattern | DocId and DocPath are distinct types |
| Borrow checker | croc check refuses trees with broken invariants |
| Validate-then-commit | Rewrites simulated in memory before any disk write |
Quick start
Requires Python >=3.13 and uv.
# Install
uv sync
# Check the included example
uv run croc check examples/thoughts
# Print the derived id → path index
uv run croc index examples/thoughts
# Adopt croc on a repo with plain markdown (preview first)
uv run croc init --adopt --dry-run path/to/docs/
uv run croc init --adopt path/to/docs/
# Rename an id; every referrer updates atomically
uv run croc rename old-id new-id --root path/to/docs/
# Move a file; id-based links mean zero references need rewriting
uv run croc move path/to/docs/a.md path/to/docs/subdir/ --root path/to/docs/
Commands
croc check <root>
Runs the borrow checker. Exit codes:
- 0: tree is sound
- 1: tree has violations (printed to stderr)
- 2: tree cannot be loaded (malformed frontmatter, missing root)
croc index <root>
Prints the derived id → path map as JSON. The index is never stored — it's a regenerable view over the tree, so it cannot drift.
croc move <src> <dst> [--root R] [--dry-run]
Relocates a file. Because ids are stable, zero references are rewritten. Runs a pre-check so you don't pile a move on a broken tree. Uses git mv when in a git repo, falls back to shutil.move.
croc rename <old-id> <new-id> [--root R] [--dry-run]
Rewrites every strong and weak reference in the tree, plus the owner's id field. Transactional:
- Pre-check the tree is sound.
- Plan the rewrite in memory.
- Simulate the plan: apply in memory, re-parse, re-check.
- Commit atomically per-file (temp + os.replace); snapshot-based rollback on FS failure.
If any step fails, nothing is written.
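The commit step above can be sketched roughly as follows. This is a minimal illustration of the temp file + os.replace pattern with a snapshot-based rollback, not croc's actual code; `atomic_write` and `commit` are hypothetical names, and the plan is assumed to map existing files to their new contents.

```python
import os
import tempfile
from pathlib import Path


def atomic_write(path: Path, text: str) -> None:
    """Write to a temp file in the same directory, then swap it in with os.replace."""
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, prefix=path.name, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(text)
        os.replace(tmp_name, path)
    except BaseException:
        # Never leave a stray temp file behind on failure.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise


def commit(plan: dict[Path, str]) -> None:
    """Apply every planned rewrite; restore already-written files if a write fails."""
    # Snapshot first: if a file cannot be read, we fail before touching anything.
    snapshot = {path: path.read_text(encoding="utf-8") for path in plan}
    written: list[Path] = []
    try:
        for path, new_text in plan.items():
            atomic_write(path, new_text)
            written.append(path)
    except OSError:
        for path in written:
            atomic_write(path, snapshot[path])
        raise
```

Creating the temp file in the target's own directory keeps the final os.replace on a single filesystem, which is what makes the per-file swap atomic.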
croc init [path] [--adopt] [--dry-run]
Creates a .croc.toml marker at path. With --adopt, brings every .md into the managed schema in one of three ways:
- SCAFFOLD: no frontmatter. Prepend a fresh block.
- AUGMENT: has frontmatter but missing required fields. Fill in id/title/kind/links while preserving every existing key and its order (foreign fields like type, mirrors, created, ... survive untouched).
- SKIP: has frontmatter we can't safely modify (unterminated, invalid YAML, malformed existing id). The author fixes by hand.
Proposed ids are hierarchical — slugified relative path, not just the filename — so code-adjacent trees with lots of repeated stems (__init__.md, per-customer folders, etc.) don't collide:
| Path | Proposed id |
|---|---|
| foo.md (root) | foo |
| sub/foo.md | sub-foo |
| pkg/utils/__init__.md | pkg-utils-init |
| regions/east/notes.md | regions-east-notes |
| alerts/self.md | alerts (directory-index convention) |
| self.md (root) | root |
Collisions (rare path-slug ambiguities, or foo.md at root competing with foo/self.md) are reported and the command refuses to write.
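One way to reproduce the proposals in the table is sketched below. The exact slug rules are an assumption chosen to match the rows shown (lowercase, runs of non-alphanumerics collapsed to a hyphen, self.md taking its directory's name); `propose_id` and `find_collisions` are hypothetical names.

```python
import re
from collections import defaultdict
from pathlib import PurePosixPath


def propose_id(rel_path: str) -> str:
    """Slugify a root-relative path into a hierarchical id (assumed rules)."""
    path = PurePosixPath(rel_path)
    parts = list(path.parent.parts)
    if path.stem != "self":
        # self.md is the directory index, so it contributes no segment of its own.
        parts.append(path.stem)
    slugs = [re.sub(r"[^a-z0-9]+", "-", part.lower()).strip("-") for part in parts]
    return "-".join(slug for slug in slugs if slug) or "root"


def find_collisions(rel_paths: list[str]) -> dict[str, list[str]]:
    """Group paths by proposed id and keep only ids claimed more than once."""
    by_id: dict[str, list[str]] = defaultdict(list)
    for rel_path in rel_paths:
        by_id[propose_id(rel_path)].append(rel_path)
    return {doc_id: paths for doc_id, paths in by_id.items() if len(paths) > 1}
```

Under these rules foo.md and foo/self.md both propose foo, which is the kind of collision the command reports instead of writing.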
croc molt <root> [--dry-run]
Reverse adoption. Rewrites every [[id:X]] / [[see:X]] body ref back into [text](path.md) plain markdown, strips croc-specific frontmatter fields (id, kind, links), and removes .croc.toml. The tree must pass croc check first.
| Before | After |
|---|---|
| [[id:foo\|foo]] | [foo](foo.md) |
| [[id:target#section-x\|Section X]] | [Section X](target.md#section-x) |
| [[id:data-glossary]] (bare) | [Data Glossary](data-glossary.md) (falls back to target's title) |
Foreign frontmatter (title, type, mirrors, any custom keys) is preserved in original order. The molted tree renders correctly in GitHub, Obsidian, or any generic markdown tool. Re-adopt with croc init --adopt to come back under croc management; the round-trip is semantically equivalent.
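The body rewrite that molt performs can be approximated with a single regex substitution. This is a sketch under assumptions: `molt_refs`, `paths`, and `titles` are hypothetical names, target paths are taken as already relative to the referring file, and [[see:X]] is assumed to accept the same anchor and display-text syntax as [[id:X]].

```python
import re

# [[id:X]], [[id:X#anchor]], [[id:X|text]], [[id:X#anchor|text]] and the see: variants.
REF = re.compile(
    r"\[\[(?:id|see):(?P<id>[A-Za-z0-9_.-]+)"
    r"(?:#(?P<anchor>[^|\]]+))?"
    r"(?:\|(?P<text>[^\]]+))?\]\]"
)


def molt_refs(body: str, paths: dict[str, str], titles: dict[str, str]) -> str:
    """Rewrite croc refs in body text back into plain markdown links."""

    def to_markdown(match: re.Match) -> str:
        doc_id = match["id"]
        if doc_id not in paths:
            # A weak ref may dangle; leave it as written rather than guess a path.
            return match[0]
        target = paths[doc_id]
        if match["anchor"]:
            target += "#" + match["anchor"]
        # A bare ref has no display text, so fall back to the target's title.
        text = match["text"] or titles[doc_id]
        return f"[{text}]({target})"

    return REF.sub(to_markdown, body)
```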
croc refs <root> [--unresolved]
Walks the tree and reports every markdown-style path ref ([text](path.md)), showing whether each target resolves to a file under the root. Read-only; works on any markdown tree whether or not it's been adopted. Use as a health check before init --adopt --migrate-refs:
croc refs --unresolved path/to/docs/
# UNRESOLVED runbooks/onboarding.md: -> ghost.md
# 1 unresolved ref(s) across the tree
Exits 1 when any ref is unresolved. Great for CI on partially-migrated trees.
Ref migration (on init --adopt, default on)
Adoption rewrites markdown path refs in body text to the croc dialect by default:
| Before | After |
|---|---|
| [foo](foo.md) | [[id:foo\|foo]] |
| [Section X](target.md#section-x) | [[id:target#section-x\|Section X]] |
| [Data Glossary](../data_glossary.md) | [[id:data-glossary\|Data Glossary]] |
Link text and anchors are preserved. Frontmatter links gets a strong entry for every migrated target (so Rule 5 — identity — is satisfied post-migration).
Pass --no-migrate-refs to adopt only the frontmatter shape and leave body content untouched — useful if you want to stage the migration separately.
Re-running on an adopted tree is safe. If a previously-adopted file grows new path-refs later (someone pastes a markdown link, a new doc lands), the next init --adopt reaches that file and migrates the new refs. Clean trees produce zero actions — the command is idempotent.
Unresolvable refs (target doesn't exist, or escapes the tree root, or uses non-lowercase .md extension) are left in place as raw markdown and surfaced as SKIP-REF notes. Brownfield trees always have some rot; adoption reports it rather than refusing to land.
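The migration direction can be sketched in the same spirit. The example below is an illustration only: `migrate_refs` and `ids_by_path` are hypothetical names, paths are root-relative POSIX strings, and only links ending in lowercase .md are considered, in line with the note above.

```python
import posixpath
import re

# [text](path.md) or [text](path.md#anchor); only lowercase .md targets match.
MD_LINK = re.compile(
    r"\[(?P<text>[^\]]+)\]\((?P<path>[^)#\s]+\.md)(?:#(?P<anchor>[^)\s]+))?\)"
)


def migrate_refs(
    body: str, source_path: str, ids_by_path: dict[str, str]
) -> tuple[str, list[str]]:
    """Rewrite resolvable path refs to [[id:...]]; return the new body and skipped refs."""
    skipped: list[str] = []
    base = posixpath.dirname(source_path)

    def to_ref(match: re.Match) -> str:
        # Resolve the link relative to the referring file, then normalise "..".
        target = posixpath.normpath(posixpath.join(base, match["path"]))
        doc_id = ids_by_path.get(target)
        if doc_id is None:
            # Missing target or a path that escapes the root: leave the raw link.
            skipped.append(match["path"])
            return match[0]
        anchor = f"#{match['anchor']}" if match["anchor"] else ""
        return f"[[id:{doc_id}{anchor}|{match['text']}]]"

    return MD_LINK.sub(to_ref, body), skipped
```

The returned list plays the role of the SKIP-REF notes: unresolved links stay in place and are reported rather than blocking adoption.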
Why not teach check to recognize path refs directly? Because path refs break on move — which is the exact failure mode croc exists to prevent. The checker's narrow [[id:X]] dialect IS the enforcement; loosening it would defeat the purpose.
--dry-run
Every mutating command (move, rename, init --adopt, init --adopt --migrate-refs, molt) accepts --dry-run. It runs every validation and prints the plan but writes nothing.
Concepts
Frontmatter
Every managed .md file has YAML frontmatter:
---
id: registry-pattern
title: Registry pattern
kind: leaf
links:
- { to: design-index, strength: strong }
- { to: obsidian-comparison, strength: weak }
---
The body can reference other docs: [[id:design-index]] or [[see:obsidian-comparison]].
Refs support optional anchors and display text: [[id:design-index#intro|the intro]].
Required fields: id, title, kind, links.
Ref dialect: [[id:X]], [[id:X#anchor]], [[id:X|display text]], [[id:X#anchor|display text]]. Only the id is load-bearing for invariant checking; the anchor and display text are preserved for renderers and consumers.
Id grammar: [A-Za-z0-9_.-]+. UUIDs, slugs, dotted namespaces all legal. Spaces and slashes aren't.
kind: self for directory index files (self.md), leaf for everything else.
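To make the ref dialect concrete, here is a small parsing sketch built on the id grammar above. `Ref` and `parse_refs` are hypothetical names, and treating [[see:X]] as accepting the same anchor and display-text syntax as [[id:X]] is an assumption.

```python
import re
from typing import NamedTuple, Optional


class Ref(NamedTuple):
    strength: str           # "strong" for [[id:...]], "weak" for [[see:...]]
    target: str             # the id; the only part used for invariant checking
    anchor: Optional[str]   # preserved for renderers
    text: Optional[str]     # preserved for renderers


REF_PATTERN = re.compile(
    r"\[\[(?P<prefix>id|see):(?P<id>[A-Za-z0-9_.-]+)"
    r"(?:#(?P<anchor>[^|\]]+))?"
    r"(?:\|(?P<text>[^\]]+))?\]\]"
)


def parse_refs(body: str) -> list[Ref]:
    """Extract every croc ref from body text, in order of appearance."""
    return [
        Ref(
            "strong" if match["prefix"] == "id" else "weak",
            match["id"],
            match["anchor"],
            match["text"],
        )
        for match in REF_PATTERN.finditer(body)
    ]
```

An id containing a space or slash does not match the grammar, so such text is simply not recognised as a ref.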
Strong vs weak links
A strong link pins its target. If the target is deleted or renamed, the commit is refused.
links:
- { to: adr-0042, strength: strong }
A weak link cites a target without pinning it. If the target is absent, the link is silently tolerated — it's the "see also" tier.
links:
- { to: obsidian-comparison, strength: weak }
Use strong for load-bearing citations (a runbook referencing the ADR it implements). Use weak for breadcrumbs.
The five rules
croc check enforces:
1. Ownership: every .md has a unique id.
2. Schema: frontmatter has title, kind, links.
3. No dangling ref: every [[id:X]] in body text resolves to a doc.
4. Lifetime bound: strong links in frontmatter point to docs that exist.
5. Identity stable: the set of strong links declared in frontmatter equals the set of [[id:X]] in the body.
Weak links are exempt from rules 3 and 4 by design.
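A toy version of the checker helps show how the rules interact. The sketch below covers rules 1, 3, 4, and 5 over already-parsed docs (schema validation is assumed to happen at parse time); the `Doc` shape and `check` signature are hypothetical, not croc's real types.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Doc:
    id: str
    strong_links: set[str] = field(default_factory=set)  # frontmatter, strength: strong
    body_refs: set[str] = field(default_factory=set)     # [[id:X]] targets in the body


def check(docs: list[Doc]) -> list[str]:
    """Return a list of violations; an empty list means the tree is sound."""
    violations: list[str] = []
    counts = Counter(doc.id for doc in docs)
    for doc_id, count in counts.items():
        if count > 1:
            violations.append(f"ownership: id {doc_id!r} declared {count} times")
    known = set(counts)
    for doc in docs:
        for target in sorted(doc.body_refs - known):
            violations.append(f"dangling ref: {doc.id} -> {target}")
        for target in sorted(doc.strong_links - known):
            violations.append(f"lifetime: {doc.id} -> {target}")
        if doc.strong_links != doc.body_refs:
            violations.append(f"identity: {doc.id} frontmatter and body disagree")
    return violations
```

Weak links never enter either set here, which mirrors their exemption from the dangling-ref and lifetime rules.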
Where croc fits (and doesn't)
Good fits
- Engineering knowledge bases (ADRs, runbooks, postmortems that cite each other)
- LLM/agent context stores where agents read and write the tree and need integrity guarantees
- Compliance and audit trails where "the chain is unbroken" is the artifact
- Internal dev docs at 50+ engineer companies where rot is the rule
Bad fits
- Personal Zettelkasten — Obsidian's ergonomics (graph view, backlinks pane) beat a linter for daily use
- Fast-moving drafts and brainstorming — the schema is friction before content exists
- Teams that don't run CI or pre-commit hooks — the whole value is mechanical enforcement
Using croc in CI
As a pre-commit hook
Add to .pre-commit-config.yaml:
repos:
  - repo: local
    hooks:
      - id: croc-check
        name: croc check
        entry: uv run croc check path/to/docs/
        language: system
        pass_filenames: false
        files: ^path/to/docs/
Or as a plain .git/hooks/pre-commit:
#!/bin/sh
uv run croc check path/to/docs/ || exit 1
In GitHub Actions
- name: croc check
  run: |
    uv sync
    uv run croc check path/to/docs/
For contributors
Layout
croc/
├── croc/
│ ├── __init__.py
│ ├── check.py # borrow checker; pure over list[Doc]
│ └── ops.py # transformations: move, rename, init, adopt
├── main.py # Typer CLI — thin wrapper around ops
├── tests/
│ ├── conftest.py # shared fixtures (tmp_path trees)
│ ├── test_check.py # parser + five rules
│ └── test_ops.py # move, rename, init, adopt, dry-run
├── docs/design.md # full Rust-inspired rationale
├── examples/thoughts/ # canonical sample tree
└── pyproject.toml
Design principles
Separation of concerns. check.py is verification (pure, no I/O). ops.py is transformation (parse → check → plan → simulate → commit). Each raises a typed exception — TreeError for parse failures, OpError for precondition failures — so the CLI can map cleanly to exit codes.
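The exception-to-exit-code mapping might look roughly like this. The classes below are stand-ins for croc's TreeError and OpError, `exit_code_for` is a hypothetical helper, and mapping OpError to exit code 1 is an assumption; the codes documented for croc check are only 0, 1, and 2.

```python
import sys
from typing import Callable


class TreeError(Exception):
    """Stand-in for croc's parse-failure exception."""


class OpError(Exception):
    """Stand-in for croc's precondition-failure exception."""


def exit_code_for(action: Callable[[], list[str]]) -> int:
    """Run an action that returns violations and translate the outcome to an exit code."""
    try:
        violations = action()
    except TreeError as exc:
        print(f"error: {exc}", file=sys.stderr)
        return 2  # tree cannot be loaded
    except OpError as exc:
        print(f"error: {exc}", file=sys.stderr)
        return 1  # assumption: precondition failures share the violation code
    for line in violations:
        print(line, file=sys.stderr)
    return 1 if violations else 0
```

Keeping this translation in one thin layer is what lets the checker and the operations stay free of any CLI concerns.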
Validate-then-commit. No operation writes to disk until its plan has been simulated and re-checked in memory. If validation fails, nothing is half-committed; if the physical commit fails mid-sequence, a snapshot-based rollback restores the already-written files.
Newtype discipline. DocId and DocPath are distinct NewType aliases over str. The parser enforces the id grammar at the boundary, so runtime values match their declared types.
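A minimal sketch of the newtype idea, using the id grammar from the Concepts section. `parse_doc_id` is a hypothetical name for the boundary function; the real parser may be structured differently.

```python
import re
from typing import NewType

DocId = NewType("DocId", str)
DocPath = NewType("DocPath", str)

# Id grammar from the Concepts section: no spaces, no slashes.
ID_GRAMMAR = re.compile(r"[A-Za-z0-9_.-]+")


def parse_doc_id(raw: str) -> DocId:
    """Validate a raw string at the boundary and return it as a DocId."""
    if not ID_GRAMMAR.fullmatch(raw):
        raise ValueError(f"invalid id: {raw!r}")
    return DocId(raw)
```

At runtime a DocId is still a plain str; the distinction exists for the type checker, which will flag a DocPath passed where a DocId is expected.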
--dry-run is universal. Every mutating operation accepts dry_run=True and skips only the final commit step. The simulation machinery is the same either way, so dry-run and real runs exercise identical code paths.
Adding a new command
- Add the implementation to croc/ops.py. Raise OpError on precondition failures. Follow load → check → plan → simulate → commit.
- Wire the Typer command in main.py. Keep it thin: the CLI only formats output and maps exceptions to exit codes.
- Add tests in tests/test_ops.py. Use the tmp_path + write_doc/sample_tree fixtures.
- Add a --dry-run flag if the command writes.
Running tests
uv sync --group dev
uv run pytest # 62 tests, ~0.1s
uv run pytest -v # verbose
uv run pytest -k rename # filter by name
The test suite encodes the guarantees as regressions. Notable cases:
- test_failed_rename_leaves_tree_unchanged: fingerprints the tree, runs four failing renames back-to-back, asserts no file changed. Captures the validate-then-commit property.
- test_post_adopt_check_passes: adopts a fresh unmanaged tree, then runs check. Proves init --adopt produces a sound tree out of the gate.
- test_dry_run_writes_nothing (×3): fingerprint-before / dry-run / fingerprint-after, applied uniformly across move, rename, and adopt.
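The fingerprinting these tests rely on can be approximated with a content hash per file. This is a sketch of the general technique; `fingerprint` is a hypothetical helper and may not match the one in the test suite.

```python
import hashlib
from pathlib import Path


def fingerprint(root: Path) -> dict[str, str]:
    """Map every file under root to the SHA-256 of its bytes."""
    return {
        path.relative_to(root).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }
```

A test takes one fingerprint before the operation and one after, then asserts the two dicts are equal; any created, deleted, or modified file shows up as a difference.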
Known limitations
- YAML round-trip formatting. rename re-serializes frontmatter; inline flow style { to: X, strength: Y } may render as {to: X, strength: Y}. Cosmetic; swap yaml.dump for ruamel.yaml if formatting preservation matters.
- No .crocignore. Trees with vendored READMEs or generated files need init --adopt pointed at a subdirectory.
- Symlinked subtrees are not traversed. scan_symlinks emits warnings; the user decides whether to follow.
Further reading
- docs/design.md: full design rationale and the Rust-to-croc mapping.