Python reference implementation of the markstay spec (v1.1): source-level block identity for Markdown
markstay: Python reference implementation (v1 core)
The Python reference implementation of the markstay spec
(v1.1). markstay is a source-level identity primitive for Markdown blocks: an id
token that stays bound to its block across edits (marker stay:), so a
reference to a block survives the document being rewritten, including by an LLM.
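To make the marker shape concrete, here is a minimal sketch that locates `stay:` markers with a regular expression. It is not the library's grammar (the spec's §3/§4 define that): the id character class and the `key=value` attribute syntax in `MARKER_RE` are assumptions inferred from the two marker forms shown on this page, and `find_stay_markers` is a hypothetical helper, not part of the package.

```python
import re

# ASSUMED shape: <!-- stay:ID [key=value ...] -->
# The authoritative grammar lives in the spec (§3/§4), not in this pattern.
MARKER_RE = re.compile(
    r"<!--\s*stay:(?P<id>[A-Za-z0-9_-]+)(?P<attrs>(?:\s+[^\s=]+=\S+)*)\s*-->"
)

def find_stay_markers(text):
    """Return (id, attrs, line_number) for every marker-like comment in text."""
    found = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for m in MARKER_RE.finditer(line):
            # Split "hash=sha256:..." style pairs on the first "=" only.
            attrs = dict(pair.split("=", 1) for pair in m.group("attrs").split())
            found.append((m.group("id"), attrs, lineno))
    return found
```

Because the id lives in the source text itself, a tool that rewrites the document only has to leave the comment next to its block for references to keep working.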
This is the parser-free core: everything string-level and parser-independent
(§8 hashing, §3/§4 marker grammar, §5 blank-line segmentation, §6 id minting, the
§3/§4/§7/§8 write path, §7/§11 lint, §9 quote recovery, §9.1 resolution ladder).
It mirrors the JavaScript reference
(markstay on npm); both are gated by a
shared language-neutral conformance corpus, which turns "two implementations
agree" from an assertion into a tested fact.
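As a rough picture of what "string-level" means here, blank-line segmentation can be done without any Markdown parser. The sketch below is an illustration under simplifying assumptions, not the §5 algorithm: it ignores fenced code blocks and any other construct the spec may treat specially, and `split_blank_line_blocks` is a hypothetical name.

```python
def split_blank_line_blocks(text):
    """Group consecutive non-blank lines into blocks.

    Returns a list of (start_line, block_text) pairs, with 1-based line numbers.
    """
    blocks, current, start = [], [], None
    for lineno, line in enumerate(text.splitlines(), start=1):
        if line.strip():
            if not current:
                start = lineno  # first line of a new block
            current.append(line)
        elif current:
            # A blank line closes the block in progress.
            blocks.append((start, "\n".join(current)))
            current = []
    if current:
        blocks.append((start, "\n".join(current)))
    return blocks
```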
Install
pip install markstay
Zero runtime dependencies (Python standard library only). CommonMark-tree segmentation (§5.2) is an optional extra:
pip install "markstay[commonmark]" # pulls in markdown-it-py
Requires Python >= 3.9.
Library
import markstay as M
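md = "The ingest stage retries three times.\n<!-- stay:a1b2 -->\n"
# placeholder inputs used by the calls below (illustrative values only)
before_md = md
after_md = "The ingest stage retries up to three times.\n<!-- stay:a1b2 -->\n"
edited_md = after_md
copied_md = md + "\n" + md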
# parse into content blocks with attached markers (§5)
blocks = M.parse_document(md)
# well-formedness + intra-doc invariants (§7): duplicate/orphan/malformed/drift
_, findings = M.lint_document(md)
# regeneration diff (§11): what an edit did to the ids (dropped/duplicated/moved)
findings = M.lint_diff(before_md, after_md)
# §8 content hash (ASCII-normalized SHA-256)
M.body_hash("some block body")
# §9.1 resolution ladder: re-attach ids after an edit, or report DETACHED
anchors = M.build_anchors(before_md)
resolutions = M.resolve(anchors, after_md) # id -> marker | hash | quote | detached
# write path: mint ids for unmarked blocks (§6), append the §3.1 trailing marker
res = M.stamp("First paragraph.\n\nSecond paragraph.\n")
res.text # each block now carries <!-- stay:ID hash=sha256:... -->
res.minted # [{"id": ..., "line": ...}, ...]
# refresh a hash you edited on purpose (§8); repair duplicate ids (§7, copy mints new)
M.restamp(edited_md) # -> RestampResult(text, refreshed)
M.repair_duplicates(copied_md) # -> RepairResult(text, renamed)
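The idea behind a normalized content hash is that cosmetic differences should not change the digest. The sketch below shows one way that could work. The normalization steps (NFKD decomposition, dropping non-ASCII characters, collapsing whitespace) are assumptions chosen for illustration and will not necessarily match the §8 rules or the output of `M.body_hash`; `sketch_body_hash` is a hypothetical function.

```python
import hashlib
import unicodedata

def sketch_body_hash(body, length=None):
    """Hash a block body after a simple, ASSUMED normalization (not spec §8)."""
    # Decompose accented characters, then drop anything outside ASCII.
    text = unicodedata.normalize("NFKD", body)
    text = text.encode("ascii", "ignore").decode("ascii")
    # Collapse all whitespace runs so reflowed text hashes the same.
    text = " ".join(text.split())
    digest = hashlib.sha256(text.encode("ascii")).hexdigest()
    # Optionally truncate the hex digest, keeping the "sha256:" prefix.
    return "sha256:" + (digest[:length] if length else digest)
```

With this sketch, a re-wrapped paragraph keeps its hash, while a changed word produces a different one, which is the kind of drift a lint step can then report.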
Public API (mirrors the JS index.js surface): normalize_body, body_hash,
Marker, find_markers, strip_markers, rewrite_markers,
segment_blank_line, segment_commonmark, Block, parse_document, Finding,
lint_document, lint_diff, sort_findings, has_errors, mint_id,
ID_CHARSET, format_marker, format_attr_value, stamp, restamp,
repair_duplicates, DEFAULT_HASH_LENGTH, Selector, normalize,
body_score, context_bonus, best_match, CONTEXT_CHARS, Anchor,
Resolution, build_anchors, resolve, DEFAULT_THRESHOLD, DEFAULT_MARGIN.
CLI
markstay lint FILE [FILE ...] # well-formedness + intra-doc checks
markstay lint --before OLD.md NEW # regeneration diff (dropped/duplicated/relocated ids)
markstay lint --json ... # machine-readable findings
markstay lint --commonmark ... # §5.2 CommonMark-tree segmentation (needs the extra)
markstay stamp FILE... [-w] # mint ids for unmarked blocks (§6)
markstay restamp FILE... [-w] # refresh hashes that drifted (§8)
markstay repair FILE... [-w] # mint fresh ids for duplicate ids (§7)
lint exits non-zero when any error-level finding is reported, so it gates a
commit hook or an agent's post-edit step. The write verbs print the result to
stdout by default; -w/--write edits files in place.
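Since `lint` signals error-level findings through its exit code, a hook only needs to forward that code. A minimal sketch of such a gate follows. The `lint_gate` helper, the `.md` filename filter, and the injectable `run` parameter (there to make the function testable) are assumptions for illustration; only the `markstay lint FILE [FILE ...]` invocation comes from the CLI listing above.

```python
import subprocess

def lint_gate(files, run=subprocess.run):
    """Run `markstay lint` on the Markdown files given; return its exit code."""
    # ASSUMPTION: only files ending in .md are worth linting in this hook.
    md_files = [f for f in files if f.endswith(".md")]
    if not md_files:
        return 0  # nothing to check, so do not block the commit
    result = run(["markstay", "lint", *md_files])
    # Non-zero means at least one error-level finding was reported.
    return result.returncode
```

A hook script could end with `sys.exit(lint_gate(sys.argv[1:]))`, so the commit or agent step fails exactly when `lint` does.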
The conformance corpus (the actual deliverable)
The corpus under conformance/ is shared with the JavaScript
reference. 290 vectors across two tiers:
- spec/: hand-authored from the spec prose, asserting what the words require. These are authoritative; a spec/ vector the reference fails is a reference bug, not a corpus error.
- gen/: emitted from the reference for breadth and regression coverage.
The JS reference runs the same JSON, so the two runners are a cross-impl regression sentinel: any later change to either implementation that breaks agreement fails one of them.
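The value of a language-neutral corpus is that each runner is little more than a loop over the same JSON. The sketch below shows that shape. The vector schema (`name`, `op`, `input`, `expected`) and the `run_vectors` helper are hypothetical; the actual layout of the files under conformance/ is not described on this page.

```python
import json

def run_vectors(vectors_json, implementations):
    """Return the names of vectors whose actual output differs from expected.

    vectors_json: JSON text holding a list of vectors (HYPOTHETICAL schema).
    implementations: mapping from an "op" name to the function under test.
    """
    failures = []
    for vec in json.loads(vectors_json):
        fn = implementations[vec["op"]]
        if fn(vec["input"]) != vec["expected"]:
            failures.append(vec["name"])
    return failures
```

A second implementation in another language would read the same file and apply the same comparison, so a disagreement between the two shows up as a failing vector in at least one runner.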
Running the tests
pip install -e ".[commonmark]"
pytest
License
MIT