Skip to main content

A YAML round-trip library that preserves comments and insertion order

Project description

yarutsk

A Python YAML library that round-trips documents while preserving comments, insertion order, scalar styles, tags, anchors and aliases, blank lines, and explicit document markers.

What it does

Most YAML libraries silently drop comments on load. yarutsk keeps them attached to their keys — both inline (key: value # like this) and block-level (# above a key) — so a load → modify → dump cycle leaves the rest of the file intact.

import io
import yarutsk

doc = yarutsk.load(io.StringIO("""
# database config
host: localhost  # primary
port: 5432
"""))

doc["port"] = 5433

out = io.StringIO()
yarutsk.dump(doc, out)
print(out.getvalue())
# # database config
# host: localhost  # primary
# port: 5433

YamlMapping is a subclass of dict and YamlSequence is a subclass of list, so they work everywhere a dict or list is expected:

import json

doc = yarutsk.loads("name: Alice\nscores: [10, 20, 30]")

isinstance(doc, dict)           # True
isinstance(doc["scores"], list) # True
json.dumps(doc)                 # '{"name": "Alice", "scores": [10, 20, 30]}'

Round-trip fidelity

yarutsk reproduces the source text exactly for everything it understands. A loads followed by dumps gives back the original string byte-for-byte in the common case:

src = """\
defaults: &base
  timeout: 30
  retries: 3

service:
  name: api
  config: *base
"""
assert yarutsk.dumps(yarutsk.loads(src)) == src

Specifically preserved:

  • Scalar styles — plain, 'single-quoted', "double-quoted", literal block |, folded block >
  • Non-canonical scalarsyes/no/on/off, ~, Null, True/False, 0xFF, 0o77 — reproduced as written, not re-canonicalised to true/false/null/255
  • YAML tags!!str, !!python/tuple, and any custom tag are emitted back verbatim
  • Anchors and aliases&name on the anchor node and *name for references are preserved; the Python layer returns the resolved value transparently
  • Blank lines between mapping entries and sequence items
  • Explicit document markers--- and ...

Installation

Built with Maturin. From the repo root:

pip install maturin
maturin develop

API

Loading and dumping

# Load from stream (StringIO / BytesIO)
doc  = yarutsk.load(stream)            # first document
docs = yarutsk.load_all(stream)        # all documents as a list

# Load from string
doc  = yarutsk.loads(text)
docs = yarutsk.loads_all(text)

# Dump to stream
yarutsk.dump(doc, stream)
yarutsk.dump_all(docs, stream)

# Dump to string
text = yarutsk.dumps(doc)
text = yarutsk.dumps_all(docs)

load / loads return a YamlMapping, YamlSequence, YamlScalar, or None (for empty input). Nested nodes are also YamlMapping or YamlSequence; scalar leaves are returned as native Python primitives (int, float, bool, str, or None).

YamlScalar

Top-level scalar documents are wrapped in a YamlScalar node:

doc = yarutsk.loads("42")
doc.value                              # 42 (Python int)
doc.to_dict()                          # same as .value

# Scalar style
doc = yarutsk.loads("---\n'hello'\n")
doc.style                              # 'single'
doc.style = "double"                   # 'plain'|'single'|'double'|'literal'|'folded'

# YAML tag
doc = yarutsk.loads("!!str 42")
doc.get_tag()                          # '!!str'
doc.set_tag(None)                      # clear tag

# Explicit document markers
doc = yarutsk.loads("---\n42\n...")
doc.explicit_start                     # True
doc.explicit_end                       # True
doc.explicit_start = False
doc.explicit_end   = False

YamlMapping

YamlMapping is a subclass of dict with insertion-ordered keys. All standard dict operations work directly:

# Standard dict interface (inherited)
doc["key"]                             # get (KeyError if missing)
doc["key"] = value                     # set (preserves position if key exists)
del doc["key"]                         # delete
"key" in doc                           # membership test
len(doc)                               # number of entries
for key in doc: ...                    # iterate over keys in order
doc.keys()                             # KeysView in insertion order
doc.values()                           # ValuesView in insertion order
doc.items()                            # ItemsView of (key, value) pairs
doc.get("key")                         # returns None if missing
doc.get("key", default)                # returns default if missing
doc.pop("key")                         # remove & return (KeyError if missing)
doc.pop("key", default)                # remove & return, or default
doc.setdefault("key", default)         # get or insert default
doc.update(other)                      # merge from dict or YamlMapping
doc == {"a": 1}                        # equality comparison

# Works with any dict-expecting library
isinstance(doc, dict)                  # True
json.dumps(doc)                        # works

# Conversion
doc.to_dict()                          # deep conversion to plain Python dict

# Comments
doc.get_comment_inline("key")          # -> str | None
doc.get_comment_before("key")          # -> str | None
doc.set_comment_inline("key", text)
doc.set_comment_before("key", text)

# YAML tag
doc.get_tag()                          # -> str | None  (e.g. '!!python/object:Foo')
doc.set_tag("!!map")

# Explicit document markers
doc.explicit_start                     # bool
doc.explicit_end                       # bool
doc.explicit_start = True
doc.explicit_end   = True

# Node access — returns YamlScalar/YamlMapping/YamlSequence preserving style/tag/anchor
node = doc.get_node("key")            # KeyError if absent

# Scalar style shortcut (equivalent to: doc.get_node("key").style = "single")
doc.set_scalar_style("key", "single") # 'plain'|'single'|'double'|'literal'|'folded'

# Sorting
doc.sort_keys()                        # alphabetical, in-place
doc.sort_keys(reverse=True)            # reverse alphabetical
doc.sort_keys(key=lambda k: len(k))    # custom key function on key strings
doc.sort_keys(recursive=True)          # also sort all nested mappings

YamlSequence

YamlSequence is a subclass of list. All standard list operations work directly:

# Standard list interface (inherited)
doc[0]                                 # get by index (negative indices supported)
doc[0] = value                         # set by index
del doc[0]                             # delete by index
value in doc                           # membership test
len(doc)                               # number of items
for item in doc: ...                   # iterate over items
doc.append(value)                      # add to end
doc.insert(idx, value)                 # insert before index
doc.pop()                              # remove & return last item
doc.pop(idx)                           # remove & return item at index
doc.remove(value)                      # remove first occurrence (ValueError if missing)
doc.extend(iterable)                   # append items from list or YamlSequence
doc.index(value)                       # index of first occurrence
doc.count(value)                       # number of occurrences
doc.reverse()                          # reverse in-place
doc == [1, 2, 3]                       # equality comparison

# Works with any list-expecting library
isinstance(doc, list)                  # True
json.dumps(doc)                        # works

# Conversion
doc.to_dict()                          # deep conversion to plain Python list

# Comments (addressed by integer index)
doc.get_comment_inline(idx)            # -> str | None
doc.get_comment_before(idx)            # -> str | None
doc.set_comment_inline(idx, text)
doc.set_comment_before(idx, text)

# YAML tag
doc.get_tag()                          # -> str | None  (e.g. '!!python/tuple')
doc.set_tag(None)

# Explicit document markers
doc.explicit_start                     # bool
doc.explicit_end                       # bool
doc.explicit_start = True
doc.explicit_end   = True

# Sorting (preserves comment metadata)
doc.sort()                             # natural order, in-place
doc.sort(reverse=True)
doc.sort(key=lambda v: len(v))         # custom key function on item values

Sorting preserves all comments — each entry or item carries its inline and before-key comments with it when reordered.

Benchmarks

Compare load, dump, and round-trip performance against PyYAML and ruamel.yaml across small, medium, and large inputs:

uv sync --group benchmark
uv run maturin develop --release
uv run pytest benchmarks/ -v --benchmark-sort=name

Running tests

You need Rust 1.85+ and Python 3.12+ with uv. Python 3.12 is the minimum — YamlSequence subclasses list, which requires PyO3's extends = PyList support introduced in Python 3.12.

# 1. Clone with the yaml-test-suite submodule
git clone --recurse-submodules https://github.com/theyugin/yarutsk
cd yarutsk

# 2. Create a virtual environment and install dev dependencies
uv sync --group dev

# 3. Build the extension in dev (debug) mode
uv run maturin develop

# 4. Run the suites
uv run pytest tests/ --ignore=tests/test_yaml_suite.py -v  # core library tests
uv run pytest tests/test_yaml_suite.py -q                   # yaml-test-suite compliance

test_yaml_suite.py requires the yaml-test-suite submodule. Tests that fail due to known YAML normalisation differences are marked xfail and do not count as failures.

Internals

The scanner and parser are vendored from yaml-rust2 (MIT licensed) with one targeted modification: the comment-skipping loop in the scanner now emits Comment tokens instead of discarding them. Everything else — block/flow parsing, scalar type coercion, multi-document support — comes from yaml-rust2 unchanged. The builder layer wires those tokens to the data model, and a hand-written block-style emitter serialises it back out.

YamlMapping and YamlSequence are PyO3 pyclasses that extend Python's built-in dict and list types. A Rust inner field stores the full YAML data model (including comments); the parent dict/list is kept in sync on every mutation so that all standard Python operations work transparently.

Disclaimer

This library was created with Claude Code (Anthropic). The design, implementation, tests, and this README were written by Claude under human direction.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

yarutsk-0.1.0.tar.gz (115.5 kB view details)

Uploaded Source

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

yarutsk-0.1.0-cp314-cp314-win_amd64.whl (370.7 kB view details)

Uploaded CPython 3.14Windows x86-64

yarutsk-0.1.0-cp314-cp314-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (481.6 kB view details)

Uploaded CPython 3.14manylinux: glibc 2.17+ x86-64

yarutsk-0.1.0-cp314-cp314-macosx_10_12_x86_64.macosx_11_0_arm64.macosx_10_12_universal2.whl (908.9 kB view details)

Uploaded CPython 3.14macOS 10.12+ universal2 (ARM64, x86-64)macOS 10.12+ x86-64macOS 11.0+ ARM64

yarutsk-0.1.0-cp313-cp313-win_amd64.whl (371.8 kB view details)

Uploaded CPython 3.13Windows x86-64

yarutsk-0.1.0-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (482.9 kB view details)

Uploaded CPython 3.13manylinux: glibc 2.17+ x86-64

yarutsk-0.1.0-cp313-cp313-macosx_10_12_x86_64.macosx_11_0_arm64.macosx_10_12_universal2.whl (911.2 kB view details)

Uploaded CPython 3.13macOS 10.12+ universal2 (ARM64, x86-64)macOS 10.12+ x86-64macOS 11.0+ ARM64

yarutsk-0.1.0-cp312-cp312-win_amd64.whl (372.1 kB view details)

Uploaded CPython 3.12Windows x86-64

yarutsk-0.1.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (483.9 kB view details)

Uploaded CPython 3.12manylinux: glibc 2.17+ x86-64

yarutsk-0.1.0-cp312-cp312-macosx_10_12_x86_64.macosx_11_0_arm64.macosx_10_12_universal2.whl (911.7 kB view details)

Uploaded CPython 3.12macOS 10.12+ universal2 (ARM64, x86-64)macOS 10.12+ x86-64macOS 11.0+ ARM64

File details

Details for the file yarutsk-0.1.0.tar.gz.

File metadata

  • Download URL: yarutsk-0.1.0.tar.gz
  • Upload date:
  • Size: 115.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for yarutsk-0.1.0.tar.gz
Algorithm Hash digest
SHA256 33d1d1de7c54cc14a9d6121c8832ccf41899d93c990cf6f3724af0bc0eba9efd
MD5 505723694284df744964b9eaf7872d1e
BLAKE2b-256 d2e4c017b3a3d292443a7f4d6e80cc8b4f4ff5e8e999e219f097a822901afe8f

See more details on using hashes here.

Provenance

The following attestation bundles were made for yarutsk-0.1.0.tar.gz:

Publisher: ci.yml on theyugin/yarutsk

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file yarutsk-0.1.0-cp314-cp314-win_amd64.whl.

File metadata

  • Download URL: yarutsk-0.1.0-cp314-cp314-win_amd64.whl
  • Upload date:
  • Size: 370.7 kB
  • Tags: CPython 3.14, Windows x86-64
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for yarutsk-0.1.0-cp314-cp314-win_amd64.whl
Algorithm Hash digest
SHA256 dabd7646b8181c9df1b2c4d12ff02afb9cc996a8e75d7de3cad8fb7ac79ba322
MD5 791df96592b1cb5327fcbb41ad2b3d78
BLAKE2b-256 b4017b4e701588d9f55bec920832646b5dfb64df08696240e2f5dde244a3bcca

See more details on using hashes here.

Provenance

The following attestation bundles were made for yarutsk-0.1.0-cp314-cp314-win_amd64.whl:

Publisher: ci.yml on theyugin/yarutsk

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file yarutsk-0.1.0-cp314-cp314-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for yarutsk-0.1.0-cp314-cp314-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 f76842eecdca7ae0331dffdf372bd4a90955dad37acedb12fa4c2bd7dbbbdafa
MD5 669684a91da988515f8918da32dcda34
BLAKE2b-256 ec18c669c9d30c8ab1193cf8a02c354f0aa8586e9fb2d47aff694d242bd4642d

See more details on using hashes here.

Provenance

The following attestation bundles were made for yarutsk-0.1.0-cp314-cp314-manylinux_2_17_x86_64.manylinux2014_x86_64.whl:

Publisher: ci.yml on theyugin/yarutsk

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file yarutsk-0.1.0-cp314-cp314-macosx_10_12_x86_64.macosx_11_0_arm64.macosx_10_12_universal2.whl.

File metadata

File hashes

Hashes for yarutsk-0.1.0-cp314-cp314-macosx_10_12_x86_64.macosx_11_0_arm64.macosx_10_12_universal2.whl
Algorithm Hash digest
SHA256 256622013ca88a6322c3069cfefb3ec67d8605c51f3062b5948447e2b0ba63d1
MD5 8b4ca60582bdfd47ec4cc6b6f9c42355
BLAKE2b-256 18815027748d31018dde2921c1f8abf430204d2c81cd788848a10a9dd2dfff3b

See more details on using hashes here.

Provenance

The following attestation bundles were made for yarutsk-0.1.0-cp314-cp314-macosx_10_12_x86_64.macosx_11_0_arm64.macosx_10_12_universal2.whl:

Publisher: ci.yml on theyugin/yarutsk

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file yarutsk-0.1.0-cp313-cp313-win_amd64.whl.

File metadata

  • Download URL: yarutsk-0.1.0-cp313-cp313-win_amd64.whl
  • Upload date:
  • Size: 371.8 kB
  • Tags: CPython 3.13, Windows x86-64
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for yarutsk-0.1.0-cp313-cp313-win_amd64.whl
Algorithm Hash digest
SHA256 02dec2aa30b7deca46a058142ce3d486a80efc11c36f7dab07f585ab9989eb27
MD5 b6d64e5b028f2baf8df6476fec3e4680
BLAKE2b-256 35efd08a59fe908e360d8e9f62ac17a02eee773b2a13e81b740a3596dcc99a53

See more details on using hashes here.

Provenance

The following attestation bundles were made for yarutsk-0.1.0-cp313-cp313-win_amd64.whl:

Publisher: ci.yml on theyugin/yarutsk

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file yarutsk-0.1.0-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for yarutsk-0.1.0-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 cabb72f21d173ecf9dee7eef4f14163a79495b69c46c57129fb8c0299451a567
MD5 7923e835559924fb0f32f83ffee7845a
BLAKE2b-256 fc9985c95165d624ca40b66501f0113ab8d120409ee92be00e6a06e9ee9148a1

See more details on using hashes here.

Provenance

The following attestation bundles were made for yarutsk-0.1.0-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl:

Publisher: ci.yml on theyugin/yarutsk

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file yarutsk-0.1.0-cp313-cp313-macosx_10_12_x86_64.macosx_11_0_arm64.macosx_10_12_universal2.whl.

File metadata

File hashes

Hashes for yarutsk-0.1.0-cp313-cp313-macosx_10_12_x86_64.macosx_11_0_arm64.macosx_10_12_universal2.whl
Algorithm Hash digest
SHA256 226a9ec59e82a4f8b1d3b7f3d7299c81dede60f75cd54f7dfb03081c3c1dcfdd
MD5 d0f3a6637227beb19da33dbe4c963032
BLAKE2b-256 4629033c191b25205abda2837e6e3d91749939cb5c16c14d6beab3729a0fbb98

See more details on using hashes here.

Provenance

The following attestation bundles were made for yarutsk-0.1.0-cp313-cp313-macosx_10_12_x86_64.macosx_11_0_arm64.macosx_10_12_universal2.whl:

Publisher: ci.yml on theyugin/yarutsk

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file yarutsk-0.1.0-cp312-cp312-win_amd64.whl.

File metadata

  • Download URL: yarutsk-0.1.0-cp312-cp312-win_amd64.whl
  • Upload date:
  • Size: 372.1 kB
  • Tags: CPython 3.12, Windows x86-64
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for yarutsk-0.1.0-cp312-cp312-win_amd64.whl
Algorithm Hash digest
SHA256 f2891eda55b67cdc4868f85116140b3f307d75b298e562b6dc9f65a5705e6ad7
MD5 679dda10743abf5dde876b636eb8ce78
BLAKE2b-256 8849758f21db9a261fe83fd645f4a130fe848e78063b862627c87fc4b4ddb9a3

See more details on using hashes here.

Provenance

The following attestation bundles were made for yarutsk-0.1.0-cp312-cp312-win_amd64.whl:

Publisher: ci.yml on theyugin/yarutsk

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file yarutsk-0.1.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for yarutsk-0.1.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 87e07e4d3f700fdaa4806f81b04a00a98dbc69a1e5695932830af33c0e55ad81
MD5 e3ce5b7ce5af072d4a062d13672a986e
BLAKE2b-256 d9f28b567186b0676d5e2dc04015108affd9d1c19d7cb6b685c2f5f884c999b4

See more details on using hashes here.

Provenance

The following attestation bundles were made for yarutsk-0.1.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl:

Publisher: ci.yml on theyugin/yarutsk

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file yarutsk-0.1.0-cp312-cp312-macosx_10_12_x86_64.macosx_11_0_arm64.macosx_10_12_universal2.whl.

File metadata

File hashes

Hashes for yarutsk-0.1.0-cp312-cp312-macosx_10_12_x86_64.macosx_11_0_arm64.macosx_10_12_universal2.whl
Algorithm Hash digest
SHA256 fb5b9ee990dbe9ce1d000ae1a7508505f11865cbdebe7ab033037d522eaf53c0
MD5 ad541cbc858bfe47126084ac5b1827f0
BLAKE2b-256 7f04db0d59d59247683e7f33bfee99eb4c48c2237dac4dc9b7553f47d2c2dd05

See more details on using hashes here.

Provenance

The following attestation bundles were made for yarutsk-0.1.0-cp312-cp312-macosx_10_12_x86_64.macosx_11_0_arm64.macosx_10_12_universal2.whl:

Publisher: ci.yml on theyugin/yarutsk

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page