A YAML round-trip library that preserves comments and insertion order

yarutsk

A Python YAML library that round-trips documents while preserving comments, insertion order, scalar styles, tags, anchors and aliases, blank lines, and explicit document markers.

What it does

Most YAML libraries silently drop comments on load. yarutsk keeps them attached to their keys — both inline (key: value # like this) and block-level (# above a key) — so a load → modify → dump cycle leaves the rest of the file intact.

import io
import yarutsk

doc = yarutsk.load(io.StringIO("""
# database config
host: localhost  # primary
port: 5432
"""))

doc["port"] = 5433

out = io.StringIO()
yarutsk.dump(doc, out)
print(out.getvalue())
# # database config
# host: localhost  # primary
# port: 5433

YamlMapping is a subclass of dict and YamlSequence is a subclass of list, so they work everywhere a dict or list is expected:

import json

doc = yarutsk.loads("name: Alice\nscores: [10, 20, 30]")

isinstance(doc, dict)           # True
isinstance(doc["scores"], list) # True
json.dumps(doc)                 # '{"name": "Alice", "scores": [10, 20, 30]}'
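
Because a loaded document passes isinstance(doc, dict), helpers written against plain dicts should need no adaptation. The sketch below uses set_path, a hypothetical helper that is not part of yarutsk, and demonstrates it on a plain dict so that it runs without the library installed. Passing the result of yarutsk.loads(...) instead is expected to behave the same way, given the subclassing described above.

```python
def set_path(doc: dict, dotted_key: str, value) -> None:
    """Set doc["a"]["b"]["c"] = value for dotted_key "a.b.c", in place."""
    *parents, leaf = dotted_key.split(".")
    node = doc
    for name in parents:
        child = node.get(name)
        if not isinstance(child, dict):
            raise KeyError(f"{name!r} is not a mapping in {dotted_key!r}")
        node = child
    node[leaf] = value


# A plain dict stands in for a loaded document here.
config = {"service": {"name": "api", "timeout": 30}}
set_path(config, "service.timeout", 60)
print(config)  # {'service': {'name': 'api', 'timeout': 60}}
```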

Round-trip fidelity

yarutsk reproduces the source text exactly for everything it understands. A loads followed by dumps gives back the original string byte-for-byte in the common case:

src = """\
defaults: &base
  timeout: 30
  retries: 3

service:
  name: api
  config: *base
"""
assert yarutsk.dumps(yarutsk.loads(src)) == src
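
Since exact reproduction is promised only "in the common case", it can help to see where a particular file differs after a round trip. The following is a minimal sketch, not part of yarutsk: round_trip_diff is a hypothetical helper that takes the loader and dumper as arguments, and the stand-in pair below exists only so the example runs without the library.

```python
import difflib
from typing import Callable


def round_trip_diff(src: str, loads: Callable, dumps: Callable) -> str:
    """Return a unified diff of src vs dumps(loads(src)); '' if identical."""
    out = dumps(loads(src))
    diff = difflib.unified_diff(
        src.splitlines(keepends=True),
        out.splitlines(keepends=True),
        fromfile="source",
        tofile="round-tripped",
    )
    return "".join(diff)


# Stand-in dumper that strips trailing whitespace, to provoke a difference.
def lossy_dumps(lines):
    return "\n".join(line.rstrip() for line in lines) + "\n"


print(round_trip_diff("a: 1   \nb: 2\n", str.splitlines, lossy_dumps))
```

With the library installed, the call would presumably be round_trip_diff(src, yarutsk.loads, yarutsk.dumps); an empty string then means the file survived byte-for-byte.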

Specifically preserved:

  • Scalar styles: plain, 'single-quoted', "double-quoted", literal block |, folded block >
  • Non-canonical scalars: yes/no/on/off, ~, Null, True/False, 0xFF, 0o77 are reproduced as written, not re-canonicalised to true/false/null/255
  • YAML tags: !!str, !!python/tuple, and any custom tag are emitted back verbatim
  • Anchors and aliases: &name on the anchor node and *name for references are preserved; the Python layer returns the resolved value transparently
  • Blank lines between mapping entries and sequence items
  • Explicit document markers: --- and ...
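
One way to picture how a non-canonical scalar can be reproduced as written while Python code still sees a native value is to store both. The sketch below is an illustration in pure Python under that assumption, not yarutsk's actual (Rust) implementation; Scalar, parse_scalar, and emit_scalar are hypothetical names, and the recognised spellings are a deliberately small subset.

```python
from dataclasses import dataclass

# A small, illustrative subset of YAML 1.1-style spellings.
_BOOLS = {"yes": True, "no": False, "on": True, "off": False,
          "true": True, "false": False}
_NULLS = {"~", "null", ""}


@dataclass
class Scalar:
    value: object  # what Python code sees
    source: str    # exact text from the document, emitted back unchanged


def parse_scalar(text: str) -> Scalar:
    lowered = text.lower()
    if lowered in _BOOLS:
        return Scalar(_BOOLS[lowered], text)
    if lowered in _NULLS:
        return Scalar(None, text)
    try:
        # Base 0 accepts prefixed forms such as 0xFF and 0o77.
        return Scalar(int(text, 0), text)
    except ValueError:
        return Scalar(text, text)


def emit_scalar(scalar: Scalar) -> str:
    # Emit the original spelling rather than a canonical form of the value.
    return scalar.source


s = parse_scalar("0xFF")
print(s.value, emit_scalar(s))  # 255 0xFF
```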

Installation

Built with Maturin. From the repo root:

pip install maturin
maturin develop

API

Loading and dumping

# Load from stream (StringIO / BytesIO)
doc  = yarutsk.load(stream)            # first document
docs = yarutsk.load_all(stream)        # all documents as a list

# Load from string
doc  = yarutsk.loads(text)
docs = yarutsk.loads_all(text)

# Dump to stream
yarutsk.dump(doc, stream)
yarutsk.dump_all(docs, stream)

# Dump to string
text = yarutsk.dumps(doc)
text = yarutsk.dumps_all(docs)

load / loads return a YamlMapping, YamlSequence, YamlScalar, or None (for empty input). Nested nodes are also YamlMapping or YamlSequence; scalar leaves are returned as native Python primitives (int, float, bool, str, or None).
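
Callers that only want plain Python data have four possible return shapes to handle. A minimal sketch of one way to normalise them follows; to_plain is a hypothetical helper, and it relies only on the to_dict() method documented for each node type in the sections below. FakeScalar is a stand-in so the example runs without yarutsk installed.

```python
def to_plain(doc):
    """Convert whatever a load call returned into plain Python data."""
    if doc is None:  # empty input
        return None
    to_dict = getattr(doc, "to_dict", None)
    if callable(to_dict):  # YamlMapping, YamlSequence, YamlScalar
        return to_dict()
    return doc  # already a native value


class FakeScalar:
    """Stand-in for a top-level scalar node, used only to exercise the sketch."""

    def __init__(self, value):
        self.value = value

    def to_dict(self):
        return self.value


print(to_plain(None))            # None
print(to_plain(FakeScalar(42)))  # 42
print(to_plain({"a": 1}))        # {'a': 1}
```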

YamlScalar

Top-level scalar documents are wrapped in a YamlScalar node:

doc = yarutsk.loads("42")
doc.value                              # 42 (Python int)
doc.to_dict()                          # same as .value

# Scalar style
doc = yarutsk.loads("---\n'hello'\n")
doc.style                              # 'single'
doc.style = "double"                   # 'plain'|'single'|'double'|'literal'|'folded'

# YAML tag
doc = yarutsk.loads("!!str 42")
doc.get_tag()                          # '!!str'
doc.set_tag(None)                      # clear tag

# Explicit document markers
doc = yarutsk.loads("---\n42\n...")
doc.explicit_start                     # True
doc.explicit_end                       # True
doc.explicit_start = False
doc.explicit_end   = False

YamlMapping

YamlMapping is a subclass of dict with insertion-ordered keys. All standard dict operations work directly:

# Standard dict interface (inherited)
doc["key"]                             # get (KeyError if missing)
doc["key"] = value                     # set (preserves position if key exists)
del doc["key"]                         # delete
"key" in doc                           # membership test
len(doc)                               # number of entries
for key in doc: ...                    # iterate over keys in order
doc.keys()                             # KeysView in insertion order
doc.values()                           # ValuesView in insertion order
doc.items()                            # ItemsView of (key, value) pairs
doc.get("key")                         # returns None if missing
doc.get("key", default)                # returns default if missing
doc.pop("key")                         # remove & return (KeyError if missing)
doc.pop("key", default)                # remove & return, or default
doc.setdefault("key", default)         # get or insert default
doc.update(other)                      # merge from dict or YamlMapping
doc == {"a": 1}                        # equality comparison

# Works with any dict-expecting library
isinstance(doc, dict)                  # True
json.dumps(doc)                        # works

# Conversion
doc.to_dict()                          # deep conversion to plain Python dict

# Comments
doc.get_comment_inline("key")          # -> str | None
doc.get_comment_before("key")          # -> str | None
doc.set_comment_inline("key", text)
doc.set_comment_before("key", text)

# YAML tag
doc.get_tag()                          # -> str | None  (e.g. '!!python/object:Foo')
doc.set_tag("!!map")

# Explicit document markers
doc.explicit_start                     # bool
doc.explicit_end                       # bool
doc.explicit_start = True
doc.explicit_end   = True

# Node access — returns YamlScalar/YamlMapping/YamlSequence preserving style/tag/anchor
node = doc.get_node("key")            # KeyError if absent

# Scalar style shortcut (equivalent to: doc.get_node("key").style = "single")
doc.set_scalar_style("key", "single") # 'plain'|'single'|'double'|'literal'|'folded'

# Sorting
doc.sort_keys()                        # alphabetical, in-place
doc.sort_keys(reverse=True)            # reverse alphabetical
doc.sort_keys(key=lambda k: len(k))    # custom key function on key strings
doc.sort_keys(recursive=True)          # also sort all nested mappings

YamlSequence

YamlSequence is a subclass of list. All standard list operations work directly:

# Standard list interface (inherited)
doc[0]                                 # get by index (negative indices supported)
doc[0] = value                         # set by index
del doc[0]                             # delete by index
value in doc                           # membership test
len(doc)                               # number of items
for item in doc: ...                   # iterate over items
doc.append(value)                      # add to end
doc.insert(idx, value)                 # insert before index
doc.pop()                              # remove & return last item
doc.pop(idx)                           # remove & return item at index
doc.remove(value)                      # remove first occurrence (ValueError if missing)
doc.extend(iterable)                   # append items from list or YamlSequence
doc.index(value)                       # index of first occurrence
doc.count(value)                       # number of occurrences
doc.reverse()                          # reverse in-place
doc == [1, 2, 3]                       # equality comparison

# Works with any list-expecting library
isinstance(doc, list)                  # True
json.dumps(doc)                        # works

# Conversion
doc.to_dict()                          # deep conversion to plain Python list

# Comments (addressed by integer index)
doc.get_comment_inline(idx)            # -> str | None
doc.get_comment_before(idx)            # -> str | None
doc.set_comment_inline(idx, text)
doc.set_comment_before(idx, text)

# YAML tag
doc.get_tag()                          # -> str | None  (e.g. '!!python/tuple')
doc.set_tag(None)

# Explicit document markers
doc.explicit_start                     # bool
doc.explicit_end                       # bool
doc.explicit_start = True
doc.explicit_end   = True

# Sorting (preserves comment metadata)
doc.sort()                             # natural order, in-place
doc.sort(reverse=True)
doc.sort(key=lambda v: len(v))         # custom key function on item values

Sorting preserves all comments — each entry or item carries its inline and before-key comments with it when reordered.
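
The idea that comments travel with their entry can be pictured with a small pure-Python model in which each entry owns its comments. This is an illustrative sketch only, not the library's data model; Entry, sort_entries, and render are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entry:
    key: str
    value: object
    comment_before: Optional[str] = None
    comment_inline: Optional[str] = None


def sort_entries(entries, reverse=False):
    """Sort entries by key; comments move because they live on the entry."""
    return sorted(entries, key=lambda e: e.key, reverse=reverse)


def render(entries):
    lines = []
    for e in entries:
        if e.comment_before is not None:
            lines.append(f"# {e.comment_before}")
        line = f"{e.key}: {e.value}"
        if e.comment_inline is not None:
            line += f"  # {e.comment_inline}"
        lines.append(line)
    return "\n".join(lines)


entries = [
    Entry("port", 5432, comment_inline="default"),
    Entry("host", "localhost", comment_before="primary database"),
]
print(render(sort_entries(entries)))
# # primary database
# host: localhost
# port: 5432  # default
```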

Benchmarks

Compare load, dump, and round-trip performance against PyYAML and ruamel.yaml across small, medium, and large inputs:

uv sync --group benchmark
uv run maturin develop --release
uv run pytest benchmarks/ -v --benchmark-sort=name

Running tests

You need Rust 1.85+ and Python 3.12+ with uv. Python 3.12 is the minimum because YamlSequence subclasses list, which relies on PyO3's extends = PyList support, and that support requires Python 3.12 or later.

# 1. Clone with the yaml-test-suite submodule
git clone --recurse-submodules https://github.com/theyugin/yarutsk
cd yarutsk

# 2. Create a virtual environment and install dev dependencies
uv sync --group dev

# 3. Build the extension in dev (debug) mode
uv run maturin develop

# 4. Run the suites
uv run pytest tests/ --ignore=tests/test_yaml_suite.py -v  # core library tests
uv run pytest tests/test_yaml_suite.py -q                   # yaml-test-suite compliance

test_yaml_suite.py requires the yaml-test-suite submodule. Tests that fail due to known YAML normalisation differences are marked xfail and do not count as failures.

Internals

The scanner and parser are vendored from yaml-rust2 (MIT licensed) with one targeted modification: the comment-skipping loop in the scanner now emits Comment tokens instead of discarding them. Everything else — block/flow parsing, scalar type coercion, multi-document support — comes from yaml-rust2 unchanged. The builder layer wires those tokens to the data model, and a hand-written block-style emitter serialises it back out.
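
To illustrate the kind of change described, here is a toy line scanner in Python that can either discard comments or emit them as tokens. It is a simplified sketch and not the vendored yaml-rust2 code: Token and scan are hypothetical names, and quoting is ignored, so a '#' inside a quoted string would be misread.

```python
from typing import Iterator, NamedTuple


class Token(NamedTuple):
    kind: str  # "Text" or "Comment"
    text: str
    line: int


def scan(source: str, keep_comments: bool = True) -> Iterator[Token]:
    """Tokenise line by line (quoting is not handled in this sketch)."""
    for lineno, raw in enumerate(source.splitlines(), start=1):
        # A '#' starts a comment at line start or after whitespace.
        cut = None
        for i, ch in enumerate(raw):
            if ch == "#" and (i == 0 or raw[i - 1] in " \t"):
                cut = i
                break
        body = raw if cut is None else raw[:cut]
        if body.strip():
            yield Token("Text", body.rstrip(), lineno)
        if cut is not None and keep_comments:
            # The targeted change: emit the comment instead of skipping it.
            yield Token("Comment", raw[cut + 1:].strip(), lineno)


for tok in scan("# database config\nhost: localhost  # primary\n"):
    print(tok)
```

With keep_comments=False the same input yields only the Text token, which mirrors the behaviour of a scanner that discards comments.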

YamlMapping and YamlSequence are PyO3 pyclasses that extend Python's built-in dict and list types. A Rust inner field stores the full YAML data model (including comments); the parent dict/list is kept in sync on every mutation so that all standard Python operations work transparently.
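
The synchronisation idea can be approximated in pure Python with a dict subclass that keeps a side table of comments consistent with its keys. This is a rough sketch under stated assumptions, not the PyO3 implementation: CommentedMapping and render are hypothetical, the dict is the primary store here (the reverse of the arrangement described above), and only a few mutators are covered.

```python
class CommentedMapping(dict):
    """dict subclass with a side table of per-key inline comments."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self._inline = {}  # key -> comment text

    def set_comment_inline(self, key, text):
        if key not in self:
            raise KeyError(key)
        self._inline[key] = text

    def get_comment_inline(self, key):
        return self._inline.get(key)

    def __delitem__(self, key):
        super().__delitem__(key)     # raises KeyError if absent
        self._inline.pop(key, None)  # keep the side table in sync

    def pop(self, key, *default):
        # dict.pop does not go through __delitem__, so sync here as well.
        self._inline.pop(key, None)
        return super().pop(key, *default)

    def render(self):
        lines = []
        for key, value in self.items():
            comment = self._inline.get(key)
            suffix = f"  # {comment}" if comment else ""
            lines.append(f"{key}: {value}{suffix}")
        return "\n".join(lines)


doc = CommentedMapping(host="localhost", port=5432)
doc.set_comment_inline("host", "primary")
doc["port"] = 5433  # ordinary dict write; the comment table is untouched
print(doc.render())
# host: localhost  # primary
# port: 5433
del doc["host"]     # the comment is dropped together with its key
print(doc.get_comment_inline("host"))  # None
```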

Disclaimer

This library was created with Claude Code (Anthropic). The design, implementation, tests, and this README were written by Claude under human direction.

Download files

Source Distribution

  • yarutsk-0.1.1.tar.gz (118.9 kB)

Built Distributions

  • yarutsk-0.1.1-cp314-cp314-win_amd64.whl (390.4 kB): CPython 3.14, Windows x86-64
  • yarutsk-0.1.1-cp314-cp314-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (499.5 kB): CPython 3.14, manylinux glibc 2.17+ x86-64
  • yarutsk-0.1.1-cp314-cp314-macosx_10_12_x86_64.macosx_11_0_arm64.macosx_10_12_universal2.whl (933.8 kB): CPython 3.14, macOS 10.12+ universal2 (ARM64, x86-64)
  • yarutsk-0.1.1-cp313-cp313-win_amd64.whl (393.1 kB): CPython 3.13, Windows x86-64
  • yarutsk-0.1.1-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (501.6 kB): CPython 3.13, manylinux glibc 2.17+ x86-64
  • yarutsk-0.1.1-cp313-cp313-macosx_10_12_x86_64.macosx_11_0_arm64.macosx_10_12_universal2.whl (937.2 kB): CPython 3.13, macOS 10.12+ universal2 (ARM64, x86-64)
  • yarutsk-0.1.1-cp312-cp312-win_amd64.whl (393.6 kB): CPython 3.12, Windows x86-64
  • yarutsk-0.1.1-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (502.2 kB): CPython 3.12, manylinux glibc 2.17+ x86-64
  • yarutsk-0.1.1-cp312-cp312-macosx_10_12_x86_64.macosx_11_0_arm64.macosx_10_12_universal2.whl (938.8 kB): CPython 3.12, macOS 10.12+ universal2 (ARM64, x86-64)
