A YAML round-trip library that preserves comments and insertion order

yarutsk

A Python YAML library that round-trips documents while preserving comments, insertion order, scalar styles, tags, anchors and aliases, blank lines, and explicit document markers.

What it does

Most YAML libraries silently drop comments on load. yarutsk keeps them attached to their keys — both inline (key: value # like this) and block-level (# above a key) — so a load → modify → dump cycle leaves the rest of the file intact.

import io
import yarutsk

doc = yarutsk.load(io.StringIO("""
# database config
host: localhost  # primary
port: 5432
"""))

doc["port"] = 5433

out = io.StringIO()
yarutsk.dump(doc, out)
print(out.getvalue())
# # database config
# host: localhost  # primary
# port: 5433

YamlMapping is a subclass of dict and YamlSequence is a subclass of list, so they work everywhere a dict or list is expected:

import json

doc = yarutsk.loads("name: Alice\nscores: [10, 20, 30]")

isinstance(doc, dict)           # True
isinstance(doc["scores"], list) # True
json.dumps(doc)                 # '{"name": "Alice", "scores": [10, 20, 30]}'

Round-trip fidelity

yarutsk reproduces the source text exactly for everything it understands. A loads followed by dumps gives back the original string byte-for-byte in the common case:

src = """\
defaults: &base
  timeout: 30
  retries: 3

service:
  name: api
  config: *base
"""
assert yarutsk.dumps(yarutsk.loads(src)) == src
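
When a round-trip does not come back byte-for-byte, a diff is easier to act on than a failed assert. The helper below is a minimal sketch: `round_trip_diff` is a hypothetical name, not part of yarutsk, and it takes the load and dump functions as arguments so it runs here with stand-in callables.

```python
import difflib
from typing import Any, Callable


def round_trip_diff(
    loads: Callable[[str], Any],
    dumps: Callable[[Any], str],
    src: str,
) -> str:
    """Return a unified diff of src vs dumps(loads(src)); empty string if identical."""
    out = dumps(loads(src))
    return "".join(
        difflib.unified_diff(
            src.splitlines(keepends=True),
            out.splitlines(keepends=True),
            fromfile="source",
            tofile="round-trip",
        )
    )


# Stand-in codecs so the sketch runs without yarutsk installed.
identity_diff = round_trip_diff(lambda s: s, lambda d: d, "a: 1\nb: 2\n")
# A deliberately lossy "dumper" that collapses the spacing before a comment.
lossy_diff = round_trip_diff(
    lambda s: s, lambda d: d.replace("  #", " #"), "a: 1  # one\n"
)
print(repr(identity_diff))
print(lossy_diff)
```

With yarutsk installed, the same check would be `round_trip_diff(yarutsk.loads, yarutsk.dumps, src)`.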

Specifically preserved:

  • Scalar styles: plain, 'single-quoted', "double-quoted", literal block |, folded block >
  • Non-canonical scalars: yes/no/on/off, ~, Null, True/False, 0xFF, 0o77 are reproduced as written, not re-canonicalised to true/false/null/255
  • YAML tags: !!str, !!python/tuple, and any custom tag are emitted back verbatim
  • Anchors and aliases: &name on the anchor node and *name for references are preserved; the Python layer returns the resolved value transparently
  • Blank lines between mapping entries and sequence items
  • Explicit document markers: --- and ...
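
One way to picture "reproduced as written" is a scalar that stores both its source spelling and its resolved value, and emits the spelling unless the value was changed. This is an illustrative sketch only: `RawScalar`, the small resolution table, and the rule for emitting edited values are assumptions, not yarutsk's data model.

```python
from dataclasses import dataclass
from typing import Any

# Hypothetical resolution table covering a few of the spellings listed above.
_RESOLVED = {
    "yes": True, "no": False, "on": True, "off": False,
    "True": True, "False": False, "~": None, "Null": None,
}


@dataclass
class RawScalar:
    raw: str             # text exactly as written in the source
    value: Any           # resolved Python value
    dirty: bool = False  # True once the value has been replaced

    @classmethod
    def parse(cls, raw: str) -> "RawScalar":
        if raw in _RESOLVED:
            return cls(raw, _RESOLVED[raw])
        try:
            return cls(raw, int(raw, 0))  # base 0 accepts 0xFF and 0o77
        except ValueError:
            return cls(raw, raw)          # fall back to a plain string

    def set(self, value: Any) -> None:
        self.value = value
        self.dirty = True

    def emit(self) -> str:
        if not self.dirty:
            return self.raw               # untouched: original spelling
        if self.value is None:
            return "null"
        if isinstance(self.value, bool):  # check bool before other ints
            return "true" if self.value else "false"
        return str(self.value)


flag = RawScalar.parse("yes")
mask = RawScalar.parse("0xFF")
print(flag.value, flag.emit())  # True yes
print(mask.value, mask.emit())  # 255 0xFF
mask.set(256)
print(mask.emit())              # 256
```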

Installation

Built with Maturin. From the repo root:

pip install maturin
maturin develop

API

Loading and dumping

# Load from stream (StringIO / BytesIO)
doc  = yarutsk.load(stream)            # first document
docs = yarutsk.load_all(stream)        # all documents as a list

# Load from string
doc  = yarutsk.loads(text)
docs = yarutsk.loads_all(text)

# Dump to stream
yarutsk.dump(doc, stream)
yarutsk.dump_all(docs, stream)

# Dump to string
text = yarutsk.dumps(doc)
text = yarutsk.dumps_all(docs)

load / loads return a YamlMapping, YamlSequence, YamlScalar, or None (for empty input). Nested nodes are also YamlMapping or YamlSequence; scalar leaves are returned as native Python primitives (int, float, bool, str, or None).
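
Because the top-level result can take four shapes, calling code usually branches on it. The sketch below relies only on what is stated above (mappings are dict subclasses, sequences are list subclasses, YamlScalar exposes .value); `describe` and `FakeScalar` are hypothetical names used so the example runs without yarutsk.

```python
from typing import Any


def describe(doc: Any) -> str:
    """Summarise the top-level node returned by a load call."""
    if doc is None:
        return "empty document"
    if isinstance(doc, dict):   # YamlMapping subclasses dict
        return f"mapping with {len(doc)} entries"
    if isinstance(doc, list):   # YamlSequence subclasses list
        return f"sequence with {len(doc)} items"
    return f"scalar {doc.value!r}"  # YamlScalar wraps a primitive in .value


class FakeScalar:
    """Hypothetical stand-in for a top-level YamlScalar."""

    def __init__(self, value: Any) -> None:
        self.value = value


print(describe(None))            # empty document
print(describe({"a": 1}))        # mapping with 1 entries
print(describe([1, 2]))          # sequence with 2 items
print(describe(FakeScalar(42)))  # scalar 42
```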

YamlScalar

Top-level scalar documents are wrapped in a YamlScalar node:

doc = yarutsk.loads("42")
doc.value                              # 42 (Python int)
doc.to_dict()                          # same as .value

# Scalar style
doc = yarutsk.loads("---\n'hello'\n")
doc.style                              # 'single'
doc.style = "double"                   # 'plain'|'single'|'double'|'literal'|'folded'

# YAML tag
doc = yarutsk.loads("!!str 42")
doc.get_tag()                          # '!!str'
doc.set_tag(None)                      # clear tag

# Explicit document markers
doc = yarutsk.loads("---\n42\n...")
doc.explicit_start                     # True
doc.explicit_end                       # True
doc.explicit_start = False
doc.explicit_end   = False

YamlMapping

YamlMapping is a subclass of dict with insertion-ordered keys. All standard dict operations work directly:

# Standard dict interface (inherited)
doc["key"]                             # get (KeyError if missing)
doc["key"] = value                     # set (preserves position if key exists)
del doc["key"]                         # delete
"key" in doc                           # membership test
len(doc)                               # number of entries
for key in doc: ...                    # iterate over keys in order
doc.keys()                             # KeysView in insertion order
doc.values()                           # ValuesView in insertion order
doc.items()                            # ItemsView of (key, value) pairs
doc.get("key")                         # returns None if missing
doc.get("key", default)                # returns default if missing
doc.pop("key")                         # remove & return (KeyError if missing)
doc.pop("key", default)                # remove & return, or default
doc.setdefault("key", default)         # get or insert default
doc.update(other)                      # merge from dict or YamlMapping
doc == {"a": 1}                        # equality comparison

# Works with any dict-expecting library
isinstance(doc, dict)                  # True
json.dumps(doc)                        # works

# Conversion
doc.to_dict()                          # deep conversion to plain Python dict

# Comments
doc.get_comment_inline("key")          # -> str | None
doc.get_comment_before("key")          # -> str | None
doc.set_comment_inline("key", text)
doc.set_comment_before("key", text)

# YAML tag
doc.get_tag()                          # -> str | None  (e.g. '!!python/object:Foo')
doc.set_tag("!!map")

# Explicit document markers
doc.explicit_start                     # bool
doc.explicit_end                       # bool
doc.explicit_start = True
doc.explicit_end   = True

# Node access — returns YamlScalar/YamlMapping/YamlSequence preserving style/tag/anchor
node = doc.get_node("key")            # KeyError if absent

# Scalar style shortcut (equivalent to: doc.get_node("key").style = "single")
doc.set_scalar_style("key", "single") # 'plain'|'single'|'double'|'literal'|'folded'

# Sorting
doc.sort_keys()                        # alphabetical, in-place
doc.sort_keys(reverse=True)            # reverse alphabetical
doc.sort_keys(key=lambda k: len(k))    # custom key function on key strings
doc.sort_keys(recursive=True)          # also sort all nested mappings
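
The comment getters compose with ordinary dict iteration. As a sketch, the helper below gathers every commented key of a mapping into a plain dict. `collect_comments` and `StubMapping` are hypothetical; the stub only mimics get_comment_before and get_comment_inline so the example runs without yarutsk, and whether the real getters return the text with or without the leading # is not assumed here.

```python
from typing import Dict, Optional, Tuple


def collect_comments(mapping) -> Dict[str, Tuple[Optional[str], Optional[str]]]:
    """Map each commented key to its (before, inline) comment pair."""
    found = {}
    for key in mapping:
        before = mapping.get_comment_before(key)
        inline = mapping.get_comment_inline(key)
        if before is not None or inline is not None:
            found[key] = (before, inline)
    return found


class StubMapping(dict):
    """Hypothetical stand-in exposing the two comment getters."""

    def __init__(self, data, before=None, inline=None):
        super().__init__(data)
        self._before = before or {}
        self._inline = inline or {}

    def get_comment_before(self, key):
        return self._before.get(key)

    def get_comment_inline(self, key):
        return self._inline.get(key)


stub = StubMapping(
    {"host": "localhost", "port": 5432},
    before={"host": "database config"},
    inline={"host": "primary"},
)
print(collect_comments(stub))  # {'host': ('database config', 'primary')}
```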

YamlSequence

YamlSequence is a subclass of list. All standard list operations work directly:

# Standard list interface (inherited)
doc[0]                                 # get by index (negative indices supported)
doc[0] = value                         # set by index
del doc[0]                             # delete by index
value in doc                           # membership test
len(doc)                               # number of items
for item in doc: ...                   # iterate over items
doc.append(value)                      # add to end
doc.insert(idx, value)                 # insert before index
doc.pop()                              # remove & return last item
doc.pop(idx)                           # remove & return item at index
doc.remove(value)                      # remove first occurrence (ValueError if missing)
doc.extend(iterable)                   # append items from list or YamlSequence
doc.index(value)                       # index of first occurrence
doc.count(value)                       # number of occurrences
doc.reverse()                          # reverse in-place
doc == [1, 2, 3]                       # equality comparison

# Works with any list-expecting library
isinstance(doc, list)                  # True
json.dumps(doc)                        # works

# Conversion
doc.to_dict()                          # deep conversion to plain Python list

# Comments (addressed by integer index)
doc.get_comment_inline(idx)            # -> str | None
doc.get_comment_before(idx)            # -> str | None
doc.set_comment_inline(idx, text)
doc.set_comment_before(idx, text)

# YAML tag
doc.get_tag()                          # -> str | None  (e.g. '!!python/tuple')
doc.set_tag(None)

# Explicit document markers
doc.explicit_start                     # bool
doc.explicit_end                       # bool
doc.explicit_start = True
doc.explicit_end   = True

# Sorting (preserves comment metadata)
doc.sort()                             # natural order, in-place
doc.sort(reverse=True)
doc.sort(key=lambda v: len(v))         # custom key function on item values

Sorting preserves all comments — each entry or item carries its inline and before-key comments with it when reordered.
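
The idea can be sketched in plain Python: if comments are stored on the entry itself rather than by position, sorting the entries moves the comments with them. `Entry`, `sort_entries`, and `render` are hypothetical names for illustration and do not reflect yarutsk's internal types.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Optional


@dataclass
class Entry:
    key: str
    value: Any
    comment_before: Optional[str] = None
    comment_inline: Optional[str] = None


def sort_entries(
    entries: List[Entry],
    *,
    key: Optional[Callable[[str], Any]] = None,
    reverse: bool = False,
) -> None:
    """Sort whole entries by key string, in place, so comments stay attached."""
    key_fn = key or (lambda k: k)
    entries.sort(key=lambda e: key_fn(e.key), reverse=reverse)


def render(entries: List[Entry]) -> str:
    """Emit a flat block mapping with each entry's comments."""
    lines = []
    for e in entries:
        if e.comment_before is not None:
            lines.append(f"# {e.comment_before}")
        line = f"{e.key}: {e.value}"
        if e.comment_inline is not None:
            line += f"  # {e.comment_inline}"
        lines.append(line)
    return "\n".join(lines) + "\n"


entries = [
    Entry("port", 5432),
    Entry("host", "localhost", comment_before="database config",
          comment_inline="primary"),
]
sort_entries(entries)
print(render(entries))
```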

Running tests

You need Rust (nightly) and Python 3.12+ with uv. Python 3.12 is the minimum because YamlSequence subclasses list, which relies on PyO3's extends = PyList support, available only from Python 3.12 onward.

# 1. Clone with the yaml-test-suite submodule
git clone --recurse-submodules https://github.com/theyugin/yarutsk
cd yarutsk

# 2. Create a virtual environment and install dev dependencies
uv sync --group dev

# 3. Build the extension in dev (debug) mode
uv run maturin develop

# 4. Run the suites
uv run pytest tests/ --ignore=tests/test_yaml_suite.py -v  # core library tests
uv run pytest tests/test_yaml_suite.py -q                   # yaml-test-suite compliance

test_yaml_suite.py requires the yaml-test-suite submodule. Tests that fail due to known YAML normalisation differences are marked xfail and do not count as failures.

Internals

The scanner and parser are vendored from yaml-rust2 (MIT licensed) with one targeted modification: the comment-skipping loop in the scanner now emits Comment tokens instead of discarding them. Everything else — block/flow parsing, scalar type coercion, multi-document support — comes from yaml-rust2 unchanged. The builder layer wires those tokens to the data model, and a hand-written block-style emitter serialises it back out.
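
To make the scanner change concrete, here is a toy line-based scanner in Python with a switch between the two behaviours: dropping comments or emitting them as tokens. It is a deliberately simplified sketch, not a port of the yaml-rust2 scanner; `Token` and `scan` are hypothetical, and the comment detection ignores quoting.

```python
from typing import Iterator, NamedTuple


class Token(NamedTuple):
    kind: str  # "text" or "comment"
    line: int  # 1-based line number
    text: str


def scan(source: str, keep_comments: bool = True) -> Iterator[Token]:
    """Yield text tokens, plus comment tokens when keep_comments is True."""
    for lineno, line in enumerate(source.splitlines(), start=1):
        body, comment = line, None
        # Simplification: a '#' at line start or after whitespace starts a
        # comment; '#' inside quoted scalars is not handled.
        for i, ch in enumerate(line):
            if ch == "#" and (i == 0 or line[i - 1] in " \t"):
                body, comment = line[:i], line[i + 1:].strip()
                break
        if body.strip():
            yield Token("text", lineno, body.rstrip())
        if comment is not None and keep_comments:
            yield Token("comment", lineno, comment)


src = "# database config\nhost: localhost  # primary\nport: 5432\n"
tokens = list(scan(src))
dropped = list(scan(src, keep_comments=False))
for tok in tokens:
    print(tok)
```

With keep_comments=False the two comment tokens vanish, which corresponds to the information loss the modified scanner avoids.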

YamlMapping and YamlSequence are PyO3 pyclasses that extend Python's built-in dict and list types. A Rust inner field stores the full YAML data model (including comments); the parent dict/list is kept in sync on every mutation so that all standard Python operations work transparently.
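
A pure-Python analogue of this design is a dict subclass that mirrors each mutation into a side store. The class below is a hypothetical sketch, not yarutsk's code: it overrides only item assignment and deletion, whereas a complete version would have to cover every mutating method (update, pop, setdefault, clear, and so on), since the built-in dict does not route those through __setitem__.

```python
import json


class CommentedDict(dict):
    """Hypothetical dict subclass that keeps per-key comments in sync."""

    def __init__(self):
        super().__init__()
        self._comments = {}  # key -> inline comment text or None

    def __setitem__(self, key, value):
        super().__setitem__(key, value)
        self._comments.setdefault(key, None)  # keep an existing comment

    def __delitem__(self, key):
        super().__delitem__(key)
        self._comments.pop(key, None)         # drop metadata with the key

    def set_comment_inline(self, key, text):
        if key not in self:
            raise KeyError(key)
        self._comments[key] = text

    def get_comment_inline(self, key):
        if key not in self:
            raise KeyError(key)
        return self._comments[key]


d = CommentedDict()
d["host"] = "localhost"
d.set_comment_inline("host", "primary")
d["host"] = "db.internal"            # overwrite keeps the comment
print(json.dumps(d))                 # {"host": "db.internal"}
print(d.get_comment_inline("host"))  # primary
del d["host"]
print(d._comments)                   # {}
```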

Disclaimer

This library was created with Claude Code (Anthropic). The design, implementation, tests, and this README were written by Claude under human direction.
