Yet Another YAML AST - programmatically transform YAML, preserving whitespace and comments

yaya

Why?

yaya edits YAML programmatically at the AST level, so re-serializing doesn't introduce extraneous changes. It preserves:

  • Comments
  • Whitespace (including trailing spaces)
  • Quote styles (', ", or none)
    • By default, yaya switches between ' and " when the other quote character is added to the string as a literal character
  • Block scalar indicators (|, |-, |+)
  • Other formatting choices (e.g. indentation)

Other libraries (e.g. ruamel.yaml) make formatting changes when serializing. yaya avoids this by:

  1. Parsing YAML to get the AST (with position information)
  2. Applying modifications only to specific values or subtrees
  3. Leaving everything else untouched

When adding values or subtrees, it also tries to mimic the neighboring formatting. It supports dict-like access and path-based navigation.

Installation

pip install lossless-yaml

Usage

Basic String Replacement

from yaya import YAYA

# Load a YAML file
doc = YAYA.load('.github/workflows/test.yaml')

# Simple string replacement in all values
doc.replace_in_values('src/marin', 'lib/marin/src/marin')

# Regex-based replacement
doc.replace_in_values_regex(r'\buv sync(?! --package)', 'uv sync --package myapp')

doc.save()
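
The regex above relies on a negative lookahead so that lines already carrying a --package flag are left alone. The following is a minimal sketch of that pattern using only Python's standard re module on a plain string; it does not use yaya, and add_package_flag is a hypothetical helper name introduced for illustration.

```python
import re

# Same pattern as in the example above: match "uv sync" only when it is
# NOT immediately followed by " --package".
PATTERN = re.compile(r"\buv sync(?! --package)")


def add_package_flag(text: str) -> str:
    """Hypothetical helper: append the package flag where it is missing."""
    return PATTERN.sub("uv sync --package myapp", text)


before = "run: uv sync\nrun: uv sync --package other\n"
after = add_package_flag(before)

# The first command gains the flag; the second already has one and is untouched.
print(after)
```

Because the replacement text itself contains " --package", running the substitution a second time changes nothing, which makes this kind of edit safe to re-apply.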

Path-Based Navigation and Assertions

# Navigate using paths
runs_on = doc.get_path("jobs.test.runs-on")
step_name = doc.get_path("jobs.test.steps[0].name")

# Or dict-like access
runs_on = doc["jobs"]["test"]["runs-on"]

# Assert values before making changes
doc.assert_value("on", ["push"])
doc.assert_absent("jobs.test.defaults")
doc.assert_present("jobs.test.steps")
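
To make the path syntax concrete, here is a rough sketch of how a path such as jobs.test.steps[0].name could be split into keys and list indices and then resolved. It works on plain Python dicts and lists, not on yaya's document type, and parse_path and get_path are hypothetical helpers written for this illustration rather than yaya's implementation.

```python
import re
from typing import Any, List, Union

# A path segment is either a key (no dots or brackets) or a "[n]" list index.
_TOKEN = re.compile(r"([^.\[\]]+)|\[(\d+)\]")


def parse_path(path: str) -> List[Union[str, int]]:
    """Split 'a.b[0].c' into ['a', 'b', 0, 'c']."""
    parts: List[Union[str, int]] = []
    for key, index in _TOKEN.findall(path):
        parts.append(int(index) if index else key)
    return parts


def get_path(data: Any, path: str) -> Any:
    """Walk nested dicts and lists; raises KeyError or IndexError if absent."""
    node = data
    for part in parse_path(path):
        node = node[part]
    return node


workflow = {
    "jobs": {
        "test": {
            "runs-on": "ubuntu-latest",
            "steps": [{"name": "Checkout"}, {"name": "Run tests"}],
        }
    }
}

print(parse_path("jobs.test.steps[0].name"))
print(get_path(workflow, "jobs.test.steps[0].name"))
```

In this sketch a missing key simply raises, which is one way an "assert present" style check could be built on top of the same walk.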

Replacing Values or Subtrees

# Replace a simple value
doc.replace_key("jobs.test.runs-on", "ubuntu-22.04")

# Replace a list item
doc.replace_key("build.commands[1]", "uv sync --package marin --frozen")

# Replace with a complex structure
doc.replace_key("on", {
    "push": {
        "branches": ["main"],
        "paths": ["lib/**", "uv.lock"]
    },
    "pull_request": {
        "paths": ["lib/**", "uv.lock"]
    }
})

doc.save()

Adding Keys

# Add key after another (maintains order)
doc.add_key_after("jobs.test.runs-on", "defaults", {
    "run": {
        "working-directory": "lib/myapp"
    }
})

# Add or replace (convenience method)
doc.ensure_key("jobs.test.timeout-minutes", 30)

doc.save()
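
As a rough illustration of what "adding a key after another while mimicking neighboring formatting" can mean at the text level, the sketch below copies the indentation and line ending of the anchor line. It is an assumption-laden simplification, not yaya's implementation: insert_key_after is a hypothetical helper that handles only single-line scalar values and matches the first line whose key equals the anchor.

```python
import re


def insert_key_after(text: str, anchor_key: str, key: str, value: str) -> str:
    """Insert `key: value` right after the line for `anchor_key`.

    Hypothetical text-level helper: the new line reuses the anchor line's
    leading whitespace and line ending, and every other line is kept as-is.
    """
    lines = text.splitlines(keepends=True)
    for i, line in enumerate(lines):
        if line.lstrip(" \t").startswith(f"{anchor_key}:"):
            indent = re.match(r"[ \t]*", line).group(0)
            ending = line[len(line.rstrip("\r\n")):]  # "\n", "\r\n", or ""
            new_line = f"{indent}{key}: {value}"
            if ending:
                lines.insert(i + 1, new_line + ending)
            else:
                # Anchor is the final line and has no newline: add one to it
                # so the file still ends without a trailing newline.
                lines[i] = line + "\n"
                lines.insert(i + 1, new_line)
            return "".join(lines)
    raise KeyError(anchor_key)


source = "jobs:\n  test:\n    runs-on: ubuntu-latest\n    steps: []\n"
print(insert_key_after(source, "runs-on", "timeout-minutes", "30"))
```

Working from the anchor line rather than re-serializing the mapping is what keeps the surrounding lines untouched in this sketch.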

Example

Given this YAML file:

# Production config
database:
  host: prod-db-1.example.com
  port: 5432

This code:

from yaya import YAYA

doc = YAYA.load('config.yaml')
doc.replace_in_values('prod-db-1', 'prod-db-2')
doc.save()

Produces exactly:

# Production config
database:
  host: prod-db-2.example.com
  port: 5432

No reformatting. No comment loss. Just the change you made.
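
One way to check a claim like this in your own tests is to diff the file contents before and after the edit and confirm that only the intended line differs. The sketch below uses the standard difflib module on the two texts from the example; changed_lines is a hypothetical helper, and the strings are typed in by hand rather than produced by yaya.

```python
import difflib
from typing import List


def changed_lines(before: str, after: str) -> List[str]:
    """Return only the removed ('- ') and added ('+ ') lines of a diff."""
    diff = difflib.ndiff(
        before.splitlines(keepends=True),
        after.splitlines(keepends=True),
    )
    return [line for line in diff if line[:2] in ("- ", "+ ")]


before = (
    "# Production config\n"
    "database:\n"
    "  host: prod-db-1.example.com\n"
    "  port: 5432\n"
)
after = before.replace("prod-db-1", "prod-db-2")

for line in changed_lines(before, after):
    print(line, end="")
```

If a tool had reflowed indentation or dropped the comment, those lines would show up in the diff as well.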

How It Works

  1. Parse YAML with ruamel.yaml to get AST + position information
  2. Convert line/column positions to byte offsets
  3. Track modifications as you change values
  4. Apply byte-level replacements when saving, leaving everything else untouched

Features

  • Byte-for-byte preservation of unchanged content
  • String replacement (literal and regex)
  • Path-based navigation (jobs.test.steps[0].name)
  • Replace values or subtrees (scalars, dicts, lists, list items)
  • Add keys with proper positioning
  • Assertions for validation (assert_value, assert_present, assert_absent)
  • Comment preservation
  • Block scalar support
  • Flow and block style handling

Limitations

  • Removing keys is not yet implemented
  • Binary data is not supported
  • Adding keys currently supports only add_key_after (not arbitrary positions)

Comparison with ruamel.yaml

ruamel.yaml is excellent for round-trip YAML editing and preserves most formatting. However:

Feature ruamel.yaml yaya
Preserves comments
Preserves most whitespace
Byte-for-byte identical
Trailing whitespace
Block scalar indicators ❌ (computes new ones)

yaya uses ruamel.yaml under the hood but takes a different approach: instead of serializing the AST back to YAML, it modifies the original bytes directly.

License

MIT

Contributing

Issues and pull requests welcome!
