Read and change YAML/Compatible data using powerful, intuitive, command-line friendly syntax
Project description
YAML Path and Command-Line Tools
Along with providing a standard for defining YAML Paths, this project aims to provide generally-useful command-line tools which implement YAML Paths. These bring intuitive YAML, EYAML, JSON, and compatible data parsing and editing capabilities to the command-line. It is also a Python library for other projects to readily employ YAML Paths.
Contents
- Introduction
- Illustration
- Installing
- Supported YAML Path Segments
- Based on ruamel.yaml and Python 3
- The Files of This Project
- Basic Usage
Introduction
This project presents and utilizes YAML Paths, which are a powerful, intuitive means of identifying one or more nodes within YAML, EYAML, or compatible data structures like JSON. Both dot-notation (inspired by Hiera) and forward-slash-notation (influenced by XPath) are supported. The libraries (modules) and several command-line tool implementations are provided. With these, you can build YAML Path support right into your own application or easily use its capabilities right away from the command-line to retrieve or update YAML/Compatible data.
This implementation of YAML Path is a query language in addition to a node descriptor. With it, you can describe or select a single precise node or search for any number of nodes that match some criteria. Keys, values, and elements can all be searched at any number of levels within the data structure using the same query. Collectors can also be used to gather and further select from otherwise disparate parts of the source data.
The project Wiki provides a deeper dive into these concepts.
Illustration
To illustrate some of these concepts, consider these samples:
---
hash:
  child_attr:
    key: 5280
This value, 5280, can be identified via YAML Path as any of:
1. hash.child_attr.key (dot-notation)
2. hash.child_attr[.=key] (search all child keys for one named key and yield its value)
3. /hash/child_attr/key (same as 1 but in forward-slash notation)
4. /hash/child_attr[.=key] (same as 2 but in forward-slash notation)
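To make the equivalence of the two notations concrete, here is a minimal Python sketch that resolves only plain key segments against a nested dict. The helper name resolve_simple_path is hypothetical and is not part of this project's API; it deliberately ignores searches, escapes, quoting, and every other segment type.

```python
def resolve_simple_path(data, path):
    """Follow plain key segments through nested dicts (illustration only)."""
    # A leading "/" signals forward-slash notation; otherwise assume dots.
    separator = "/" if path.startswith("/") else "."
    segments = [segment for segment in path.split(separator) if segment]
    node = data
    for segment in segments:
        node = node[segment]
    return node


sample = {"hash": {"child_attr": {"key": 5280}}}
print(resolve_simple_path(sample, "hash.child_attr.key"))
print(resolve_simple_path(sample, "/hash/child_attr/key"))
```

Both calls walk the same three keys and arrive at the same node; only the separator differs.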
---
aliases:
  - &first_anchor Simple string value
With YAML Path, you can select this anchored value by any of these equivalent expressions:
1. aliases[0] (explicit array element number)
2. aliases.0 (implicit array element number in dot-notation)
3. aliases[&first_anchor] (search by Anchor name)
4. aliases[.^Simple] (search for any elements starting with "Simple")
5. aliases[.%string] (search for any elements containing "string")
6. aliases[.$value] (search for any elements ending with "value")
7. aliases[.=~/^(\b[Ss][a-z]+\s){2}[a-z]+$/] (search for any elements matching a complex Regular Expression, which happens to match the example)
8. /aliases[0] (same as 1 but in forward-slash notation)
9. /aliases/0 (same as 2 but in forward-slash notation)
10. /aliases[&first_anchor] (same as 3 but in forward-slash notation)
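The element searches in this list can be read as ordinary string predicates. The following Python sketch is only an analogy built on the standard library, not this project's implementation; the dictionary keys are labels for the operators shown above.

```python
import re

elements = ["Simple string value"]

# Each predicate mirrors one element-search operator from the list above.
predicates = {
    "[.^Simple]": lambda text: text.startswith("Simple"),
    "[.%string]": lambda text: "string" in text,
    "[.$value]": lambda text: text.endswith("value"),
    "[.=~/regex/]": lambda text: re.search(
        r"^(\b[Ss][a-z]+\s){2}[a-z]+$", text
    ) is not None,
}

# Apply every predicate to every element and keep the matches.
matches = {
    operator: [item for item in elements if test(item)]
    for operator, test in predicates.items()
}
for operator, found in matches.items():
    print(operator, found)
```

Every predicate selects the same single element, which is why the expressions above are equivalent for this sample.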
---
users:
  - name: User One
    password: ENC[PKCS7,MIIBiQY...Jk==]
    roles:
      - Writers
  - name: User Two
    password: ENC[PKCS7,MIIBiQY...vF==]
    roles:
      - Power Users
      - Editors
With an example like this, YAML Path enables:
- selection of single nodes: /users/0/roles/0 = Writers
- all children nodes of any given parent: /users/1/roles = ["Power Users", "Editors"]
- searching by a child attribute: /users[name="User One"]/password = Some decrypted value, provided you have the appropriate EYAML keys
- pass-through selections against arrays-of-hashes: /users/roles = ["Writers"]\n["Power Users", "Editors"] (each user's list of roles is a separate result)
- collection of disparate results: (/users/name) = ["User One", "User Two"] (all names appear in a single result instead of one per line)
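The difference between a pass-through selection and a collector can be sketched in plain Python. This is an analogy only: the pass_through helper is hypothetical, and the encrypted password values are left out of the sample data.

```python
users = [
    {"name": "User One", "roles": ["Writers"]},
    {"name": "User Two", "roles": ["Power Users", "Editors"]},
]


def pass_through(records, attribute):
    """Yield the named attribute of each record as its own result."""
    for record in records:
        if attribute in record:
            yield record[attribute]


# Like /users/roles: one result per user.
separate_results = list(pass_through(users, "roles"))

# Like (/users/name): every match gathered into a single list result.
collected_result = list(pass_through(users, "name"))

for result in separate_results:
    print(result)
print(collected_result)
```

The first query produces two results (one list per user), while the collector produces one result holding both names.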
For a deeper exploration of YAML Path's capabilities, please visit the project Wiki.
Supported YAML Path Segments
A YAML Path segment is the text between separators which identifies zero or more parent or leaf nodes within the data structure. For dot-notation, a path like hash.key identifies two segments: hash (a parent node) and key (a leaf node). The same path in forward-slash notation would be: /hash/key.
YAML Path understands these segment types:
- Top-level Hash key selection: key
- Explicit top-level array element selection: [#] where # is the zero-based element number; # can also be negative, causing the element to be selected from the end of the Array
- Implicit array element selection or numbered hash key selection: # where # is the 0-based element number or exact name of a hash key which is itself a number
- Top-level (Hash) Anchor lookups: &anchor_name (the & is required to indicate you are seeking an Anchor by name)
- Hash sub-keys: hash.child.key or /hash/child/key
- Demarcation for dotted Hash keys: hash.'dotted.child.key' or hash."dotted.child.key" (not necessary when using forward-slash notation, /hash/dotted.child.key)
- Named Array element selection: array[#], array.#, /array[#], or /array/# where array is the name of the Hash key containing Array data and # is the 0-based element number
- Anchor lookups in named Arrays: array[&anchor_name] where array is the name of the Hash key containing Array data and both of the [] pair and & are required to indicate you are seeking an Anchor by name within an Array
- Array slicing: array[start#:stop#] where start# is the first inclusive, zero-based element and stop# is the last exclusive element to select; either or both can be negative, causing the elements to be selected from the end of the Array; when start# and stop# are identical, it is the same as array[start#]
- Hash slicing: hash[min:max] where min and max are alphanumeric terms between which the Hash's keys are compared
- Escape symbol recognition: hash.dotted\.child\.key, /hash/whacked\/child\/key, and keys_with_\\slashes
- Hash attribute searches (which can return zero or more matches):
  - Exact match: hash[name=admin]
  - Starts With match: hash[name^adm]
  - Ends With match: hash[name$min]
  - Contains match: hash[name%dmi]
  - Less Than match: hash[access_level<500]
  - Greater Than match: hash[access_level>0]
  - Less Than or Equal match: hash[access_level<=100]
  - Greater Than or Equal match: hash[access_level>=0]
  - Regular Expression matches: hash[access_level=~/^\D+$/] (the / Regular Expression delimiter can be substituted for any character you need, except white-space; note that / does not interfere with forward-slash notation and it does not need to be escaped because the entire search expression is contained within a [] pair)
  - Invert any match with !, like: hash[name!=admin] or even hash[!name=admin] (the former syntax is used when YAML Paths are stringified but both forms are equivalent)
  - Demarcate and/or escape expression operands, like: hash[full\ name="Some User\'s Name"] (note that embedded, single ' and " must be escaped lest they be deemed unmatched demarcation pairings)
  - Multi-level matching: hash[name%admin].pass[encrypted!^ENC\[] or /hash[name%admin]/pass[encrypted!^ENC\[]
- Array element searches with all of the search methods above via . (yields any matching elements): array[.>9000]
- Hash key-name searches with all of the search methods above via . (yields their values, not the keys themselves): hash[.^app_]
- Array-of-Hashes Pass-Through Selection: Omit a selector for the elements of an Array-of-Hashes and all matching Hash attributes at that level will be yielded (or searched when there is more to the path). For example, warriors[1].power_level or /warriors[1]/power_level will return the power_level attribute of only the second Hash in an Array-of-Hashes while warriors.power_level or /warriors/power_level will return the power_level attribute of every Hash in the same Array-of-Hashes. Of course, these results can be filtered in multiple ways: warriors[power_level>9000], /warriors[power_level>9000], warriors.power_level[.>9000], and /warriors/power_level[.>9000] all yield only the power_level from all warriors with power_levels over 9,000 within the same array of warrior hashes.
- Wildcard Searches: The * symbol can be used as shorthand for the [] search operator against text keys and values: /warriors/name/Go*
- Deep Traversals: The ** symbol pair deeply traverses the document:
  - When it is the last or only segment of a YAML Path, it selects every leaf node from the remainder of the document's tree: /shows/**
  - When another segment follows, it matches every node within the remainder of the document's tree for which the following (and subsequent) segments match: /shows/**/name/Star*
- Collectors: Placing any portion of the YAML Path within parentheses defines a virtual list collector, like (YAML Path); concatenation and exclusion operators are supported -- + and -, respectively -- along with nesting, like (...)-((...)+(...))
- Complex combinations: some::deep.hierarchy[with!=""].'any.valid'[.=~/(yaml|json)/][data%structure].or.complexity[4].2 or /some::deep/hierarchy[with!=""]/any.valid[.=~/(yaml|json)/][data%structure]/or/complexity[4]/2
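The Array slicing rule in the list above can be approximated with Python's own slice syntax. This is a sketch under the assumption that negative bounds behave as they do in Python slices; the slice_elements helper is hypothetical and only mirrors the wording of the rule, including its special case for identical bounds.

```python
def slice_elements(items, start, stop):
    """Mimic array[start#:stop#] as described above (illustration only)."""
    if start == stop:
        # Identical bounds select the single element at that position.
        return [items[start]]
    # Otherwise start is inclusive and stop is exclusive; negative numbers
    # count back from the end of the list.
    return items[start:stop]


elements = ["a", "b", "c", "d", "e"]
print(slice_elements(elements, 1, 3))
print(slice_elements(elements, -3, -1))
print(slice_elements(elements, 2, 2))
```

Note that the identical-bounds case differs from a bare Python slice, which would return an empty list for items[2:2].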
This implementation of YAML Path encourages creativity. Use whichever notation and segment types that make the most sense to you in each application.
The project Wiki provides more illustrative details of YAML Path Segments.
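As a rough picture of what a trailing deep traversal such as /shows/** selects, the following Python sketch walks a nested structure and yields every scalar leaf. The leaf_nodes helper and the sample data are made up for illustration; this is not how the library itself is implemented.

```python
def leaf_nodes(node):
    """Recursively yield every scalar leaf beneath node."""
    if isinstance(node, dict):
        for value in node.values():
            yield from leaf_nodes(value)
    elif isinstance(node, list):
        for element in node:
            yield from leaf_nodes(element)
    else:
        yield node


# Made-up sample data, used only to exercise the helper.
document = {
    "shows": [
        {"name": "Star Trek", "seasons": 3},
        {"name": "Firefly", "seasons": 1},
    ]
}
leaves = list(leaf_nodes(document["shows"]))
print(leaves)
```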
Installing
This project requires Python 3. It is tested against Pythons 3.6 through 3.8. Most operating systems and distributions have access to Python 3 even if only Python 2 -- or no Python, at all -- came pre-installed. It is generally safe to have more than one version of Python on your system at the same time, especially when using virtual Python environments.
Each published version of this project can be installed from PyPI using pip. Note that on systems with more than one version of Python, you will probably need to use pip3, or equivalent (e.g.: Cygwin users may need to use pip3.6).
pip3 install yamlpath
EYAML support is entirely optional. You do not need EYAML to use YAML Path.
That YAML Path supports EYAML is a service to a substantial audience: Puppet users. At the time of this writing, EYAML (classified as a Hiera back-end/plug-in) is available only as a Ruby Gem. That said, it provides a command-line tool, eyaml, which can be employed by this otherwise Python project. To enjoy EYAML support, install compatible versions of ruby and rubygems, then execute:
gem install hiera-eyaml
If this puts the eyaml command on your system PATH, nothing more need be done apart from generating or obtaining your encryption keys. Otherwise, you can tell the YAML Path library and tools where to find the eyaml command.
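If you want to check from Python whether the eyaml command is discoverable before relying on it, the standard library's shutil.which can do so. The find_eyaml helper below is a hypothetical convenience, not part of this project.

```python
import shutil


def find_eyaml(explicit_path=None):
    """Return a usable eyaml location, or None when nothing is found."""
    if explicit_path:
        # Honor an explicitly supplied location first.
        return shutil.which(explicit_path)
    # Otherwise search the directories listed in PATH.
    return shutil.which("eyaml")


location = find_eyaml()
print(location if location else "eyaml is not on the PATH")
```

When the result is None, supply the location explicitly, for example through the -x or --eyaml argument that the command-line tools accept.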
Based on ruamel.yaml and Python 3
In order to support the best available YAML editing capability (so-called round-trip editing, with support for comment preservation), this project is based on ruamel.yaml for Python 3. While ruamel.yaml is based on PyYAML -- Python's "standard" YAML library -- ruamel.yaml is objectively better than PyYAML, which lacks critical round-trip editing capabilities as well as up-to-date YAML/Compatible data parsing capabilities (at the time of this writing).
Should PyYAML ever merge with -- or at least, catch up with -- ruamel.yaml, this project can be (lightly) adapted to depend on it, instead. These conversations may offer some insight into when or whether this might happen:
The Files of This Project
This repository contains:
- Generally-useful Python library files. These contain the reusable core of this project's YAML Path capabilities.
- Some implementations of those libraries, exhibiting their capabilities and simple-to-use APIs as command-line tools.
- Various support, documentation, and build files.
Command-Line Tools
This project provides some command-line tool implementations which utilize YAML Path. For some use-case examples of these tools, see below.
The supplied command-line tools include:
usage: eyaml-rotate-keys [-h] [-V] [-d | -v | -q] [-b] [-x EYAML]
-i OLDPRIVATEKEY -c OLDPUBLICKEY
-r NEWPRIVATEKEY -u NEWPUBLICKEY
YAML_FILE [YAML_FILE ...]
Rotates the encryption keys used for all EYAML values within a set of YAML
files, decrypting with old keys and re-encrypting using replacement keys.
positional arguments:
YAML_FILE one or more YAML files containing EYAML values
optional arguments:
-h, --help show this help message and exit
-V, --version show program's version number and exit
-d, --debug output debugging details
-v, --verbose increase output verbosity
-q, --quiet suppress all output except errors
-b, --backup save a backup of each modified YAML_FILE with an extra
.bak file-extension
-x EYAML, --eyaml EYAML
the eyaml binary to use when it isn't on the PATH
EYAML_KEYS:
All key arguments are required
-r NEWPRIVATEKEY, --newprivatekey NEWPRIVATEKEY
the new EYAML private key
-u NEWPUBLICKEY, --newpublickey NEWPUBLICKEY
the new EYAML public key
-i OLDPRIVATEKEY, --oldprivatekey OLDPRIVATEKEY
the old EYAML private key
-c OLDPUBLICKEY, --oldpublickey OLDPUBLICKEY
the old EYAML public key
Any YAML_FILEs lacking EYAML values will not be modified (or backed up, even
when -b/--backup is specified).
usage: yaml-get [-h] [-V] -p YAML_PATH
[-t ['.', '/', 'auto', 'dot', 'fslash']] [-S] [-x EYAML]
[-r PRIVATEKEY] [-u PUBLICKEY] [-d | -v | -q]
[YAML_FILE]
Retrieves one or more values from a YAML/JSON/Compatible file at a specified
YAML Path. Output is printed to STDOUT, one line per result. When a result is
a complex data-type (Array or Hash), a JSON dump is produced to represent it.
EYAML can be employed to decrypt the values.
positional arguments:
YAML_FILE the YAML file to query; omit or use - to read from
STDIN
optional arguments:
-h, --help show this help message and exit
-V, --version show program's version number and exit
-t ['.', '/', 'auto', 'dot', 'fslash'], --pathsep ['.', '/', 'auto', 'dot', 'fslash']
indicate which YAML Path seperator to use when
rendering results; default=dot
-S, --nostdin Do not implicitly read from STDIN, even when YAML_FILE
is not set and the session is non-TTY
-d, --debug output debugging details
-v, --verbose increase output verbosity
-q, --quiet suppress all output except errors
required settings:
-p YAML_PATH, --query YAML_PATH
YAML Path to query
EYAML options:
Left unset, the EYAML keys will default to your system or user defaults.
Both keys must be set either here or in your system or user EYAML
configuration file when using EYAML.
-x EYAML, --eyaml EYAML
the eyaml binary to use when it isn't on the PATH
-r PRIVATEKEY, --privatekey PRIVATEKEY
EYAML private key
-u PUBLICKEY, --publickey PUBLICKEY
EYAML public key
For more information about YAML Paths, please visit
https://github.com/wwkimball/yamlpath.
usage: yaml-merge [-h] [-V] [-c CONFIG] [-a {stop,left,right,rename}]
[-A {all,left,right,unique}] [-H {deep,left,right}]
[-O {all,deep,left,right,unique}] [-m YAML_PATH]
[-o OUTPUT | -w OVERWRITE] [-b] [-D {auto,json,yaml}] [-S]
[-d | -v | -q]
[YAML_FILE [YAML_FILE ...]]
Merges two or more YAML/JSON/Compatible files together.
positional arguments:
YAML_FILE one or more YAML files to merge, order-significant;
omit or use - to read from STDIN
optional arguments:
-h, --help show this help message and exit
-V, --version show program's version number and exit
-c CONFIG, --config CONFIG
INI syle configuration file for YAML Path specified
merge control options
-a {stop,left,right,rename}, --anchors {stop,left,right,rename}
means by which Anchor name conflicts are resolved
(overrides [defaults]anchors set via --config|-c and
cannot be overridden by [rules] because Anchors apply
to the whole file); default=stop
-A {all,left,right,unique}, --arrays {all,left,right,unique}
default means by which Arrays are merged together
(overrides [defaults]arrays but is overridden on a
YAML Path basis via --config|-c); default=all
-H {deep,left,right}, --hashes {deep,left,right}
default means by which Hashes are merged together
(overrides [defaults]hashes but is overridden on a
YAML Path basis in [rules] set via --config|-c);
default=deep
-O {all,deep,left,right,unique}, --aoh {all,deep,left,right,unique}
default means by which Arrays-of-Hashes are merged
together (overrides [defaults]aoh but is overridden on
a YAML Path basis in [rules] set via --config|-c);
default=all
-m YAML_PATH, --mergeat YAML_PATH
YAML Path indicating where in left YAML_FILE the right
YAML_FILE content is to be merged; default=/
-o OUTPUT, --output OUTPUT
Write the merged result to the indicated nonexistent
file
-w OVERWRITE, --overwrite OVERWRITE
Write the merged result to the indicated file; will
replace the file when it already exists
-b, --backup save a backup OVERWRITE file with an extra .bak
file-extension; applies only to OVERWRITE
-D {auto,json,yaml}, --document-format {auto,json,yaml}
Force the merged result to be presented in one of the
supported formats or let it automatically match the
known file-name extension of OUTPUT|OVERWRITE (when
provided), or match the type of the first document;
default=auto
-S, --nostdin Do not implicitly read from STDIN, even when there are
no - pseudo-files in YAML_FILEs with a non-TTY session
-d, --debug output debugging details
-v, --verbose increase output verbosity
-q, --quiet suppress all output except errors (implied when
-o|--output is not set)
The CONFIG file is an INI file with up to three sections:
[defaults] Sets equivalents of -a|--anchors, -A|--arrays,
-H|--hashes, and -O|--aoh.
[rules] Each entry is a YAML Path assigning -A|--arrays,
-H|--hashes, or -O|--aoh for precise nodes.
[keys] Wherever -O|--aoh=DEEP, each entry is treated as a
record with an identity key. In order to match RHS
records to LHS records, a key must be known and is
identified on a YAML Path basis via this section.
Where not specified, the first attribute of the first
record in the Array-of-Hashes is presumed the identity
key for all records in the set.
The left-to-right order of YAML_FILEs is significant. Except
when this behavior is deliberately altered by your options, data
from files on the right overrides data in files to their left.
Only one input file may be the - pseudo-file (read from STDIN).
When no YAML_FILEs are provided, - will be inferred as long as you
are running this program without a TTY (unless you set
--nostdin|-S). Any file, including input from STDIN, may be a
multi-document YAML or JSON file.
For more information about YAML Paths, please visit
https://github.com/wwkimball/yamlpath.
usage: yaml-paths [-h] [-V] -s EXPRESSION [-c EXPRESSION] [-m] [-L] [-F] [-X]
[-P] [-t ['.', '/', 'auto', 'dot', 'fslash']] [-i | -k | -K]
[-a] [-A | -Y | -y | -l] [-e] [-x EYAML] [-r PRIVATEKEY]
[-u PUBLICKEY] [-S] [-d | -v | -q]
[YAML_FILE [YAML_FILE ...]]
Returns zero or more YAML Paths indicating where in given YAML/JSON/Compatible
data one or more search expressions match. Values, keys, and/or anchors can be
searched. EYAML can be employed to search encrypted values.
positional arguments:
YAML_FILE one or more YAML files to search; omit or use - to
read from STDIN
optional arguments:
-h, --help show this help message and exit
-V, --version show program's version number and exit
-c EXPRESSION, --except EXPRESSION
except results matching this search expression; can be
set more than once
-m, --expand expand matching parent nodes to list all permissible
child leaf nodes (see "reference handling options" for
restrictions)
-t ['.', '/', 'auto', 'dot', 'fslash'], --pathsep ['.', '/', 'auto', 'dot', 'fslash']
indicate which YAML Path seperator to use when
rendering results; default=dot
-a, --refnames also search the names of &anchor and *alias references
-S, --nostdin Do not implicitly read from STDIN, even when there are
no - pseudo-files in YAML_FILEs with a non-TTY session
-d, --debug output debugging details
-v, --verbose increase output verbosity
-q, --quiet suppress all non-result output except errors
required settings:
-s EXPRESSION, --search EXPRESSION
the search expression; can be set more than once
result printing options:
-L, --values print the values or elements along with each YAML Path
(complex results are emitted as JSON; use --expand to
emit only simple values)
-F, --nofile omit source file path and name decorators from the
output (applies only when searching multiple files)
-X, --noexpression omit search expression decorators from the output
-P, --noyamlpath omit YAML Paths from the output (useful with --values
or to indicate whether a file has any matches without
printing them all, perhaps especially with
--noexpression)
key name searching options:
-i, --ignorekeynames (default) do not search key names
-k, --keynames search key names in addition to values and array
elements
-K, --onlykeynames only search key names (ignore all values and array
elements)
reference handling options:
Indicate how to treat anchor and alias references. An anchor is an
original, reusable key or value. All aliases become replaced by the
anchors they reference when YAML data is read. These options specify how
to handle this duplication of keys and values. Note that the default
behavior includes all aliased keys but not aliased values.
-A, --anchorsonly include only original matching key and value anchors
in results, discarding all aliased keys and values
(including child nodes)
-Y, --allowkeyaliases
(default) include matching key aliases, permitting
search traversal into their child nodes
-y, --allowvaluealiases
include matching value aliases (does not permit search
traversal into aliased keys)
-l, --allowaliases include all matching key and value aliases
EYAML options:
Left unset, the EYAML keys will default to your system or user defaults.
Both keys must be set either here or in your system or user EYAML
configuration file when using EYAML.
-e, --decrypt decrypt EYAML values in order to search them
(otherwise, search the encrypted blob)
-x EYAML, --eyaml EYAML
the eyaml binary to use when it isn't on the PATH
-r PRIVATEKEY, --privatekey PRIVATEKEY
EYAML private key
-u PUBLICKEY, --publickey PUBLICKEY
EYAML public key
A search or exception EXPRESSION takes the form of a YAML Path search operator
-- %, $, =, ^, >, <, >=, <=, =~, or ! -- followed by the search term, omitting
the left-hand operand. For more information about YAML Paths, please visit
https://github.com/wwkimball/yamlpath.
usage: yaml-set [-h] [-V] -g YAML_PATH [-a VALUE | -f FILE | -i | -R LENGTH]
[-F {bare,boolean,default,dquote,float,folded,int,literal,squote}]
[-c CHECK] [-s YAML_PATH] [-m] [-b]
[-t ['.', '/', 'auto', 'dot', 'fslash']] [-M CHARS] [-e]
[-x EYAML] [-r PRIVATEKEY] [-u PUBLICKEY] [-S] [-d | -v | -q]
[YAML_FILE]
Changes one or more Scalar values in a YAML/JSON/Compatible document at a
specified YAML Path. Matched values can be checked before they are replaced to
mitigate accidental change. When matching singular results, the value can be
archived to another key before it is replaced. Further, EYAML can be employed
to encrypt the new values and/or decrypt an old value before checking it.
positional arguments:
YAML_FILE the YAML file to update; omit or use - to read from
STDIN
optional arguments:
-h, --help show this help message and exit
-V, --version show program's version number and exit
-F {bare,boolean,default,dquote,float,folded,int,literal,squote}, --format {bare,boolean,default,dquote,float,folded,int,literal,squote}
override automatic formatting of the new value
-c CHECK, --check CHECK
check the value before replacing it
-s YAML_PATH, --saveto YAML_PATH
save the old value to YAML_PATH before replacing it;
implies --mustexist
-m, --mustexist require that the --change YAML_PATH already exist in
YAML_FILE
-b, --backup save a backup YAML_FILE with an extra .bak file-
extension
-t ['.', '/', 'auto', 'dot', 'fslash'], --pathsep ['.', '/', 'auto', 'dot', 'fslash']
indicate which YAML Path seperator to use when
rendering results; default=dot
-M CHARS, --random-from CHARS
characters from which to build a value for --random;
default=all upper- and lower-case letters and all
digits
-S, --nostdin Do not implicitly read from STDIN, even when there is
no YAML_FILE with a non-TTY session
-d, --debug output debugging details
-v, --verbose increase output verbosity
-q, --quiet suppress all output except errors
required settings:
-g YAML_PATH, --change YAML_PATH
YAML Path where the target value is found
input options:
-a VALUE, --value VALUE
set the new value from the command-line instead of
STDIN
-f FILE, --file FILE read the new value from file (discarding any trailing
new-lines)
-i, --stdin accept the new value from STDIN (best for sensitive
data)
-R LENGTH, --random LENGTH
randomly generate a replacement value of a set length
EYAML options:
Left unset, the EYAML keys will default to your system or user defaults.
You do not need to supply a private key unless you enable --check and the
old value is encrypted.
-e, --eyamlcrypt encrypt the new value using EYAML
-x EYAML, --eyaml EYAML
the eyaml binary to use when it isn't on the PATH
-r PRIVATEKEY, --privatekey PRIVATEKEY
EYAML private key
-u PUBLICKEY, --publickey PUBLICKEY
EYAML public key
When no changes are made, no backup is created, even when -b/--backup is
specified. For more information about YAML Paths, please visit
https://github.com/wwkimball/yamlpath.
Libraries
While there are several supporting library files like enumerations, types, and exceptions, the most interesting library files include:
- yamlpath.py -- The core YAML Path parser logic.
- processor.py -- Processes YAMLPath instances to read or write data to YAML/Compatible sources.
- eyamlprocessor.py -- Extends the Processor class to support EYAML data encryption and decryption.
- merger.py -- The core document merging logic.
Basic Usage
The files of this project can be used either as command-line tools or as libraries to supplement your own work.
Basic Usage: Command-Line Tools
The command-line tools are self-documented and their documentation is captured above for easy reference. Simply pass --help to them in order to obtain the same detailed documentation.
Please review the comprehensive test_commands_*.py unit tests to explore samples of YAML files and the many ways these tools help get and set their data.
The following are some simple examples of their typical use-cases.
Rotate Your EYAML Keys
If the eyaml command is already on your PATH (if not, be sure to also supply the optional --eyaml or -x argument):
eyaml-rotate-keys \
--oldprivatekey=~/old-keys/private_key.pkcs7.pem \
--oldpublickey=~/old-keys/public_key.pkcs7.pem \
--newprivatekey=~/new-keys/private_key.pkcs7.pem \
--newpublickey=~/new-keys/public_key.pkcs7.pem \
my_1st_yaml_file.yaml my_2nd_yaml_file.eyaml ... my_Nth_yaml_file.yaml
You could combine this with find and xargs if your E/YAML files are dispersed through a directory hierarchy, as with Hiera data.
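The same idea can be sketched in Python instead of find and xargs: gather the files recursively and assemble the argument list for eyaml-rotate-keys. The build_rotate_command helper, the file extensions it looks for, and the key locations are assumptions made for illustration; the flags are the ones documented above. The sketch builds a throwaway directory so it can run anywhere, and it only constructs the command without executing it.

```python
import tempfile
from pathlib import Path


def build_rotate_command(root, old_private, old_public, new_private, new_public):
    """Build an eyaml-rotate-keys argument list for files found under root."""
    files = sorted(
        str(path)
        for path in Path(root).rglob("*")
        if path.is_file() and path.suffix in {".yaml", ".eyaml"}
    )
    return [
        "eyaml-rotate-keys",
        f"--oldprivatekey={old_private}",
        f"--oldpublickey={old_public}",
        f"--newprivatekey={new_private}",
        f"--newpublickey={new_public}",
        *files,
    ]


# Build a small, temporary hierarchy to demonstrate the file discovery.
with tempfile.TemporaryDirectory() as root:
    base = Path(root)
    (base / "nodes").mkdir()
    (base / "common.yaml").write_text("---\n")
    (base / "nodes" / "web.eyaml").write_text("---\n")
    (base / "README.txt").write_text("not YAML\n")
    command = build_rotate_command(
        root,
        "old-keys/private_key.pkcs7.pem",
        "old-keys/public_key.pkcs7.pem",
        "new-keys/private_key.pkcs7.pem",
        "new-keys/public_key.pkcs7.pem",
    )

file_names = [Path(argument).name for argument in command[5:]]
print(file_names)
```

Against a real directory, the resulting list could be handed to subprocess.run(command, check=True) once the key locations point at your actual keys.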
EYAML Compatibility Alert
The maintainers of the hiera-eyaml project have released version 3.x and it is not backward compatible with encryption certificates generated for hiera-eyaml version 2.x. This has nothing to do with YAML Path and is alerted here only as a courtesy to YAML Path users. If you upgrade your installation of hiera-eyaml without first updating your encryption certificates and using a tool like eyaml-rotate-keys (provided here) to re-encrypt your data with the replacement certificates, hiera-eyaml 3.x will fail to decrypt your data! This is not a problem with YAML Path. hiera-eyaml certificate compatibility is well outside the purview of YAML Path and its tools.
Get a YAML Value
At its simplest:
yaml-get \
--query=see.documentation.above.for.many.samples \
my_yaml_file.yaml
Search For YAML Paths
Simplest use:
yaml-paths \
--search=%word \
/some/directory/*.yaml
Search for multiple expressions and exclude unwanted results:
yaml-paths \
--search=^another \
--search=$word \
--except=%bad \
/some/directory/*.yaml
Return all leaf nodes under matching parents (most useful when matching against Hash keys and you only want the original leaf nodes beneath them):
yaml-paths \
--expand \
--keynames \
--search==parent_node \
/some/directory/*.yaml
Change a YAML Value
For a no-frills change to a YAML file with deeply nested Hash structures:
yaml-set \
--change=see.documentation.above.for.many.samples \
--value="New Value" \
my_yaml_file.yaml
To rotate a password, preserving the old password perhaps so your automation can apply the new password to your application(s):
yaml-set \
--mustexist \
--change=the.new.password \
--saveto=the.old.password \
--value="New Password" \
my_yaml_file.yaml
For the extremely cautious, you could check the old password before rotating it and save a backup of the original file:
yaml-set \
--mustexist \
--change=the.new.password \
--saveto=the.old.password \
--check="Old Password" \
--value="New Password" \
--backup \
my_yaml_file.yaml
You can also add EYAML encryption (assuming the eyaml command is on your PATH; if not, you can pass --eyaml to specify its location). In this example, I add the optional --format=folded so that the long EYAML value is broken up into a multi-line value rather than one very long string. This is the preferred format for human legibility as well as EYAML consumers like Puppet. Note that --format has several other settings and applies only to new values.
yaml-set \
--change=the.new.password \
--mustexist \
--saveto=the.old.password \
--check="Old Password" \
--value="New Password" \
--eyamlcrypt \
--format=folded \
--backup \
my_yaml_file.yaml
You can even tell EYAML which keys to use, if not your default system or user keys:
yaml-set \
--change=the.new.password \
--mustexist \
--saveto=the.old.password \
--check="Old Password" \
--value="New Password" \
--eyamlcrypt \
--format=folded \
--privatekey=/secret/keys/private_key.pkcs7.pem \
--publickey=/secret/keys/public_key.pkcs7.pem \
--backup \
my_yaml_file.yaml
Note that for even greater security scenarios, you can keep the new value off of your command-line, process list, and command history by swapping out --value for one of --stdin, --file, or even --random LENGTH (which uses Python's strongest random value generator and suits cases where you don't need to specify the replacement value in advance).
Merge YAML/Compatible Files
At its simplest, the yaml-merge command accepts two or more input files and merges them together from left-to-right, writing the result to STDOUT:
yaml-merge leftmost.yaml middle.yaml right.json
If you'd rather write the results to a new output file (which must not already exist):
yaml-merge \
--output=newfile.yaml \
leftmost.yaml \
middle.yaml \
right.json
Should you wish to merge the content of the files into a specific location (or even multiple locations) within the leftmost document, specify a YAML Path via the --mergeat or -m argument:
yaml-merge \
--mergeat=/anywhere/within/the/document \
leftmost.yaml \
middle.yaml \
right.json
To write arbitrary data from STDIN into a document, use the - pseudo-file:
echo "{arbitrary: [document, structure]}" | yaml-merge target.yaml -
Combine --mergeat or -m with the STDIN pseudo-file to control where the data is to be written:
echo "{arbitrary: [document, structure]}" | \
yaml-merge \
--mergeat=/anywhere/within/the/document \
target.yaml -
There are many options for precisely controlling how the merge is performed, including the ability to specify complex rules on a YAML Path basis via a configuration file. Review the command's --help or the related Wiki for more detail.
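The --help output describes the configuration file as an INI file with up to three sections: [defaults], [rules], and [keys]. The following Python sketch shows one plausible shape for such a file and confirms that it parses as INI with the standard library. The YAML Paths and the chosen values are hypothetical, and the exact entry syntax expected by yaml-merge is an assumption here; consult the Wiki before relying on it.

```python
import configparser

# A hypothetical merge configuration. Section and option names follow the
# --help text; the YAML Paths and values are made up for illustration.
CONFIG_TEXT = """
[defaults]
anchors = stop
arrays = unique
hashes = deep
aoh = deep

[rules]
/hypothetical/list = left

[keys]
/hypothetical/users = name
"""

parser = configparser.ConfigParser()
parser.read_string(CONFIG_TEXT)

for section in parser.sections():
    print(section, dict(parser[section]))
```

Saved to a file, a configuration like this would be passed to the tool with --config or -c.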
Basic Usage: Libraries
As for the libraries, they are also heavily documented and the example implementations may perhaps serve as good copy-paste fodder (provided you give credit to the source). That said, here's a general flow/synopsis.
Initialize ruamel.yaml and These Helpers
Your preferences may differ, but I use this setup for round-trip YAML parsing
and editing with ruamel.yaml. When you need to process EYAML encrypted data,
replace yamlpath.Processor
with yamlpath.eyaml.EYAMLProcessor
and add error
handling for yamlpath.eyaml.EYAMLCommandException
.
Note that import yamlpath.patches is entirely optional. I wrote and use it to
block ruamel.yaml's Emitter from injecting unnecessary newlines into folded
values (it improperly converts every single newline into two for left-flushed
multi-line values, at the time of this writing). Since "block" output EYAML
values are left-flushed multi-line folded strings, this fix is necessary when
using EYAML features, at least until ruamel.yaml has its own fix for this
issue.
Note also that these examples use ConsolePrinter to handle STDOUT and STDERR
messaging. You don't have to. However, some kind of logger must be passed to
these libraries so they can write messages somewhere. Your custom message
handler or logger must provide the same API as ConsolePrinter; review the
header documentation in consoleprinter.py for details. Generally speaking, it
would be trivial to write your own custom wrapper for Python's standard logging
facilities if you require targets other than STDOUT and STDERR.
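As a rough illustration of such a custom wrapper, the sketch below routes ConsolePrinter-style calls to Python's standard logging module. The class name LoggingPrinter is hypothetical. The method names and signatures (info, verbose, warning, error, critical, debug) and the quiet/verbose/debug attributes on args are assumptions about the ConsolePrinter API; verify them against the header documentation in consoleprinter.py before relying on this.

```python
import logging
import sys
from types import SimpleNamespace


class LoggingPrinter:
    """Hypothetical ConsolePrinter stand-in backed by the logging module."""

    def __init__(self, args, logger=None):
        # args is assumed to carry boolean quiet, verbose, and debug attributes
        self.args = args
        self.logger = logger or logging.getLogger("yamlpath")

    def info(self, message):
        # Informational output is suppressed in quiet mode
        if not getattr(self.args, "quiet", False):
            self.logger.info(message)

    def verbose(self, message):
        # Verbose output appears only when verbose or debug is requested
        if getattr(self.args, "verbose", False) or getattr(self.args, "debug", False):
            self.logger.info(message)

    def warning(self, message):
        self.logger.warning(message)

    def error(self, message, exit_code=None):
        # Optionally terminate after reporting the error
        self.logger.error(message)
        if exit_code is not None:
            sys.exit(exit_code)

    def critical(self, message, exit_code=1):
        # Critical messages always terminate the program
        self.logger.critical(message)
        sys.exit(exit_code)

    def debug(self, message, **kwargs):
        # Extra keyword arguments are accepted but ignored in this sketch
        if getattr(self.args, "debug", False):
            self.logger.debug(message)
```

An instance of this class would take the place of ConsolePrinter wherever a logger is passed, with handlers configured through the logging module as usual.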
import sys
from ruamel.yaml import YAML
from ruamel.yaml.parser import ParserError
import yamlpath.patches
from yamlpath.func import get_yaml_data, get_yaml_editor
from yamlpath.wrappers import ConsolePrinter
from yamlpath import Processor
# Process command-line arguments and initialize the output writer
args = processcli()
log = ConsolePrinter(args)
# Prep the YAML parser and round-trip editor (tweak to your needs)
yaml = get_yaml_editor()
# At this point, you'd load or parse your YAML file, stream, or string. When
# loading from file, I typically follow this pattern:
yaml_data = get_yaml_data(yaml, log, yaml_file)
if yaml_data is None:
    # There was an issue loading the file; an error message has already been
    # printed.
    sys.exit(1)
# Pass the log writer and parsed YAML data to the YAMLPath processor
processor = Processor(log, yaml_data)
# At this point, the processor is ready to handle YAMLPaths
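The setup above calls processcli() without defining it; it stands for whatever command-line parsing your program performs. One possible shape, using only argparse, is sketched below. The option names and the yaml_file positional are illustrative, and the assumption that the logger reads debug, verbose, and quiet attributes from the parsed arguments should be checked against consoleprinter.py.

```python
import argparse


def processcli(argv=None):
    """Parse command-line arguments (hypothetical stand-in for processcli())."""
    parser = argparse.ArgumentParser(
        description="Query or change YAML/Compatible data")

    # Only one output-noise level may be chosen at a time
    noise = parser.add_mutually_exclusive_group()
    noise.add_argument("-d", "--debug", action="store_true",
                       help="output debugging details")
    noise.add_argument("-v", "--verbose", action="store_true",
                       help="increase output verbosity")
    noise.add_argument("-q", "--quiet", action="store_true",
                       help="suppress all output except errors")

    # The file to load and process
    parser.add_argument("yaml_file", help="the YAML/Compatible file to load")

    # argv=None makes argparse read sys.argv[1:]
    return parser.parse_args(argv)
```

With this in place, args.yaml_file can supply the yaml_file value passed to get_yaml_data above.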
Searching for YAML Nodes
These libraries use Generators to get nodes from parsed YAML data. Identify
which node(s) to get via YAML Path strings. You should also catch any
yamlpath.exceptions.YAMLPathException unless you prefer Python's native stack
traces. When using EYAML, you should also catch any
yamlpath.eyaml.exceptions.EYAMLCommandException for the same reason. Whether
you are working with a single result or many, you should consume the Generator
output with a pattern similar to:
from yamlpath import YAMLPath
from yamlpath.exceptions import YAMLPathException
yaml_path = YAMLPath("see.documentation.above.for.many.samples")
try:
    for node_coordinate in processor.get_nodes(yaml_path):
        log.debug("Got {} from '{}'.".format(node_coordinate, yaml_path))
        # Do something with each node_coordinate.node (the actual data)
except YAMLPathException as ex:
    # If merely retrieving data, this exception may be deemed non-critical
    # unless your later code absolutely depends upon a result.
    log.error(ex)
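The try block above wraps the whole loop because Python generators are lazy: creating one runs none of its body, so any exception surfaces only while the results are being consumed. The sketch below demonstrates that general Python behavior with stand-ins only. stand_in_get_nodes and StandInPathError are hypothetical and are not the yamlpath API.

```python
class StandInPathError(Exception):
    """Hypothetical stand-in for a path-lookup exception."""


def stand_in_get_nodes(data, keys):
    """Hypothetical stand-in for a node generator: lazily yield data[key]."""
    for key in keys:
        if key not in data:
            raise StandInPathError("No such key: {}".format(key))
        yield data[key]


data = {"a": 1, "b": 2}

# Creating the generator runs none of its body, so nothing is raised here
nodes = stand_in_get_nodes(data, ["a", "missing", "b"])

found = []
error = None
try:
    # The exception appears only once iteration reaches the bad key
    for node in nodes:
        found.append(node)
except StandInPathError as ex:
    error = str(ex)

# For a single expected result, next() with a default avoids StopIteration
first = next(stand_in_get_nodes(data, ["b"]), None)
```

Results gathered before the failure remain available, which is why a retrieval error can sometimes be treated as non-critical.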
Changing Values
At its simplest, you only need to supply the YAML Path to one or more nodes
to update, and the value to apply to them. Catching
yamlpath.exceptions.YAMLPathException is optional but usually preferred over
allowing Python to dump the call stack in front of your users. When using
EYAML, the same applies to yamlpath.eyaml.exceptions.EYAMLCommandException.
from yamlpath.exceptions import YAMLPathException
# Needed only when using EYAML
from yamlpath.eyaml.exceptions import EYAMLCommandException
try:
    processor.set_value(yaml_path, new_value)
except YAMLPathException as ex:
    log.critical(ex, 119)
except EYAMLCommandException as ex:
    log.critical(ex, 120)
Merging Documents
A document merge naturally requires at least two documents. At the code-level,
this means two populated DOM objects (populated instances of yaml_data from
above). You do not need to use a Processor for merging. In the least amount of
code, a merge looks like:
from yamlpath.exceptions import YAMLPathException
from yamlpath.merger.exceptions import MergeException
from yamlpath.merger import Merger, MergerConfig
# Obtain or build the lhs_data and rhs_data objects using get_yaml_data or
# equivalent.
# You'll still need to supply a logger and some arguments used by the merge
# engine. For purely default behavior, you could create args as a bare
# SimpleNamespace. Initialize the new Merger instance with the LHS document.
merger = Merger(log, lhs_data, MergerConfig(log, args))
# Merge RHS into LHS
try:
    merger.merge_with(rhs_data)
except MergeException as mex:
    log.critical(mex, 129)
except YAMLPathException as yex:
    log.critical(yex, 130)
# At this point, merger.data is the merged result; do what you will with it,
# including merging more data into it. When you are ready to dump (write)
# out the merged data, you must prepare the document and your
# ruamel.yaml.YAML instance -- usually obtained from func.get_yaml_editor()
# -- like this:
merger.prepare_for_dump(my_yaml_editor)
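The comments above mention creating args as a bare SimpleNamespace when you want purely default merge behavior. A minimal sketch of that idea follows. It assumes the logger reads debug, verbose, and quiet attributes directly (check consoleprinter.py), in which case the logger's namespace must define them even though the merge arguments can stay empty. The variable names are illustrative.

```python
from types import SimpleNamespace

# Arguments for the logger: these attribute names are assumptions, and a logger
# that reads them directly would raise AttributeError on a bare namespace
log_args = SimpleNamespace(debug=False, verbose=False, quiet=True)

# Arguments for the merge engine: a bare namespace sets no options at all,
# leaving every merge behavior at its default, as the comments above describe
merge_args = SimpleNamespace()

# Existing settings held in a dict can be converted in one step
settings = {"debug": True, "verbose": False, "quiet": False}
debug_args = SimpleNamespace(**settings)

# Absent attributes can be probed safely with a default
has_debug = getattr(merge_args, "debug", None)
```

Here log_args would be passed to the logger and merge_args to MergerConfig, in place of the single args object used above.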