A simple utility to define directory structures in YAML and populate them on disk in a single command.

yaml_to_disk

A simple tool that lets you define a directory structure in YAML form, then populate it on disk in a single command. It is especially useful for simplifying test case setup in doctest settings, where readability is critical.

1. Installation

pip install yaml_to_disk

If you plan on using the Parquet functionality, install the optional pyarrow dependency with:

pip install "yaml_to_disk[parquet]"

If you want to work with TOML files, install the optional tomli-w dependency:

pip install "yaml_to_disk[toml]"

2. Usage

To use it, define a YAML representation of the files you want to populate, then pass it to the yaml_disk function. For example:

>>> from yaml_to_disk import yaml_disk
>>> target_contents = '''
... dir1:
...   "sub1.txt/":
...     file1.txt: "Hello, World!"
...   sub2:
...     cfg.yaml: {"foo": "bar"}
...     data.csv: |-2
...       a,b,c
...       1,2,3
... a.json:
...   - key1: value1
...     key2: value2
...   - str_element
... '''
>>> with yaml_disk(target_contents) as root_path:
...     print_directory(root_path)
...     print("---------------------")
...     print(f"file1.txt contents: {(root_path / 'dir1' / 'sub1.txt' / 'file1.txt').read_text()}")
...     print(f"a.json contents: {(root_path / 'a.json').read_text()}")
...     print("cfg.yaml contents:")
...     print((root_path / 'dir1' / 'sub2' / 'cfg.yaml').read_text().strip())
...     print("data.csv contents:")
...     print((root_path / 'dir1' / 'sub2' / 'data.csv').read_text().strip())
├── a.json
└── dir1
    ├── sub1.txt
    │   └── file1.txt
    └── sub2
        ├── cfg.yaml
        └── data.csv
---------------------
file1.txt contents: Hello, World!
a.json contents: [{"key1": "value1", "key2": "value2"}, "str_element"]
cfg.yaml contents:
foo: bar
data.csv contents:
a,b,c
1,2,3

You can also pass a filepath that contains the target yaml on disk, or a parsed view of the yaml contents (e.g., as a dictionary or a list):

>>> import tempfile
>>> from pathlib import Path
>>> with tempfile.TemporaryDirectory() as temp_dir:
...     yaml_path = Path(temp_dir) / "target.yaml"
...     _ = yaml_path.write_text(target_contents)
...     with yaml_disk(yaml_path) as root_path:
...         print_directory(root_path)
├── a.json
└── dir1
    ├── sub1.txt
    │   └── file1.txt
    └── sub2
        ├── cfg.yaml
        └── data.csv
>>> as_list = ["foo.png"] # Note that this will only make an empty file with this name
>>> with yaml_disk(as_list) as root_path:
...     print_directory(root_path)
└── foo.png
>>> as_dict = {"foo.pkl": {"bar": "baz"}}
>>> import pickle
>>> with yaml_disk(as_dict) as root_path:
...     print_directory(root_path)
...     print("----------------------")
...     with open(root_path / "foo.pkl", "rb") as f:
...         print(f"foo.pkl contents: {pickle.load(f)}")
└── foo.pkl
----------------------
foo.pkl contents: {'bar': 'baz'}

YAML Syntax

The YAML syntax specifies a list or an ordered dictionary of nested files and directories. In list form, a plain string entry is either a file name (if it does not end in /) or a directory name (if it does end in /), and the file or directory is created at the corresponding location. If the entry is a dictionary, it must have a single key, which is the file or directory name; the value is either the file contents (in various representations) or the nested directory contents. In dictionary form, directory names are not required to end in /, because contents can only be given for files with extensions, which is how the package knows how to format them.

DIR_NAME:
  SUB_DIR_NAME:
    - FILE_NAME.EXT: FILE_CONTENT
    - FILE_NAME.EXT # No contents, just an empty file
    - SUB_DIR_NAME/ # No contents, just an empty directory
  SUB_DIR_NAME:
    FILE_NAME.EXT: FILE_CONTENT # Can also use a dictionary representation rather than a list if suitable
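
The naming rules above can be sketched in plain Python. This is only an illustration of the rules as described in this section, not the package's implementation: the function name plan and its (path, kind) output are hypothetical, and it works on an already-parsed structure (nested dicts and lists) rather than on YAML text.

```python
from collections.abc import Iterator
from pathlib import PurePosixPath
from typing import Any


def plan(spec: Any, parent: PurePosixPath = PurePosixPath(".")) -> Iterator[tuple[str, str]]:
    """Yield (path, kind) pairs, where kind is "dir" or "file".

    Hypothetical helper: it mirrors the rules described above and
    ignores file contents entirely.
    """
    # A dictionary is treated like a list of single-key dictionaries.
    entries = [{k: v} for k, v in spec.items()] if isinstance(spec, dict) else list(spec)
    for entry in entries:
        if isinstance(entry, str):
            # Plain string: a directory only if it ends in "/".
            name, value = entry, None
            is_dir = name.endswith("/")
        else:
            # Dictionary entry: exactly one key (unpacking fails otherwise).
            ((name, value),) = entry.items()
            # A trailing "/" or a missing extension marks a directory.
            is_dir = name.endswith("/") or PurePosixPath(name).suffix == ""
        path = parent / name.rstrip("/")
        if is_dir:
            yield str(path), "dir"
            if value:
                yield from plan(value, path)
        else:
            yield str(path), "file"


spec = {
    "dir1": {
        "sub1.txt/": {"file1.txt": "Hello, World!"},
        "sub2": ["cfg.yaml", "empty/"],
    },
    "a.json": [1, 2],
}
for path, kind in plan(spec):
    print(kind, path)
# dir dir1
# dir dir1/sub1.txt
# file dir1/sub1.txt/file1.txt
# dir dir1/sub2
# file dir1/sub2/cfg.yaml
# dir dir1/sub2/empty
# file a.json
```

Note how "sub1.txt/" is a directory despite its extension because of the trailing slash, while "sub2" is a directory because it has no extension.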

Supported Extensions:

| Extension | Description | Accepted contents | Write method |
|---|---|---|---|
| txt | Plain text file | Plain strings | Written as is |
| json | JSON file | Any JSON compatible object | Written via json.dump |
| yaml, yml | YAML file | Any YAML compatible object | Written via yaml.dump |
| pkl | Pickle file | Any pickle serializable object | Written via pickle.dump |
| csv | CSV file | CSV data as a string, a column-map, or a list of rows | See CSVFile for details |
| tsv | TSV file | TSV data in the same formats as CSV | See TSVFile for details |
| parquet | Parquet file | pyarrow.Table, column-map, or list of row maps | Written via pyarrow.parquet.write_table |
| toml | TOML file | Any TOML compatible object | Written via tomli_w.dumps |

Other extensions can be used, but only to create empty files.

Adding new extensions:

You can add support for your own file extensions in your own Python packages by subclassing the FileType abstract base class and implementing the necessary methods and class variables. Then register the new type by adding an entry point under the yaml_to_disk.file_types group in your pyproject.toml file. The package registers its built-in types the same way:

[project.entry-points."yaml_to_disk.file_types"]
txt = "yaml_to_disk.file_types.txt:TextFile"
json = "yaml_to_disk.file_types.json:JSONFile"
pkl = "yaml_to_disk.file_types.pkl:PickleFile"
yaml = "yaml_to_disk.file_types.yaml:YAMLFile"
csv = "yaml_to_disk.file_types.csv:CSVFile"
tsv = "yaml_to_disk.file_types.tsv:TSVFile"
parquet = "yaml_to_disk.file_types.parquet:ParquetFile"
toml = "yaml_to_disk.file_types.toml:TOMLFile"

The system will then automatically match and use your new file type. Note that you cannot override an existing file extension this way: if an override is attempted, an error is raised when the registered file types are loaded.

Note that files with unrecognized extensions and string values can be treated as .txt files. This is controlled by the class variable YamlDisk._USE_TXT_ON_UNK_STR_FILES, which is True by default, or by passing the keyword argument use_txt_on_unk_str_files to the yaml_disk function. For example:

>>> unk_file_contents_str = '''
... file1.txt: "Hello, World!"
... file2.md: "# Hello, World!"
... '''
>>> with yaml_disk(unk_file_contents_str) as root_path:
...     print_directory(root_path)
...     print("---------------------")
...     print(f"file1.txt contents: {(root_path / 'file1.txt').read_text()}")
...     print(f"file2.md contents: {(root_path / 'file2.md').read_text()}")
├── file1.txt
└── file2.md
---------------------
file1.txt contents: Hello, World!
file2.md contents: # Hello, World!
>>> with yaml_disk(unk_file_contents_str, use_txt_on_unk_str_files=False) as root_path:
...     pass # An error will be thrown
Traceback (most recent call last):
  ...
ValueError: No file type found for .md
>>> unk_file_contents_not_str = '''
... file1.txt: "Hello, World!"
... file2.pdf: [["a", "b", "c"], ["1", "2", "3"]]
... '''
>>> with yaml_disk(unk_file_contents_not_str, use_txt_on_unk_str_files=True) as root_path:
...     pass # An error will be thrown as the contents aren't string
Traceback (most recent call last):
  ...
ValueError: No file type found for .pdf
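
The fallback behavior shown above can be summarized as a small decision function. This is an illustrative sketch only, not the package's code: the function resolve_file_type and the KNOWN_EXTENSIONS set are hypothetical stand-ins (the set lists the extensions from the table of supported extensions), and the sketch covers only files that are given contents.

```python
from pathlib import PurePosixPath

# Hypothetical stand-in for the registry of supported extensions.
KNOWN_EXTENSIONS = {"txt", "json", "yaml", "yml", "pkl", "csv", "tsv", "parquet", "toml"}


def resolve_file_type(
    filename: str, contents: object, use_txt_on_unk_str_files: bool = True
) -> str:
    """Return the extension whose writer would handle this file."""
    ext = PurePosixPath(filename).suffix.lstrip(".")
    if ext in KNOWN_EXTENSIONS:
        return ext
    # Unknown extension: fall back to plain text only for string contents.
    if use_txt_on_unk_str_files and isinstance(contents, str):
        return "txt"
    raise ValueError(f"No file type found for .{ext}")


print(resolve_file_type("file2.md", "# Hello, World!"))  # txt
print(resolve_file_type("a.json", [1, 2, 3]))  # json
```

With use_txt_on_unk_str_files=False, or with non-string contents such as the nested list given for file2.pdf above, the function raises ValueError instead of falling back.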
