Python reader, writer and validator for Meta yAML (MAML).
Project description
pymaml
Official python package for reading, writing, and parsing the Meta yAML format.
(see also https://github.com/asgr/MAML)
MAML
MAML is a YAML based metadata format for tabular data (roughly implying Metadata yAML). This package is the official python interface to help read, write, and parse MAML files.
Why MAML?
We have VOTable and FITS header already?! Well, for various projects we were keen on a rich metadata format that was easy for humans and computers to both read and write. VOTable headers are very hard for humans to read and write, and FITS is very restrictive with its formatting and only useful for FITS files directly. In comes YAML, a very human and machine-readable and writable format. By restricting ourselves to a narrow subset of the language we can easily describe fairly complex table metadata (including all IVOA information). Introducing Meta yAML (MAML).
The MAML format files should be saved as example.maml etc. And the idea is the maml string can be inserted directly into a number of different file formats that accept key-value metadata (like Apache Arrow Parquet files). In the case of Parquet files they should be written to a 'maml' extension in the metadata section of the file.
MAML Metadata Format
The MAML metadata format is a structured way to describe datasets, surveys, and tables using YAML. This format ensures that all necessary information about the data is captured in a clear and organized manner.
Structure
The superset of allowed entries for MAML is below. Not all are required, but if present they should obey the order and naming.
- survey: The name of the survey. Scalar string. [optional]
- dataset: The name of the dataset. Scalar string. [recommended]
- table: The name of the table. Scalar string. [required]
- version: The version of the dataset. Scalar string, integer or float. [required]
- date: The date of the dataset in
YYYY-MM-DDformat. Scalar string. [required] - author: The lead author of the dataset, including their email. Scalar string. [required]
- coauthors: A list of co-authors, each with their email. Vector string. [optional]
- depends: A list of datasets that this dataset depends on. Vector string. [optional]
- description: A sentence or two describing the table. Scalar string. [recommended]
- comments: A list of comments or interesting facts about the data. Vector string. [optional]
- fields: A list of fields in the dataset, each with the following attributes: [required]
- name: The name of the field. Scalar string. [required]
- unit: The unit of measurement for the field (if applicable). Scalar string. [recommended]
- info: A short description of the field. Scalar string. [recommended]
- ucd: Unified Content Descriptor for IVOA (can have many). Vector string. [recommended]
- data_type: The data type of the field (e.g.,
int32,string,bool,double). Scalar string. [required] - array_size: Maximum length of character strings. Scalar integer or Scalar string. [optional]
This metadata format can be used to document datasets in a standardised way, making it easier to understand and share data within the research community. By following this format, you ensure that all relevant information about the dataset is captured and easily accessible.
This format contains the superset of metadata requirements for IVOA, Data Central and surveys like GAMA and WAVES.
If producing a maximal MAML then the metadata can be considered a MAML-Whale, and if only containing the required minimum entries it would be a MAML-Mouse. Between these two extremes you can choose your mammal of interest to reflect the quality/quantity of metadata. The sweet spot is obviously a MAML-Honey-Badger.
pymaml
Installation
pymaml can be installed easily with pip
pip install pymaml
Creating a new .maml file.
Reading in a .maml file.
Reading a maml file is easily done using the MAML object in pymaml. Reading it in this way will include validation "for free".
from pymaml import MAML
new_maml = MAML.from_file("example.maml")
This MAML object will only be created if all the the required fields are present in the maml file.
Validating a .maml file.
The pymaml package has a validate function that will audit a .maml file and return weather or not that file is valid as well as describe why it isnt valid and any warnigns that the users might wish to consider.
from pymaml import validate
validate("example.maml")
Creating a new maml file
The MAML object is the core object for building and writing maml formats and will do all validation. Using this method guarantees that the maml written is valid maml including ucd checking and date formats.
At the very least, a table name, author name, and at least one Field need to be passed:
from pymaml.maml import MAML, Field
new_maml = MAML(table="New table Name", Author="Me, myself, and I", fields = [Field(name='ra', data_type='float')])
The Field object is the main way to build new fields and will also force checks to make sure that the fields are valid.
For convience, a default maml construction can be built quickly with the class method .default()
from pymaml import MAML
default_maml = MAML.default()
Values can be updated in the normal way in python classes. Or, for convience, several setter methods are available to use including add_comment(), add_field(), and set_date()
from pymaml import MAML
maml = MAML.default()
maml.set_date("2025-01-02")
maml.add_field(Field(name="Declination", ucd="pos.eq.dec", data_type="float"))
maml.add_comment("This is an easy way to add a comment to the existing maml.")
Once the MAML object is built, then it can easily be written to file:
maml.to_file("new_maml.maml")
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file pymaml-0.4.0.tar.gz.
File metadata
- Download URL: pymaml-0.4.0.tar.gz
- Upload date:
- Size: 6.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.9
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
a953f6bc7e7356f323672cd30984091e0b0ca7530421dc7eb479f924167082ff
|
|
| MD5 |
7efc887d2794c6180ed60b65bae9d18a
|
|
| BLAKE2b-256 |
8c39fd8866105e0afa0f8ed35119701a06fc27e0b61e4025a1e36c85e755af1c
|
File details
Details for the file pymaml-0.4.0-py3-none-any.whl.
File metadata
- Download URL: pymaml-0.4.0-py3-none-any.whl
- Upload date:
- Size: 8.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.9
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
372ba651a25bed52ebfb2a0d85bd2192630f65e1fd41d3f18517333cbf04aacb
|
|
| MD5 |
1886ca8194657c2ac45d39d8985eea3b
|
|
| BLAKE2b-256 |
a3ad8df659dab492d14bcb8fdf45cf9e1311120eea2e3d6894c80f16dd459eda
|