Schema validation for Xarray objects

Xarrera

Schema validation for Xarray


About the project

Xarrera is an open source project that provides an API for performing data-format validation on Xarray objects.

Getting Started

Prerequisites

  • Python >= 3.9

Installation

Install Xarrera from PyPI:

pip install xarrera

Or install it with conda:

conda install -c conda-forge xarrera

Or install it from source:

pip install git+https://github.com/javgat/xarrera

Alternatively, clone the repository and install the Python package and its dependencies in editable mode:

git clone https://github.com/javgat/xarrera.git
cd xarrera
pip install -e .

Usage

Xarrera's API is modeled after Pandera. The DataArraySchema and DatasetSchema objects both have .validate() methods.

The basic usage is as follows:

import numpy as np
import xarray as xr
from xarrera import DataArraySchema, DatasetSchema, CoordsSchema

da = xr.DataArray(np.ones(4, dtype='i4'), dims=['x'], name='foo')

schema = DataArraySchema(dtype=np.integer, name='foo', shape=(4, ), dims=['x'])

schema.validate(da)

You can also use it to validate a Dataset like so:

schema_ds = DatasetSchema({'foo': schema})

schema_ds.validate(da.to_dataset())

Each component of the Xarray data model is implemented as a standalone class:

from xarrera.components import (
    DTypeSchema,
    DimsSchema,
    ShapeSchema,
    NameSchema,
    ChunksSchema,
    ArrayTypeSchema,
    AttrSchema,
    AttrsSchema
)

# example constructions
dtype_schema = DTypeSchema('i4')
dims_schema = DimsSchema(('x', 'y', None))  # None is used as a wildcard
shape_schema = ShapeSchema((5, 10, None))  # None is used as a wildcard
name_schema = NameSchema('foo')
chunk_schema = ChunksSchema({'x': None, 'y': -1})  # None is used as a wildcard, -1 is used as
array_type_schema = ArrayTypeSchema(np.ndarray)  # lowercase name, so the class itself is not shadowed

# Example usage
dtype_schema.validate(da.dtype)

# Each object schema can be exported to JSON format
dtype_json = dtype_schema.to_json()
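
To make the None wildcard used by DimsSchema and ShapeSchema more concrete, here is a minimal pure-Python sketch of the matching rule. It is an illustration only, not Xarrera's implementation: the function name matches_with_wildcard is hypothetical, and it assumes that a schema matches when the lengths agree and every non-None entry is equal.

```python
from typing import Optional, Sequence, Union

Entry = Optional[Union[int, str]]


def matches_with_wildcard(expected: Sequence[Entry], actual: Sequence[Entry]) -> bool:
    """Return True if `actual` matches `expected`, treating None as a wildcard.

    Hypothetical helper for illustration; not part of the Xarrera API.
    """
    # A different number of dimensions can never match.
    if len(expected) != len(actual):
        return False
    # Each position matches if the expected entry is a wildcard or is equal.
    return all(e is None or e == a for e, a in zip(expected, actual))


# The third entry is a wildcard, so any size is accepted there.
shape_ok = matches_with_wildcard((5, 10, None), (5, 10, 3))
# The second entry differs (10 vs 11), so this does not match.
shape_bad = matches_with_wildcard((5, 10, None), (5, 11, 3))
# The same rule applies to dimension names.
dims_ok = matches_with_wildcard(("x", "y", None), ("x", "y", "time"))
```

The same rule serves for both dimension names and sizes, since the comparison only relies on equality.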

Roadmap

This is a very early prototype, based on an existing library (xarray-schema) that has multiple forks with small additions. The key next steps are:

  • Contact former xarray-schema developers, forkers, and issue writers about xarrera.
  • ...
  • String comparison using regex xarray-schema~#9
  • Accumulate schema exceptions and report them all at once. Currently, SchemaErrors are raised eagerly, as soon as they are found.
  • Improve SchemaError reported information.
  • Extract schema from xarray objects xarray-schema~#45
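
As a rough idea of what two of these roadmap items could look like (regex-based string comparison and accumulated error reporting), here is a self-contained sketch in plain Python. Everything in it is an assumption made for illustration: the SchemaError class is a local stand-in, check_name, check_shape and validate_all are hypothetical helpers, and a plain dict stands in for an Xarray object. It does not reflect how Xarrera is or will be implemented.

```python
import re
from typing import Callable, Dict, List


class SchemaError(Exception):
    """Local stand-in for a validation error, so the sketch is self-contained."""


Check = Callable[[Dict], None]


def check_name(pattern: str) -> Check:
    """Build a check that requires the whole name to match a regex."""
    def check(obj: Dict) -> None:
        if re.fullmatch(pattern, obj["name"]) is None:
            raise SchemaError(f"name {obj['name']!r} does not match {pattern!r}")
    return check


def check_shape(expected: tuple) -> Check:
    """Build a check that requires an exact shape."""
    def check(obj: Dict) -> None:
        if obj["shape"] != expected:
            raise SchemaError(f"shape {obj['shape']} != {expected}")
    return check


def validate_all(obj: Dict, checks: List[Check]) -> None:
    """Run every check, collect the failures, and raise once at the end."""
    errors: List[str] = []
    for check in checks:
        try:
            check(obj)
        except SchemaError as err:
            # Keep going instead of stopping at the first failure.
            errors.append(str(err))
    if errors:
        raise SchemaError("; ".join(errors))


# A dict standing in for an array with the wrong name and the wrong shape.
fake_array = {"name": "bar", "shape": (3,)}
message = ""
try:
    validate_all(fake_array, [check_name(r"foo_\d+"), check_shape((4,))])
except SchemaError as err:
    message = str(err)  # contains both the name and the shape failure
```

Collecting the messages before raising means a user sees every problem in a single run, at the cost of running checks that may depend on earlier ones having passed.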

Versioning

Version changes and descriptions are stored in the CHANGELOG. This file is updated each time a new version is released.

Contributing

Development Guide

  1. Install npm (required for pre-commit hooks)

    Some pre-commit hooks (e.g., for formatting or linting) depend on Node.js and npm. You can install them via your system package manager, or download from nodejs.org (npm is included).

    For example, on Debian/Ubuntu:

    sudo apt update
    sudo apt install nodejs npm
    

    If you use conda, you can also install it with:

    conda install -c conda-forge nodejs
    
  2. Install Pre-commit Hooks

    Install the pre-commit hooks to automatically check code styling:

    pre-commit install
    
  3. Install Python Dependencies

    Install the Python package dependencies listed in the requirements.txt file, preferably in a Python virtual environment:

    pip install -r requirements.txt
    # or
    pip install -e .
    

Testing

  1. Install test dependencies

    Install the development dependencies with:

    pip install -r dev-requirements.txt
    # or
    pip install -e ".[dev]"
    
  2. Run the type check

    Run the mypy static type check with:

    mypy xarrera tests
    
  3. Run the tests

    Run the tests with:

    pytest
    

    Generate the coverage report with:

    pytest --cov=./ --cov-report=xml --verbose
    

License

All the code in this repository is MIT licensed.

History

This project was originally developed at CarbonPlan. It was transferred to the xarray-contrib organization in August 2022.

Due to the inactivity in xarray-contrib, it was forked to Xarrera in March 2026.
