Skip to main content

Schema validation for Xarray objects

Project description

xarray-schema

Schema validation for Xarray

CI codecov MIT License

installation

Install xarray-schema from PyPI:

pip install xarray-schema

Conda:

conda install -c conda-forge xarray-schema

Or install it from source:

pip install git+https://github.com/carbonplan/xarray-schema

usage

Xarray-schema's API is modeled after Pandera. The DataArraySchema and DatasetSchema objects both have .validate() methods.

The basic usage is as follows:

import numpy as np
import xarray as xr
from xarray_schema import DataArraySchema, DatasetSchema, CoordsSchema

da = xr.DataArray(np.ones(4, dtype='i4'), dims=['x'], name='foo')

schema = DataArraySchema(dtype=np.integer, name='foo', shape=(4, ), dims=['x'])

schema.validate(da)

You can also use it to validate a Dataset like so:

schema_ds = DatasetSchema({'foo': schema})

schema_ds.validate(da.to_dataset())

Each component of the Xarray data model is implemented as a stand alone class:

from xarray_schema.components import (
    DTypeSchema,
    DimsSchema,
    ShapeSchema,
    NameSchema,
    ChunksSchema,
    ArrayTypeSchema,
    AttrSchema,
    AttrsSchema
)

# example constructions
dtype_schema = DTypeSchema('i4')
dims_schema = DimsSchema(('x', 'y', None))  # None is used as a wildcard
shape_schema = ShapeSchema((5, 10, None))  # None is used as a wildcard
name_schema = NameSchema('foo')
chunk_schema = ChunkSchema({'x': None, 'y': -1})  # None is used as a wildcard, -1 is used as
ArrayTypeSchema = ArrayTypeSchema(np.ndarray)

# Example usage
dtype_schama.validate(da.dtype)

# Each object schema can be exported to JSON format
dtype_json = dtype_schama.to_json()

roadmap

This is a very early prototype of a library. Some key things are missing:

  1. Validation of coords and attrs. These are not implemented yet.
  2. Exceptions: Pandera accumulates schema exceptions and reports them all at once. Currently, we are a eagerly raising SchemaErrors when the are found.
  3. Roundtrip schemas to/from JSON and/or YAML format.

license

All the code in this repository is MIT licensed, but we request that you please provide attribution if reusing any of our digital content (graphics, logo, articles, etc.).

about us

CarbonPlan is a non-profit organization that uses data and science for climate action. We aim to improve the transparency and scientific integrity of climate solutions through open data and tools. Find out more at carbonplan.org or get in touch by opening an issue or sending us an email.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

xarray-schema-0.0.3.tar.gz (15.0 kB view hashes)

Uploaded Source

Built Distribution

xarray_schema-0.0.3-py3-none-any.whl (10.0 kB view hashes)

Uploaded Python 3

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page