microdata-validator

Python package for validating datasets in the microdata platform.

Dataset description

A dataset, as defined in microdata, consists of one data file and one metadata file.

The data file is a CSV file with fields separated by semicolons. A valid example would be:

000000000000001;123;2020-01-01;2020-12-31;
000000000000002;123;2020-01-01;2020-12-31;
000000000000003;123;2020-01-01;2020-12-31;
000000000000004;123;2020-01-01;2020-12-31;

Read more about the data format and columns in the documentation.
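
As a rough illustration, rows like the ones in the example can be read with Python's standard csv module. This is only a sketch and not part of microdata-validator: the helper name read_rows is hypothetical, and treating the third and fourth fields as ISO dates is an assumption based on the example rows, not on the documented column definitions.

```python
import csv
import io
from datetime import date

# Two rows copied from the example data file.
EXAMPLE = """\
000000000000001;123;2020-01-01;2020-12-31;
000000000000002;123;2020-01-01;2020-12-31;
"""


def read_rows(text):
    """Split semicolon-separated text into rows of fields (hypothetical helper)."""
    return list(csv.reader(io.StringIO(text), delimiter=";"))


rows = read_rows(EXAMPLE)
for row in rows:
    # The trailing semicolon produces an empty fifth field.
    if len(row) != 5:
        raise ValueError(f"unexpected number of fields: {row}")
    # Assumption: fields 3 and 4 are ISO dates, as in the example rows.
    date.fromisoformat(row[2])
    date.fromisoformat(row[3])

print(len(rows), "rows read")
```

A check like this only catches gross formatting problems; the actual rules are the ones enforced by the validator and described in the documentation.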

The metadata file should be in JSON format. The requirements for the metadata are best described through the JSON schema, the examples, and the documentation.

Basic usage

Once you have your metadata and data files ready to go, they should be named and stored like this:

my-input-directory/
    MY_DATASET_NAME/
        MY_DATASET_NAME.csv
        MY_DATASET_NAME.json
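
The naming convention above can be expressed in a few lines of Python. The following is a minimal sketch using only the standard library; expected_files is a hypothetical helper, not a function provided by microdata-validator.

```python
from pathlib import Path


def expected_files(input_directory, dataset_name):
    """Return the (data, metadata) paths for a dataset (hypothetical helper)."""
    dataset_dir = Path(input_directory) / dataset_name
    return (
        dataset_dir / f"{dataset_name}.csv",
        dataset_dir / f"{dataset_name}.json",
    )


data_path, metadata_path = expected_files("my-input-directory", "MY_DATASET_NAME")
for path in (data_path, metadata_path):
    # Report whether each expected file is present on disk.
    status = "found" if path.is_file() else "missing"
    print(f"{path}: {status}")
```

Running a check like this before validation can make a misplaced or misnamed file easier to spot.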

Install microdata-validator through pip:

pip install microdata-validator

Import microdata-validator in your script and validate your files:

from microdata_validator import validate

validation_errors = validate(
    # The dataset name should match the directory and file names shown above
    "MY_DATASET_NAME",
    input_directory="path/to/my-input-directory"
)

if not validation_errors:
    print("My dataset is valid")
else:
    print("Dataset is invalid :(")
    # You can print your errors like this:
    for error in validation_errors:
        print(error)
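
In a script or CI job it can be convenient to turn the result into a process exit code. The sketch below assumes only what the example above shows: that the value returned by validate is empty when the dataset is valid and can otherwise be iterated over. The helper report is hypothetical, and the error text used here is a placeholder, not real validator output.

```python
import sys


def report(dataset_name, validation_errors, stream=sys.stderr):
    """Print any errors and return a process exit code (hypothetical helper)."""
    if not validation_errors:
        print(f"{dataset_name}: valid")
        return 0
    for error in validation_errors:
        print(f"{dataset_name}: {error}", file=stream)
    return 1


# Placeholder list for illustration; in a real script, pass the value
# returned by validate() instead.
exit_code = report("MY_DATASET_NAME", ["placeholder error message"])
```

A script could then end with sys.exit(exit_code) so that an invalid dataset fails the job.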

For a more in-depth explanation of usage, see the usage documentation.
