Generated from aind-library-template

aind-metadata-validator

This package provides helper functions for validating aind-data-schema metadata at three levels: the full metadata record, the individual files within a metadata.json file, and the fields within each file.

All validation functions return a MetadataState enum; see utils.py.

Get status

By ID

Use aind-data-access-api [todo]

Full table

You can retrieve the entire Redshift status table by running:

import os
from enum import Enum

import pandas as pd
from aind_data_access_api.rds_tables import Client, RDSCredentials

# Assumption: the original snippet uses API_GATEWAY_HOST without defining it.
# Here it is read from an environment variable; adjust to your own setup.
API_GATEWAY_HOST = os.environ["API_GATEWAY_HOST"]

# Hosts containing "test" map to the dev tables; everything else maps to prod
DEV_OR_PROD = "dev" if "test" in API_GATEWAY_HOST else "prod"
REDSHIFT_SECRETS = f"/aind/{DEV_OR_PROD}/redshift/credentials/readwrite"
RDS_TABLE_NAME = f"metadata_status_{DEV_OR_PROD}"

rds_client = Client(
    credentials=RDSCredentials(aws_secrets_name=REDSHIFT_SECRETS),
)


class MetadataState(int, Enum):
    VALID = 2  # validates as its class
    PRESENT = 1  # present
    OPTIONAL = 0  # missing, but it's optional
    MISSING = -1  # missing, and it's required
    EXCLUDED = -2  # excluded for all modalities in the metadata
    CORRUPT = -3  # corrupt, can't be loaded from json


def _get_status() -> pd.DataFrame:
    """Get the status of the metadata."""
    # The table holds int values that can be compared against MetadataState
    return rds_client.read_table(RDS_TABLE_NAME)
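
Because the table stores plain integers, they can be converted back into MetadataState members for reporting. The following is a minimal sketch under stated assumptions, not part of the package: the rows are hand-written stand-ins for what `df.to_dict("records")` might produce, and the column names `_id` and `subject` are hypothetical. The enum is copied from the snippet above so the example runs on its own.

```python
from collections import Counter
from enum import Enum


class MetadataState(int, Enum):
    """Copied from the snippet above so this sketch is self-contained."""

    VALID = 2
    PRESENT = 1
    OPTIONAL = 0
    MISSING = -1
    EXCLUDED = -2
    CORRUPT = -3


def count_states(rows, column):
    """Count how many rows hold each MetadataState name in one column."""
    return Counter(MetadataState(row[column]).name for row in rows)


# Hypothetical rows; the real table's columns may differ.
rows = [
    {"_id": "a", "subject": 2},
    {"_id": "b", "subject": -1},
    {"_id": "c", "subject": 2},
]

counts = count_states(rows, "subject")

# MetadataState subclasses int, so raw ints compare directly against members.
below_present = [r["_id"] for r in rows if r["subject"] < MetadataState.PRESENT]
```

Since the enum values are ordered from CORRUPT (-3) up to VALID (2), a simple threshold comparison is enough to separate records that need attention from those that do not.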

Metadata validation

validate_metadata returns a dictionary in which each key is metadata, a file, or a file.field, and each value is the corresponding MetadataState.

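from aind_data_schema.core.metadata import Metadata
from aind_metadata_validator.metadata_validator import validate_metadata

# Construct or load an aind-data-schema Metadata object (fill in its fields as needed)
m = Metadata()

# The result is a dictionary mapping each key to a MetadataState
results = validate_metadata(m.model_dump())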
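
One way to work with the returned dictionary is to group its keys by level and pull out the entries that need attention. This is a sketch only: the sample dictionary is hand-written rather than produced by validate_metadata, the file and field names in it are hypothetical, and it assumes the top-level key is the literal string "metadata" and that field keys contain a dot.

```python
from enum import Enum


class MetadataState(int, Enum):
    """Copied from the README so this sketch is self-contained."""

    VALID = 2
    PRESENT = 1
    OPTIONAL = 0
    MISSING = -1
    EXCLUDED = -2
    CORRUPT = -3


def split_by_level(results):
    """Group result keys into the metadata, file, and file.field levels."""
    levels = {"metadata": {}, "files": {}, "fields": {}}
    for key, state in results.items():
        if key == "metadata":
            levels["metadata"][key] = state
        elif "." in key:
            levels["fields"][key] = state
        else:
            levels["files"][key] = state
    return levels


# Hypothetical validation output; real keys depend on the metadata record.
results = {
    "metadata": MetadataState.PRESENT,
    "subject": MetadataState.VALID,
    "subject.subject_id": MetadataState.VALID,
    "procedures": MetadataState.MISSING,
}

levels = split_by_level(results)

# Keys whose state indicates a required item is missing or unreadable
problems = sorted(
    key
    for key, state in results.items()
    if state in (MetadataState.MISSING, MetadataState.CORRUPT)
)
```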

Redshift sync

Run on Code Ocean

Two Code Ocean capsules run the sync nightly: https://codeocean.allenneuraldynamics.org/capsule/0257223/tree and https://codeocean.allenneuraldynamics-test.org/capsule/3640490/tree

Run locally

The package also includes a run() function in sync.py that validates the entire DocDB and pushes the results to Redshift.

pip install aind-metadata-validator

from aind_metadata_validator.sync import run

run()
