aind-metadata-validator

This package includes helper functions for validating aind-data-schema metadata at three levels: the full metadata record, the individual files in a metadata.json file, and the fields within each file.

Every validation function returns a MetadataState enum (see utils.py).

Get status

By ID

Use aind-data-access-api [todo]

Full table

You can get the entire Redshift status table by running:

from enum import Enum

import pandas as pd
from aind_data_access_api.rds_tables import Client, RDSCredentials

# API_GATEWAY_HOST is not defined in this snippet; set it to the
# API gateway host you connect to before running.
DEV_OR_PROD = "dev" if "test" in API_GATEWAY_HOST else "prod"
REDSHIFT_SECRETS = f"/aind/{DEV_OR_PROD}/redshift/credentials/readwrite"
RDS_TABLE_NAME = f"metadata_status_{DEV_OR_PROD}"

rds_client = Client(
    credentials=RDSCredentials(aws_secrets_name=REDSHIFT_SECRETS),
)


class MetadataState(int, Enum):
    VALID = 2  # validates as its class
    PRESENT = 1  # present
    OPTIONAL = 0  # missing, but it's optional
    MISSING = -1  # missing, and it's required
    EXCLUDED = -2  # excluded for all modalities in the metadata
    CORRUPT = -3  # corrupt, can't be loaded from json


def _get_status() -> pd.DataFrame:
    """Get the status of the metadata."""
    response = rds_client.read_table(RDS_TABLE_NAME)

    # The table holds int values that can be compared against MetadataState
    return response
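
Because the table stores plain integers, they can be mapped back onto MetadataState members for reporting. The following is a minimal sketch of that comparison, not part of the package: the enum is copied from the snippet above so the example runs on its own, and `count_states` and `example_column` are hypothetical names with made-up values standing in for one column of the status table.

```python
from collections import Counter
from enum import Enum


class MetadataState(int, Enum):
    """Copied from the snippet above so this sketch is self-contained."""

    VALID = 2
    PRESENT = 1
    OPTIONAL = 0
    MISSING = -1
    EXCLUDED = -2
    CORRUPT = -3


def count_states(values):
    """Count how often each MetadataState name occurs in a list of ints.

    Hypothetical helper; raises ValueError for ints that are not a state.
    """
    return Counter(MetadataState(v).name for v in values)


# Made-up values standing in for one column of the status table.
example_column = [2, 2, -1, 0, -3, 2]
counts = count_states(example_column)
print(counts["VALID"], counts["MISSING"], counts["CORRUPT"])
```

Since MetadataState subclasses int, a raw value such as `-1` also compares equal to `MetadataState.MISSING` directly, so converting is only needed when the readable name is wanted.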

Metadata validation

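validate_metadata returns a dictionary in which each key is "metadata", a file name, or a file.field name, and each value is the corresponding MetadataState.

# Import path for Metadata is assumed; adjust it to your aind-data-schema version
from aind_data_schema.core.metadata import Metadata
from aind_metadata_validator.metadata_validator import validate_metadata

m = Metadata()

results = validate_metadata(m.model_dump())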
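
Once the dictionary is returned, a common next step is to pick out the entries that need attention. The sketch below is one possible way to do that under stated assumptions: the enum is copied from the earlier snippet, `failing_entries` is a hypothetical helper, and the keys and values in `example_results` are invented for illustration rather than real validator output.

```python
from enum import Enum


class MetadataState(int, Enum):
    """Copied from the earlier snippet so this sketch is self-contained."""

    VALID = 2
    PRESENT = 1
    OPTIONAL = 0
    MISSING = -1
    EXCLUDED = -2
    CORRUPT = -3


def failing_entries(results):
    """Group keys whose state is MISSING or CORRUPT by their file name.

    Hypothetical helper: a key such as "rig.rig_id" is grouped under "rig".
    """
    bad_states = (MetadataState.MISSING, MetadataState.CORRUPT)
    problems = {}
    for key, state in results.items():
        if MetadataState(state) in bad_states:
            file_name = key.split(".", 1)[0]
            problems.setdefault(file_name, []).append(key)
    return problems


# Invented keys and values, shaped like the dictionary described above.
example_results = {
    "metadata": 2,
    "subject": 2,
    "subject.subject_id": 2,
    "procedures": -1,
    "rig": -3,
    "rig.rig_id": -1,
}
print(failing_entries(example_results))
```

Grouping by the part of the key before the first dot keeps file-level and field-level problems for the same file together.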

Redshift sync

Run on Code Ocean

Two Code Ocean capsules run the sync nightly: https://codeocean.allenneuraldynamics.org/capsule/0257223/tree and https://codeocean.allenneuraldynamics-test.org/capsule/3640490/tree

Run locally

The package also includes a function, run(), in sync.py that validates the entire DocDB and pushes the results to Redshift.

pip install aind-metadata-validator

from aind_metadata_validator.sync import run

run()
