The package is intended to be reusable across projects: any consumer can load proprietary controlled vocabularies stored in YAML files, then validate and use their terms programmatically.

jps-controlled-vocabularies-utils

A standalone Python package for loading, managing, and validating controlled vocabularies stored in YAML files.

🚀 Overview

jps-controlled-vocabularies-utils provides a complete solution for managing controlled vocabularies in Python applications. It enables you to:

  • Define vocabularies in human-readable YAML files
  • Load and query vocabularies with a simple API
  • Validate values against term rules (allowed values, regex patterns)
  • Search terms by name, key, or synonyms
  • Get explainable validation results
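
As a rough illustration of what rule-based validation with explainable results can look like, here is a minimal standalone sketch. It does not use this package: `TermRule`, `CheckResult`, and `check_value` are hypothetical names, and the assumption that a regex must match the whole value is mine, not the package's documented behavior.

```python
import re
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TermRule:
    """Hypothetical stand-in for the rules attached to one term."""
    allowed_values: list[str] = field(default_factory=list)
    pattern: Optional[str] = None  # assumed: regex must match the whole value


@dataclass
class CheckResult:
    """Hypothetical result: a verdict plus human-readable reasons."""
    is_valid: bool
    reasons: list[str]


def check_value(rule: TermRule, value: str) -> CheckResult:
    """Collect every rule violation instead of stopping at the first."""
    reasons: list[str] = []
    if rule.allowed_values and value not in rule.allowed_values:
        reasons.append(f"{value!r} is not one of {rule.allowed_values}")
    if rule.pattern is not None and re.fullmatch(rule.pattern, value) is None:
        reasons.append(f"{value!r} does not match pattern {rule.pattern!r}")
    return CheckResult(is_valid=not reasons, reasons=reasons)


rule = TermRule(allowed_values=["Ready", "ready"], pattern=r"[A-Za-z]+")
ok = check_value(rule, "Ready")    # passes both rules
bad = check_value(rule, "READY!")  # fails both rules, so two reasons
```

Collecting all reasons rather than returning on the first failure is what makes the result explainable: a caller can show every violated rule at once.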

It is suited to healthcare workflows, data pipelines, ETL validation, and any application that requires consistent terminology.

Features

  • YAML-backed vocabulary registry - Store vocabularies in version-controlled YAML files
  • Flexible parser - Load from files, directories, or in-memory strings
  • Comprehensive validation - Validate both registry integrity and runtime values
  • Pydantic models - Full type safety with Pydantic v2
  • Smart key derivation - Auto-generate stable keys from term names
  • Search capabilities - Prefix, contains, and exact matching with case sensitivity options
  • Explainable results - Detailed reasons for validation failures
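
The key derivation feature is not specified further here, so the following is only a sketch of one common approach: normalize the name, lowercase it, and collapse punctuation and whitespace into underscores. `derive_key` is a hypothetical helper, and the package's actual derivation rules may differ.

```python
import re
import unicodedata


def derive_key(name: str) -> str:
    """Hypothetical key derivation: the same name always yields the same key."""
    # Fold accented characters to plain ASCII so keys stay portable.
    ascii_name = (
        unicodedata.normalize("NFKD", name)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Collapse each run of non-alphanumeric characters into one underscore.
    key = re.sub(r"[^a-z0-9]+", "_", ascii_name.lower())
    # Drop underscores left over from leading or trailing punctuation.
    return key.strip("_")


simple = derive_key("Readiness Status")
messy = derive_key("  In-Progress (QA) ")
accented = derive_key("Café Ready")
```

Because the function depends only on its input, renaming nothing but whitespace or punctuation in a term name leaves the derived key unchanged, which is one way to read "stable" in this context.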

Example Usage

from jps_controlled_vocabularies_utils import Parser, Validator

# Load vocabulary from YAML file
parser = Parser()
registry = parser.load_path("vocabularies/workflow_terms.yml")

# Query terms
vocab = registry.get_vocabulary("workflow.system_terminology")
term = registry.get_term("workflow.system_terminology", "readiness_status.ready")
print(f"{term.name}: {term.description}")

# Search terms
results = registry.search_terms("workflow.system_terminology", "ready")
print(f"Found {len(results)} matching terms")

# Validate values
validator = Validator()
result = validator.validate_value(
    registry,
    vocabulary_id="workflow.system_terminology",
    term_key="readiness_status.ready",
    value="Ready"
)

if result.is_valid:
    print("✓ Valid")
else:
    print(f"✗ Invalid: {', '.join(result.reasons)}")
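
The `search_terms` call above takes a plain query string. The Features list mentions prefix, contains, and exact matching with case sensitivity options. How those modes are selected in the real API is not shown here, so the sketch below uses hypothetical `mode` and `case_sensitive` parameters and a hypothetical `Term` class to show how such matching could work over keys, names, and synonyms.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    """Hypothetical minimal term: only the fields that search looks at."""
    key: str
    name: str
    synonyms: list[str] = field(default_factory=list)


def matches(candidate: str, query: str, mode: str, case_sensitive: bool) -> bool:
    """Compare one piece of text against the query using the given mode."""
    if not case_sensitive:
        candidate, query = candidate.casefold(), query.casefold()
    if mode == "exact":
        return candidate == query
    if mode == "prefix":
        return candidate.startswith(query)
    if mode == "contains":
        return query in candidate
    raise ValueError(f"unknown mode: {mode!r}")


def search(terms, query, mode="contains", case_sensitive=False):
    """Return terms whose key, name, or any synonym matches the query."""
    return [
        term
        for term in terms
        if any(
            matches(text, query, mode, case_sensitive)
            for text in (term.key, term.name, *term.synonyms)
        )
    ]


terms = [
    Term("readiness_status.ready", "Ready", ["good to go"]),
    Term("readiness_status.not_ready", "Not Ready"),
]
by_contains = search(terms, "ready")
by_prefix = search(terms, "not", mode="prefix")
by_exact = search(terms, "Ready", mode="exact", case_sensitive=True)
by_synonym = search(terms, "good to go", mode="exact")
```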

📦 Installation

pip install jps-controlled-vocabularies-utils

Development Installation

git clone https://github.com/jai-python3/jps-controlled-vocabularies-utils.git
cd jps-controlled-vocabularies-utils
pip install -e ".[dev]"

🧪 Development

Setup

make install

Testing and Quality

# Run tests
make test

# Format and lint
make fix && make format && make lint

# Type checking
mypy src

📖 Documentation

See docs/ for detailed documentation including:

  • YAML schema reference
  • API documentation
  • Configuration options
  • Advanced usage examples

Quick YAML Example

schema_version: "1.0"
vocabulary_id: "workflow.system_terminology"
title: "Workflow Terminology"
description: "Core workflow terms"
terms:
  - key: readiness_status.ready
    name: "Ready"
    description: "All requirements satisfied"
    allowed_values: ["Ready", "ready"]
    tags: ["status"]
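
To give a feel for what a registry integrity check might cover, the sketch below inspects a vocabulary written out by hand as a Python dictionary that mirrors the YAML above. The checks chosen (required top-level fields, a name on every term, no duplicate keys) are assumptions for illustration, and `check_vocabulary` is a hypothetical helper, not part of this package.

```python
def check_vocabulary(vocab: dict) -> list[str]:
    """Return a list of integrity problems; an empty list means none found."""
    problems: list[str] = []
    # Assumed required top-level fields, taken from the YAML example.
    for required in ("schema_version", "vocabulary_id", "terms"):
        if required not in vocab:
            problems.append(f"missing top-level field: {required}")
    seen_keys: set[str] = set()
    for index, term in enumerate(vocab.get("terms", [])):
        if not term.get("name"):
            problems.append(f"term #{index} has no name")
        key = term.get("key")
        if key is None:
            continue  # keyless terms are skipped here, not flagged
        if key in seen_keys:
            problems.append(f"duplicate term key: {key}")
        seen_keys.add(key)
    return problems


# Hand-written dictionary mirroring the YAML example above.
vocab = {
    "schema_version": "1.0",
    "vocabulary_id": "workflow.system_terminology",
    "title": "Workflow Terminology",
    "description": "Core workflow terms",
    "terms": [
        {
            "key": "readiness_status.ready",
            "name": "Ready",
            "description": "All requirements satisfied",
            "allowed_values": ["Ready", "ready"],
            "tags": ["status"],
        }
    ],
}
clean = check_vocabulary(vocab)

# Appending the same term again introduces a duplicate key.
broken = dict(vocab, terms=vocab["terms"] + vocab["terms"])
found = check_vocabulary(broken)
```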

🛠️ Requirements

  • Python 3.10+
  • pydantic >= 2.0.0
  • pyyaml >= 6.0.0

📜 License

MIT License © Jaideep Sundaram

🤝 Contributing

Contributions welcome! Please open an issue or submit a pull request.
