AI-powered metadata enhancement for Hasura DDN schema files

DDN Metadata Bootstrap

AI-powered metadata enhancement for Hasura DDN (Data Delivery Network) schema files. It generates descriptions and detects relationships in your YAML/HML schema definitions using Anthropic's Claude models.

🚀 Features

  • 🤖 AI-Powered Descriptions: Generate natural language descriptions for schema elements using Anthropic's Claude
  • 🔗 Relationship Detection: Automatically detect and generate foreign key relationships
  • 📊 Domain Analysis: Extracts business terminology and domain-specific patterns from the schema to give the AI context
  • ⚡ Batch Processing: Process entire directories of schema files efficiently
  • 🎯 DDN Optimized: Specifically designed for Hasura DDN schema structures
  • 🔧 Configurable: Extensive configuration options via environment variables or CLI

📦 Installation

From PyPI (Recommended)

pip install ddn-metadata-bootstrap

From Source

git clone https://github.com/hasura/ddn-metadata-bootstrap.git
cd ddn-metadata-bootstrap
pip install -e .

🏃 Quick Start

1. Set up your environment

export ANTHROPIC_API_KEY="your-anthropic-api-key"
export METADATA_BOOTSTRAP_INPUT_DIR="./input"
export METADATA_BOOTSTRAP_OUTPUT_DIR="./output"

2. Run the tool

# Process entire directory
ddn-metadata-bootstrap

# Or with CLI arguments
ddn-metadata-bootstrap --input-dir ./schema --output-dir ./enhanced --api-key YOUR_KEY

3. Or use as a Python package

from ddn_metadata_bootstrap import MetadataBootstrapper

bootstrapper = MetadataBootstrapper(
    api_key="your-anthropic-api-key",
    use_case="E-commerce platform"
)

# Process directory
bootstrapper.process_directory("./input", "./output")

# Get statistics
stats = bootstrapper.get_statistics()
print(f"Generated {stats['relationships_generated']} relationships")

📝 Example

Input HML File

kind: ObjectType
version: v1
definition:
  name: User
  fields:
    - name: id
      type: ID!
    - name: email
      type: String!
    - name: created_at
      type: String

Enhanced Output

kind: ObjectType
version: v1
definition:
  name: User
  description: |
    Represents a user account in the system with authentication
    and profile information.
  fields:
    - name: id
      type: ID!
      description: Unique identifier for the user account.
    - name: email
      type: String!
      description: User's email address for authentication and communication.
    - name: created_at
      type: String
      description: Timestamp when the user account was created.

⚙️ Configuration

Environment Variables

Most settings are read from environment variables with the METADATA_BOOTSTRAP_ prefix. The API key is the exception: it is read from ANTHROPIC_API_KEY.

# Required
ANTHROPIC_API_KEY=your_api_key_here

# Input/Output (choose one mode)
METADATA_BOOTSTRAP_INPUT_DIR=./input
METADATA_BOOTSTRAP_OUTPUT_DIR=./output

# OR single file mode
METADATA_BOOTSTRAP_INPUT_FILE=./schema.hml
METADATA_BOOTSTRAP_OUTPUT_FILE=./enhanced.hml

# Optional
METADATA_BOOTSTRAP_USE_CASE="E-commerce platform"
METADATA_BOOTSTRAP_MODEL=claude-3-haiku-20240307
METADATA_BOOTSTRAP_FIELD_DESC_MAX_LENGTH=120
METADATA_BOOTSTRAP_KIND_DESC_MAX_LENGTH=250
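
To make the variable layout concrete, here is a minimal sketch of how such settings could be read and checked in Python. It is not the package's own loader: `BootstrapConfig` and `load_config` are hypothetical names, and the fallback values 120 and 250 simply reuse the example values shown above rather than confirmed defaults.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

PREFIX = "METADATA_BOOTSTRAP_"


@dataclass
class BootstrapConfig:
    """Hypothetical settings container (not the package's own class)."""
    api_key: str
    input_dir: Optional[str] = None
    output_dir: Optional[str] = None
    input_file: Optional[str] = None
    output_file: Optional[str] = None
    field_desc_max_length: int = 120  # assumed fallback, taken from the example
    kind_desc_max_length: int = 250   # assumed fallback, taken from the example


def load_config(env: Mapping[str, str] = os.environ) -> BootstrapConfig:
    # The API key is the one variable without the prefix.
    api_key = env.get("ANTHROPIC_API_KEY")
    if not api_key:
        raise ValueError("ANTHROPIC_API_KEY is required")

    def get(name: str) -> Optional[str]:
        return env.get(PREFIX + name)

    cfg = BootstrapConfig(
        api_key=api_key,
        input_dir=get("INPUT_DIR"),
        output_dir=get("OUTPUT_DIR"),
        input_file=get("INPUT_FILE"),
        output_file=get("OUTPUT_FILE"),
        field_desc_max_length=int(get("FIELD_DESC_MAX_LENGTH") or 120),
        kind_desc_max_length=int(get("KIND_DESC_MAX_LENGTH") or 250),
    )

    # Require exactly one complete input/output pair: directory or single file.
    dir_mode = bool(cfg.input_dir and cfg.output_dir)
    file_mode = bool(cfg.input_file and cfg.output_file)
    if dir_mode == file_mode:
        raise ValueError("Set either the directory pair or the file pair")
    return cfg
```

Passing the environment as a mapping keeps the function easy to exercise with a plain dictionary instead of the real process environment.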

CLI Arguments

ddn-metadata-bootstrap --help

Options:
  --input-dir PATH              Input directory containing HML files
  --output-dir PATH             Output directory for enhanced files
  --input-file PATH             Single input HML file
  --output-file PATH            Single output HML file
  --api-key TEXT                Anthropic API key
  --use-case TEXT               Business domain description
  --model TEXT                  AI model to use
  --field-max-length INTEGER    Max characters for field descriptions
  --kind-max-length INTEGER     Max characters for kind descriptions
  --verbose                     Enable verbose logging
  --dry-run                     Validate configuration without processing

🔄 What It Does

1. Description Generation

  • Analyzes schema element names and types
  • Generates contextual descriptions using AI
  • Respects character limits and style guidelines
  • Supports field-level and entity-level descriptions
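
As an illustration of the character-limit step, the sketch below trims a generated description to a maximum length without splitting a word. The function name `fit_description` is hypothetical, and the tool may enforce its limits differently (for example by asking the model to regenerate).

```python
def fit_description(text: str, max_length: int) -> str:
    """Collapse whitespace and trim to max_length at a word boundary."""
    cleaned = " ".join(text.split())
    if len(cleaned) <= max_length:
        return cleaned

    cut = cleaned[:max_length]
    if cleaned[max_length] != " ":
        # The cut landed mid-word; back up to the previous space.
        last_space = cut.rfind(" ")
        if last_space > 0:
            cut = cut[:last_space]
    # Drop trailing separators left behind by the cut.
    return cut.rstrip(" ,;:")


short = fit_description(
    "User's email address for authentication and communication.", 30
)
print(short)
```

With a limit of 30 the example keeps the first four words and drops the partial word that would otherwise be cut in half.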

2. Relationship Detection

  • Detects foreign key patterns (e.g., user_id, customer_id)
  • Identifies shared fields between entities
  • Generates bidirectional relationship definitions
  • Supports cross-subgraph relationships
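
The foreign key pattern above can be sketched as a simple naming heuristic: a field ending in `_id` whose stem matches another entity's name is treated as a candidate relationship. This is an assumption-laden illustration, not the tool's detection logic; `Relationship` and `detect_foreign_keys` are hypothetical names, and the real detector presumably also considers types and shared fields.

```python
from typing import Dict, List, NamedTuple


class Relationship(NamedTuple):
    source: str  # entity that holds the foreign key field
    field: str   # name of the foreign key field
    target: str  # entity the field points to


def detect_foreign_keys(entities: Dict[str, List[str]]) -> List[Relationship]:
    """Match fields named <entity>_id against the known entity names."""
    # Case-insensitive lookup from entity name to its declared spelling.
    by_lower = {name.lower(): name for name in entities}
    found: List[Relationship] = []
    for source, fields in entities.items():
        for field in fields:
            if not field.endswith("_id"):
                continue
            # "order_item_id" -> "orderitem", "user_id" -> "user"
            stem = field[:-3].replace("_", "")
            target = by_lower.get(stem)
            if target and target != source:
                found.append(Relationship(source, field, target))
    return found


schema = {
    "User": ["id", "email", "created_at"],
    "Order": ["id", "user_id", "customer_id"],
}
print(detect_foreign_keys(schema))
```

Here `customer_id` produces no relationship because the schema has no `Customer` entity. A reverse relationship for the bidirectional case could be derived from each result by swapping `source` and `target`.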

3. Domain Analysis

  • Extracts business terminology from schema
  • Identifies domain-specific patterns
  • Provides contextual AI prompts
  • Supports domain-specific relationship hints
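
One way to picture terminology extraction is to split schema identifiers into words and count the ones that recur. The sketch below does that for snake_case and camelCase names. The `GENERIC` stop list and the function names are hypothetical, and the tool's actual analysis is not documented here.

```python
import re
from collections import Counter
from typing import Iterable, List

# Assumed stop list of tokens that carry little domain meaning.
GENERIC = {"id", "at", "by", "is", "created", "updated", "name", "type"}


def split_identifier(name: str) -> List[str]:
    """Split snake_case or camelCase identifiers into lowercase tokens."""
    spaced = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", name)
    return [part.lower() for part in re.split(r"[^A-Za-z0-9]+", spaced) if part]


def domain_terms(identifiers: Iterable[str], top_n: int = 5) -> List[str]:
    """Return the most frequent non-generic tokens across all identifiers."""
    counts = Counter(
        token
        for identifier in identifiers
        for token in split_identifier(identifier)
        if token not in GENERIC
    )
    return [term for term, _ in counts.most_common(top_n)]


names = ["Order", "order_total", "customer_id", "OrderItem", "customer_email"]
print(domain_terms(names, top_n=2))
```

The resulting terms could then be placed into a prompt so that generated descriptions use the schema's own vocabulary.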

4. Schema Enhancement

  • Preserves original schema structure
  • Adds descriptions without breaking functionality
  • Generates proper DDN relationship definitions
  • Maintains YAML formatting and comments
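
The non-destructive part of this step can be sketched on an already-parsed document: copy the input, then add descriptions only where none exist. `add_descriptions` is a hypothetical helper, not part of the package. Because it works on plain dictionaries, it shows nothing about how the tool keeps YAML comments or formatting, and new keys are appended after existing ones rather than placed as in the example output.

```python
import copy
from typing import Any, Dict


def add_descriptions(
    doc: Dict[str, Any],
    kind_description: str,
    field_descriptions: Dict[str, str],
) -> Dict[str, Any]:
    """Return a copy of doc with descriptions added where they are missing."""
    enhanced = copy.deepcopy(doc)  # never mutate the caller's document
    definition = enhanced["definition"]
    # Keep an existing kind-level description if there is one.
    definition.setdefault("description", kind_description)
    for field in definition.get("fields", []):
        text = field_descriptions.get(field["name"])
        if text and "description" not in field:
            field["description"] = text
    return enhanced


user = {
    "kind": "ObjectType",
    "version": "v1",
    "definition": {
        "name": "User",
        "fields": [
            {"name": "id", "type": "ID!"},
            {"name": "email", "type": "String!"},
        ],
    },
}
result = add_descriptions(
    user,
    "Represents a user account in the system.",
    {"id": "Unique identifier for the user account."},
)
print(result["definition"]["fields"][0])
```

Fields without a supplied description, such as `email` here, are left exactly as they were.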

🏗️ Architecture

The tool is built with a modular architecture:

  • ai/ - AI integration and description generation
  • schema/ - Schema analysis and metadata collection
  • relationships/ - Relationship detection and generation
  • processors/ - File and directory processing
  • utils/ - Text processing, YAML handling, path utilities

🧪 Testing

# Install dev dependencies
pip install -e ".[dev]"

# Run tests
pytest

# Run with coverage
pytest --cov=ddn_metadata_bootstrap

# Type checking
mypy ddn_metadata_bootstrap/

# Code formatting
black ddn_metadata_bootstrap/

🤝 Contributing

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🏷️ Version History

See CHANGELOG.md for version history and breaking changes.

Made with ❤️ by the Hasura team
