Skip to main content

This is the project description.

Project description

Resource Ingest Guide Schema

A LinkML schema for describing Reference Ingest Guides (RIGs) - structured documents that capture the scope, rationale, and modeling approach for ingesting content from external sources into Biolink Model-compliant data repositories.

Overview

This repository provides:

  • LinkML Schema: Formal specification for Reference Ingest Guides in src/resource_ingest_guide_schema/schema/
  • Documentation Generator: Automated conversion of RIG YAML files to human-readable markdown
  • Validation Tools: Schema validation for RIG files using LinkML
  • Template System: Standardized templates and creation tools for new RIGs
  • Example RIGs: Real-world examples from CTD, DISEASES, and Clinical Trials KP

What are Reference Ingest Guides (RIGs)?

RIGs are structured documents that describe:

  • Source Information: Details about data sources (access, formats, licensing)
  • Ingest Information: What content is included/excluded and filtering rationale
  • Target Information: How data is modeled in the output knowledge graph
  • Provenance Information: Contributors and related artifacts

RIGs help ensure reproducible, well-documented data ingestion processes for biomedical knowledge graphs.

Website

https://biolink.github.io/resource-ingest-guide-schema

Repository Structure

├── src/
│   ├── resource_ingest_guide_schema/
│   │   └── schema/                    # LinkML schema definition
│   ├── docs/
│   │   ├── files/                     # Static documentation files
│   │   ├── rigs/                      # Example RIG YAML files
│   │   └── doc-templates/             # Jinja2 templates for docs
│   └── scripts/                       # Python utilities for RIG processing
├── docs/                              # Generated documentation
├── tests/                             # Test suite
└── project/                           # Generated LinkML artifacts

Developer Documentation

Prerequisites

This project uses uv for dependency management. Install it with:

# On macOS and Linux
curl -LsSf https://astral.sh/uv/install.sh | sh

# On Windows
powershell -c "irm https://astral.sh/uv/install.ps1 | iex"

# Or with pip
pip install uv

Getting Started

Note that the following commands assume you are in the project root directory, and the equivalent just commands may be substituted for several make targets (namely just test instead of make test)

  1. Install dependencies:

    uv sync --extra dev
    
  2. Run tests:

    make test  # or just test
    
  3. Generate documentation:

    make gendoc
    

Working with RIGs

Creating a New RIG

# Create a new RIG from the template
make new-rig INFORES=infores:example NAME="Example Data Source"

# This creates src/docs/rigs/mydatasource_rig.yaml
# Edit the file to fill in your specific information

or using the equivalent just command:

just INFORES=infores:example NAME="Example Data Source" new-rig 

Note that for the just command, the script variables must precede the just recipe ("target") name on the command line (reverse of the make command).

Validating RIGs

# Validate all RIG files against the schema
make validate-rigs  

or

just validate-rigs

To validate a specific RIG:

uv run linkml-validate --schema src/resource_ingest_guide_schema/schema/resource_ingest_guide_schema.yaml src/docs/rigs/my_rig.yaml

Building Documentation

# Generate all documentation including RIG index and markdown versions
make gendoc

# Test documentation locally
make testdoc  # Builds docs and starts local server

Development Workflow

1. Schema Development

The LinkML schema is defined in src/resource_ingest_guide_schema/schema/resource_ingest_guide_schema.yaml. After making changes:

# Regenerate Python datamodel and other artifacts
make gen-project

# Test the schema
make test-schema

# Lint the schema
make lint

2. Script Development

Python utilities are in src/scripts/:

  • create_rig.py: Generate new RIG from template
  • rig_to_markdown.py: Convert RIG YAML to Markdown
  • generate_rig_index.py: Create RIG index table

To test script changes:

# Run scripts directly
uv run python src/scripts/create_rig.py --help
uv run python src/scripts/rig_to_markdown.py --input-dir src/docs/rigs --output-dir docs

3. Documentation Development

Templates are in src/docs/doc-templates/ and static files in src/docs/files/:

# Regenerate docs after template changes
make gendoc

# View changes locally
make serve  # or make testdoc

Available Commands

Note: some make targets (like new-rig and validate-rigs) have just command equivalents (remember instead to put the just recipe target name after any command line arguments)

Command Description
make help Show all available commands
make install Install dependencies with uv
make test Run full test suite
make test-schema Test schema generation
make test-python Run Python tests
make lint Lint the LinkML schema
make gen-project Generate LinkML artifacts (Python, JSON Schema, etc.)
make gendoc Generate documentation including RIG processing
make serve Start local documentation server
make testdoc Build docs and start server
make new-rig Create new RIG (requires INFORES and NAME)
make validate-rigs Validate all RIG files
make clean Clean generated files
make deploy Deploy documentation

Project Structure Details

Key Directories

  • src/resource_ingest_guide_schema/schema/: LinkML schema definition
  • src/docs/rigs/: Example RIG YAML files (CTD, DISEASES, Clinical Trials KP)
  • src/docs/files/: Static documentation files copied to output
  • src/docs/doc-templates/: Jinja2 templates for documentation generation
  • src/scripts/: Python utilities for RIG creation and processing
  • docs/: Generated documentation output (do not edit directly)
  • project/: Generated LinkML artifacts (Python models, JSON Schema, etc.)

Generated Artifacts

The make gen-project command generates:

  • Python datamodel: src/resource_ingest_guide_schema/datamodel/
  • JSON Schema: project/jsonschema/
  • OWL ontology: project/owl/
  • GraphQL schema: project/graphql/
  • SQL DDL: project/sqlschema/
  • And more: See project/ directory

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Make changes following the existing patterns
  4. Ensure tests pass: make test
  5. Update documentation if needed: make gendoc
  6. Submit a pull request

Adding New RIG Examples

  1. Create YAML file in src/docs/rigs/
  2. Follow the schema structure (see existing examples)
  3. Validate: make validate-rigs
  4. Regenerate docs: make gendoc
  5. The RIG will automatically appear in the documentation index

Schema Changes

  1. Modify src/resource_ingest_guide_schema/schema/resource_ingest_guide_schema.yaml
  2. Regenerate artifacts: make gen-project
  3. Update any affected RIG files
  4. Test: make test
  5. Update documentation as needed

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

resource_ingest_guide_schema-0.1.3.tar.gz (2.4 MB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

resource_ingest_guide_schema-0.1.3-py3-none-any.whl (14.9 kB view details)

Uploaded Python 3

File details

Details for the file resource_ingest_guide_schema-0.1.3.tar.gz.

File metadata

File hashes

Hashes for resource_ingest_guide_schema-0.1.3.tar.gz
Algorithm Hash digest
SHA256 1802d5eb0c335807c312d136dc580bf1fb7aea4403d48e7b3b6329927643ac0a
MD5 ccc4f2a5d4688825c35e4193c25061f2
BLAKE2b-256 f9762d43a740849592f649c566500197019f248a1509c7458d5ba5334f4ca108

See more details on using hashes here.

Provenance

The following attestation bundles were made for resource_ingest_guide_schema-0.1.3.tar.gz:

Publisher: pypi-publish.yaml on biolink/resource-ingest-guide-schema

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file resource_ingest_guide_schema-0.1.3-py3-none-any.whl.

File metadata

File hashes

Hashes for resource_ingest_guide_schema-0.1.3-py3-none-any.whl
Algorithm Hash digest
SHA256 86a95607a7b584bc086ee9d66599f9225f1870283c65b87833ef02726909e2ae
MD5 6e477418be31566c5cbf7336fcf34211
BLAKE2b-256 c3c5f184c4d43260c9ec193571c4819fffcb222ddfa2e52c468ba0397e20a887

See more details on using hashes here.

Provenance

The following attestation bundles were made for resource_ingest_guide_schema-0.1.3-py3-none-any.whl:

Publisher: pypi-publish.yaml on biolink/resource-ingest-guide-schema

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page