Glazing

Unified data models and interfaces for syntactic and semantic frame ontologies.

Features

  • 🚀 One-command setup: glazing init downloads and prepares all datasets
  • 📦 Type-safe models: Pydantic v2 validation for all data structures
  • 🔍 Unified search: Query across all datasets with a consistent API
  • 🔗 Cross-references: Automatic mapping between resources with confidence scores
  • 🎯 Fuzzy search: Find matches even with typos or partial queries
  • 🐳 Docker support: Use via Docker without local installation
  • 💾 Efficient storage: JSON Lines format with streaming support
  • 🐍 Modern Python: Full type hints, Python 3.13+ support
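
To make the "JSON Lines format with streaming support" point concrete, here is a minimal sketch of reading a JSON Lines stream one record at a time with the standard library. It illustrates the format in general and is not Glazing's own loader; the helper name `iter_jsonl` and the sample field names are assumptions made up for this example.

```python
import io
import json
from collections.abc import Iterator
from typing import Any, TextIO


def iter_jsonl(stream: TextIO) -> Iterator[dict[str, Any]]:
    """Yield one parsed record per non-empty line of a JSON Lines stream."""
    for line in stream:
        line = line.strip()
        if line:  # tolerate blank lines between records
            yield json.loads(line)


# Toy in-memory data standing in for a dataset file (field names are made up).
sample = io.StringIO('{"id": "a", "kind": "frame"}\n{"id": "b", "kind": "class"}\n')
records = list(iter_jsonl(sample))
print(len(records), "records read")
```

Because each line is a complete JSON document, a reader like this never needs to hold the whole file in memory; passing an open file handle instead of `io.StringIO` works the same way.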

Installation

Via pip

pip install glazing

Via Docker

Build and run Glazing in a containerized environment:

# Build the image
git clone https://github.com/aaronstevenwhite/glazing.git
cd glazing
docker build -t glazing:latest .

# Initialize datasets (persisted in volume)
docker run --rm -v glazing-data:/data glazing:latest init

# Use the CLI
docker run --rm -v glazing-data:/data glazing:latest search query "give"
docker run --rm -v glazing-data:/data glazing:latest search query "transfer" --fuzzy

# Interactive Python session
docker run --rm -it -v glazing-data:/data --entrypoint python glazing:latest

See the installation docs for more Docker usage examples.

Quick Start

Initialize all datasets (one-time setup, ~54 MB download):

glazing init

Then start using the data:

from glazing.search import UnifiedSearch

# Automatically uses default data directory after 'glazing init'
search = UnifiedSearch()
results = search.search("give")

for result in results[:5]:
    print(f"{result.dataset}: {result.name} - {result.description}")
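
Since a unified search returns hits from several datasets at once, it can help to bucket them by dataset before printing. The sketch below assumes only the three attributes used above (`dataset`, `name`, `description`); the `Result` dataclass is a hypothetical stand-in so the example runs without Glazing installed, and the demo values are placeholders rather than real search output.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Result:
    """Hypothetical stand-in exposing only the attributes used in Quick Start."""
    dataset: str
    name: str
    description: str


def group_by_dataset(results):
    """Bucket results by their `dataset` attribute, preserving input order."""
    grouped = defaultdict(list)
    for r in results:
        grouped[r.dataset].append(r)
    return dict(grouped)


# Made-up placeholder results, not real search output.
demo = [
    Result("verbnet", "example-1", "placeholder"),
    Result("propbank", "example.01", "placeholder"),
    Result("verbnet", "example-2", "placeholder"),
]
for dataset, items in group_by_dataset(demo).items():
    print(f"{dataset}: {[r.name for r in items]}")
```

The same helper should work on the list returned by `search.search(...)`, provided the result objects expose a `dataset` attribute as shown in the snippet above.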

CLI Usage

Search across datasets:

# Search all datasets
glazing search query "abandon"

# Search specific dataset
glazing search query "run" --dataset verbnet

# Use fuzzy search for typos
glazing search query "giv" --fuzzy
glazing search query "instrment" --fuzzy --threshold 0.7

Resolve cross-references:

# Extract cross-reference index (one-time setup)
glazing xref extract

# Find cross-references
glazing xref resolve "give.01" --source propbank
glazing xref resolve "give-13.1" --source verbnet

# Use fuzzy matching
glazing xref resolve "giv.01" --source propbank --fuzzy

Python API

Load and work with individual datasets:

from glazing.framenet.loader import FrameNetLoader
from glazing.verbnet.loader import VerbNetLoader

# Loaders automatically use default paths and load data after 'glazing init'
fn_loader = FrameNetLoader()  # Data is already loaded
frames = fn_loader.frames

vn_loader = VerbNetLoader()  # Data is already loaded
verb_classes = list(vn_loader.classes.values())

Cross-reference resolution:

from glazing.references.index import CrossReferenceIndex

# Automatic extraction on first use (cached for future runs)
xref = CrossReferenceIndex()

# Resolve references for a PropBank roleset
refs = xref.resolve("give.01", source="propbank")
print(f"VerbNet classes: {refs['verbnet_classes']}")
print(f"Confidence scores: {refs['confidence_scores']}")

# Use fuzzy matching for typos
refs = xref.resolve("giv.01", source="propbank", fuzzy=True)
print(f"Found match with fuzzy search: {refs['verbnet_classes']}")
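
Confidence scores are most useful when they gate which mappings you keep. The following is a sketch under a stated assumption: that `refs['confidence_scores']` maps each VerbNet class ID to a float. The source does not specify that shape, so check the documentation before relying on it. The helper `filter_by_confidence` and the toy dictionary are hypothetical and use placeholder IDs.

```python
from collections.abc import Mapping


def filter_by_confidence(refs: Mapping, min_confidence: float) -> list[str]:
    """Return VerbNet class IDs whose confidence is at least min_confidence.

    Assumes refs["confidence_scores"] maps class ID -> float (an assumption,
    not confirmed by the source). IDs with no score are treated as 0.0.
    """
    scores = refs.get("confidence_scores", {})
    return [
        class_id
        for class_id in refs.get("verbnet_classes", [])
        if scores.get(class_id, 0.0) >= min_confidence
    ]


# Hand-written stand-in with placeholder IDs and scores, not real output.
toy_refs = {
    "verbnet_classes": ["class-a", "class-b"],
    "confidence_scores": {"class-a": 0.9, "class-b": 0.4},
}
kept = filter_by_confidence(toy_refs, min_confidence=0.5)
print(kept)
```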

Fuzzy search in Python:

from glazing.search import UnifiedSearch

# Use fuzzy search to handle typos
search = UnifiedSearch()
results = search.search_with_fuzzy("instrment", fuzzy_threshold=0.8)

for result in results[:5]:
    print(f"{result.dataset}: {result.name} (score: {result.score:.2f})")
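
To build intuition for what a fuzzy threshold does, here is a self-contained sketch of threshold-based matching using `difflib.SequenceMatcher` from the standard library. This is a stand-in technique: the source does not say which similarity metric Glazing uses, so scores from this sketch need not match the library's. The function `fuzzy_matches` and the toy vocabulary are made up for illustration.

```python
from difflib import SequenceMatcher


def fuzzy_matches(query, candidates, threshold=0.8):
    """Return (candidate, score) pairs scoring at or above threshold, best first."""
    scored = []
    for cand in candidates:
        # ratio() is in [0, 1]; 1.0 means the two strings are identical.
        score = SequenceMatcher(None, query.lower(), cand.lower()).ratio()
        if score >= threshold:
            scored.append((cand, score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy vocabulary; a real search would draw candidates from the datasets.
vocabulary = ["instrument", "instruct", "transfer", "give"]
hits = fuzzy_matches("instrment", vocabulary, threshold=0.8)
for word, score in hits:
    print(f"{word} (score: {score:.2f})")
```

Raising the threshold trades recall for precision: a stricter cutoff drops near misses such as "instruct" while still catching the single-character typo.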

Supported Datasets

  • FrameNet 1.7: Semantic frames and frame elements
  • PropBank 3.4: Predicate-argument structures
  • VerbNet 3.4: Verb classes with thematic roles
  • WordNet 3.1: Synsets and lexical relations

Documentation

Full documentation is available at https://glazing.readthedocs.io.

Contributing

We welcome contributions! See CONTRIBUTING.md for guidelines.

# Development setup
git clone https://github.com/aaronstevenwhite/glazing
cd glazing
pip install -e ".[dev]"

Citation

If you use Glazing in your research, please cite:

@software{glazing2025,
  author = {White, Aaron Steven},
  title = {Glazing: Unified Data Models and Interfaces for Syntactic and Semantic Frame Ontologies},
  year = {2025},
  url = {https://github.com/aaronstevenwhite/glazing},
  doi = {10.5281/zenodo.17185626}
}

License

This package is licensed under the MIT License. See the LICENSE file for details.

Acknowledgments

This project was funded by a National Science Foundation grant (BCS-2040831) and builds upon the foundational work of the FrameNet, PropBank, VerbNet, and WordNet teams.
