Unified data models and interfaces for syntactic and semantic frame ontologies
Glazing
Features
- 🚀 One-command setup: 'glazing init' downloads and prepares all datasets
- 📦 Type-safe models: Pydantic v2 validation for all data structures
- 🔍 Unified search: Query across all datasets with a consistent API
- 🔗 Cross-references: Automatic mapping between resources with confidence scores
- 🎯 Fuzzy search: Find data with typos, spelling variants, and inconsistencies
- 🐳 Docker support: Use via Docker without local installation
- 💾 Efficient storage: JSON Lines format with streaming support
- 🐍 Modern Python: Full type hints, Python 3.13+ support
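To illustrate what streaming over JSON Lines storage can look like, here is a minimal sketch using only the standard library. It is not Glazing's own reader: the helper name iter_jsonl and the record fields are hypothetical, chosen only to show that each line is parsed independently so a file never has to be loaded whole.

```python
import io
import json
from collections.abc import Iterator
from typing import Any, TextIO


def iter_jsonl(stream: TextIO) -> Iterator[dict[str, Any]]:
    """Yield one parsed record per non-blank line, reading lazily."""
    for line in stream:
        line = line.strip()
        if line:  # skip blank lines between records
            yield json.loads(line)


# Hypothetical records; the field names are illustrative, not Glazing's schema.
sample = io.StringIO(
    '{"id": "example-1", "dataset": "verbnet"}\n'
    "\n"
    '{"id": "example-2", "dataset": "propbank"}\n'
)

records = list(iter_jsonl(sample))
print(len(records), [r["dataset"] for r in records])
```

Because iter_jsonl is a generator, a caller can stop early or filter records without holding the full dataset in memory.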
Installation
Via pip
pip install glazing
Via Docker
Build and run Glazing in a containerized environment:
# Build the image
git clone https://github.com/aaronstevenwhite/glazing.git
cd glazing
docker build -t glazing:latest .
# Initialize datasets (persisted in volume)
docker run --rm -v glazing-data:/data glazing:latest init
# Use the CLI
docker run --rm -v glazing-data:/data glazing:latest search query "give"
docker run --rm -v glazing-data:/data glazing:latest search query "transfer" --fuzzy
# Interactive Python session
docker run --rm -it -v glazing-data:/data --entrypoint python glazing:latest
See the installation docs for more Docker usage examples.
Quick Start
Initialize all datasets (one-time setup, ~54MB download):
glazing init
Then start using the data:
from glazing.search import UnifiedSearch
# Automatically uses default data directory after 'glazing init'
search = UnifiedSearch()
results = search.search("give")
for result in results[:5]:
    print(f"{result.dataset}: {result.name} - {result.description}")
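Since a unified search returns hits from several datasets in one list, it can help to group them by dataset before inspecting them. The sketch below runs without Glazing installed: Hit is a hypothetical stand-in that exposes only the three attributes printed above (dataset, name, description), and the entries are placeholders rather than real search output.

```python
from collections import defaultdict
from collections.abc import Iterable
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """Hypothetical stand-in for a search result (not Glazing's class)."""

    dataset: str
    name: str
    description: str


def group_by_dataset(results: Iterable[Hit]) -> dict[str, list[Hit]]:
    """Bucket results by their dataset attribute, preserving input order."""
    grouped: dict[str, list[Hit]] = defaultdict(list)
    for result in results:
        grouped[result.dataset].append(result)
    return dict(grouped)


# Placeholder entries for illustration only.
hits = [
    Hit("verbnet", "placeholder-a", "illustrative entry"),
    Hit("framenet", "placeholder-b", "illustrative entry"),
    Hit("verbnet", "placeholder-c", "illustrative entry"),
]

grouped = group_by_dataset(hits)
for dataset, items in sorted(grouped.items()):
    print(f"{dataset}: {[hit.name for hit in items]}")
```

The same function should work on real results as long as they expose a dataset attribute, which is an assumption based on the loop above.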
CLI Usage
Search across datasets:
# Search all datasets
glazing search query "abandon"
# Search specific dataset
glazing search query "run" --dataset verbnet
# Find data with typos or spelling variants
glazing search query "realize" --fuzzy
glazing search query "organize" --fuzzy --threshold 0.8
Resolve cross-references:
# Extract cross-reference index (one-time setup)
glazing xref extract
# Find cross-references
glazing xref resolve "give.01" --source propbank
glazing xref resolve "give-13.1" --source verbnet
# Find data with variations or inconsistencies
glazing xref resolve "realize.01" --source propbank --fuzzy
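The two identifiers above have different shapes ("give.01" for a PropBank roleset, "give-13.1" for a VerbNet class), which is why the command takes a --source option. As a rough sketch, a script could guess the source from the shape alone. The patterns and the guess_source helper are hypothetical and derived only from these two examples; real datasets may contain identifiers they do not cover.

```python
import re
from typing import Optional

# Heuristic patterns based only on the identifier shapes shown above.
PROPBANK_ROLESET = re.compile(r"^[a-z_]+\.\d+$")  # lemma.NN
VERBNET_CLASS = re.compile(r"^[a-z_]+-\d+(?:\.\d+)*(?:-\d+)*$")  # name-N.N[-N]


def guess_source(identifier: str) -> Optional[str]:
    """Return 'propbank', 'verbnet', or None if the shape is not recognized."""
    if PROPBANK_ROLESET.match(identifier):
        return "propbank"
    if VERBNET_CLASS.match(identifier):
        return "verbnet"
    return None


for identifier in ("give.01", "give-13.1", "give"):
    print(identifier, "->", guess_source(identifier))
```

When the guess is None, passing --source explicitly remains the safe choice.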
Python API
Load and work with individual datasets:
from glazing.framenet.loader import FrameNetLoader
from glazing.verbnet.loader import VerbNetLoader
# Loaders automatically use default paths and load data after 'glazing init'
fn_loader = FrameNetLoader() # Data is already loaded
frames = fn_loader.frames
vn_loader = VerbNetLoader() # Data is already loaded
verb_classes = list(vn_loader.classes.values())
Cross-reference resolution:
from glazing.references.index import CrossReferenceIndex
# Automatic extraction on first use (cached for future runs)
xref = CrossReferenceIndex()
# Resolve references for a PropBank roleset
refs = xref.resolve("give.01", source="propbank")
print(f"VerbNet classes: {refs['verbnet_classes']}")
print(f"Confidence scores: {refs['confidence_scores']}")
# Find data with variations or inconsistencies
refs = xref.resolve("realize.01", source="propbank", fuzzy=True)
print(f"Found match with fuzzy search: {refs['verbnet_classes']}")
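Confidence scores are most useful when weak mappings are filtered out before further processing. The sketch below assumes the result is dict-like, with 'verbnet_classes' as a list of class IDs and 'confidence_scores' as a mapping from class ID to a float. That layout is an assumption, not confirmed by the text above, and the values are made up for illustration.

```python
from typing import Any


def confident_classes(refs: dict[str, Any], minimum: float) -> list[str]:
    """Keep only the classes whose score is at least `minimum`.

    Assumes refs['confidence_scores'] maps class ID -> float; classes
    without a score are treated as having confidence 0.0.
    """
    scores = refs["confidence_scores"]
    return [
        class_id
        for class_id in refs["verbnet_classes"]
        if scores.get(class_id, 0.0) >= minimum
    ]


# Made-up values shaped like the assumed result; not real Glazing output.
refs = {
    "verbnet_classes": ["class-a", "class-b", "class-c"],
    "confidence_scores": {"class-a": 0.95, "class-b": 0.40},
}

print(confident_classes(refs, minimum=0.8))
```

If the real return type differs, for example a model object with attributes rather than keys, only the two lookups inside the function would need to change.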
Fuzzy search in Python:
from glazing.search import UnifiedSearch
# Find data with typos or spelling variants
search = UnifiedSearch()
results = search.search_with_fuzzy("organize", fuzzy_threshold=0.8)
for result in results[:5]:
    print(f"{result.dataset}: {result.name} (score: {result.score:.2f})")
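To give a feel for what a threshold such as 0.8 admits, here is a small standalone sketch using difflib from the standard library. It is not Glazing's matching algorithm, which the text above does not specify; SequenceMatcher is swapped in purely to show how a similarity cutoff separates spelling variants from unrelated words. The helper names similarity and fuzzy_filter are hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity in [0, 1] from difflib's ratio()."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def fuzzy_filter(
    query: str, candidates: list[str], threshold: float
) -> list[tuple[str, float]]:
    """Return (candidate, score) pairs scoring at least `threshold`, best first."""
    scored = [(candidate, similarity(query, candidate)) for candidate in candidates]
    kept = [(candidate, score) for candidate, score in scored if score >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


matches = fuzzy_filter(
    "organize", ["organise", "organize", "reorganize", "give"], threshold=0.8
)
for candidate, score in matches:
    print(f"{candidate}: {score:.2f}")
```

Under this measure the British spelling "organise" clears the 0.8 cutoff while "give" does not. Raising the threshold trades recall for precision.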
Supported Datasets
- FrameNet 1.7: Semantic frames and frame elements
- PropBank 3.4: Predicate-argument structures
- VerbNet 3.4: Verb classes with thematic roles
- WordNet 3.1: Synsets and lexical relations
Documentation
Full documentation is available at https://glazing.readthedocs.io.
Contributing
We welcome contributions! See CONTRIBUTING.md for guidelines.
# Development setup
git clone https://github.com/aaronstevenwhite/glazing
cd glazing
pip install -e ".[dev]"
Citation
If you use Glazing in your research, please cite:
@software{glazing2025,
  author = {White, Aaron Steven},
  title = {Glazing: Unified Data Models and Interfaces for Syntactic and Semantic Frame Ontologies},
  year = {2025},
  url = {https://github.com/aaronstevenwhite/glazing},
  doi = {10.5281/zenodo.17185626}
}
License
This package is licensed under the MIT License. See the LICENSE file for details.
Acknowledgments
This project was funded by a National Science Foundation grant (BCS-2040831) and builds upon the foundational work of the FrameNet, PropBank, VerbNet, and WordNet teams.