Ontology-Transformer
End-to-end ontology embedding via fine-tuning sentence transformers with hyperbolic geometry and role-based rotation for existential restrictions (∃r.C).
Features
- One-line training: `OntologyTransformer.fit("ontology.owl")` → fine-tuned embeddings
- Hyperbolic space: Poincaré ball embeddings for hierarchical structures
- Role-aware existential restrictions: ∃r.C encoded via learned rotation transformations
- Automatic data preparation: Converts OWL/OFN axioms to training data (no manual preprocessing)
- Best lambda auto-tuning: Centripetal weight optimized on evaluation data and saved with model
- Flexible evaluation: Use training ontology samples or separate eval/test ontologies
Installation
From PyPI
pip install ontology-transformer
From source
git clone https://github.com/HuiYang1997/OnT.git
cd OnT
pip install -e .
Requirements
- Python ≥ 3.9
- PyTorch ≥ 2.0 (with CUDA recommended)
- `sentence-transformers`, `geoopt`, `deeponto`, `datasets`
Quick Start
1. End-to-end: OWL → Fine-tune → Embeddings
from ont import OntologyTransformer
# Train on any OWL/OFN ontology (all axioms used for training)
model = OntologyTransformer.fit(
owl_path="path/to/ontology.owl",
output_dir="./output",
num_epochs=3,
batch_size=64,
eval_ratio=0.1, # 10% of axioms sampled for evaluation
max_eval=1000, # max 1000 eval samples
)
# The best lambda (centripetal weight) is determined during training
print(f"Best lambda: {model.best_lambda}")
# Encode concepts
emb = model.encode("food product")
# Encode ∃r.C (existential restrictions) via role rotation
exist_emb = model.encode_existence("has ingredient", "sugar")
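Because the embeddings live in the Poincaré ball, hyperbolic distance is the natural way to compare them. The snippet below is a minimal sketch, assuming `encode` returns an array-like vector inside the ball; it uses `geoopt` (already a dependency) with its default curvature, which may differ from the trained model's:
import torch
import geoopt
ball = geoopt.PoincareBall()  # default curvature; the model's may differ
child = torch.as_tensor(model.encode("granulated sugar"), dtype=torch.float32)
parent = torch.as_tensor(model.encode("sugar"), dtype=torch.float32)
other = torch.as_tensor(model.encode("heart disease"), dtype=torch.float32)
# Smaller hyperbolic distance ≈ closer in the learned hierarchy
print(ball.dist(child, parent).item())  # expected: small
print(ball.dist(child, other).item())   # expected: larger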
2. Use separate ontologies for evaluation/testing
model = OntologyTransformer.fit(
owl_path="train_ontology.owl",
eval_owl_path="eval_ontology.owl", # optional: separate eval ontology
test_owl_path="test_ontology.owl", # optional: separate test ontology
output_dir="./output",
num_epochs=3,
)
3. Load a pre-trained model
from ont import OntologyTransformer
# Load model (best_lambda is automatically restored)
model = OntologyTransformer.from_pretrained("./output/final")
print(f"Loaded best_lambda: {model.best_lambda}")
# Encode
emb = model.encode("heart disease")
exist_emb = model.encode_existence("has part", "cell membrane")
4. CLI
# Basic training
ont-train --owl ontology.owl --output ./output --epochs 3
# With separate eval ontology
ont-train --owl train.owl --eval-owl eval.owl --output ./output --epochs 3
# Balanced mode (adds C_neg contrastive loss)
ont-train --owl ontology.owl --output ./output --balanced --epochs 3
Data Preparation Flow
By default (no separate eval/test ontologies):
- All axioms from the input ontology → training data (`train.jsonl`, `train_exist.jsonl`, `train_conj.jsonl`)
- 10% of axioms (max 1000) randomly sampled → evaluation data (`val.json`)
- No test split is created (unless `test_owl_path` is provided)
With external eval/test ontologies:
- `eval_owl_path`: evaluation data is prepared from this ontology
- `test_owl_path`: test evaluation is performed after training
This design ensures all available training data is used while still enabling hyperparameter tuning (best lambda) via evaluation.
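A quick way to sanity-check the prepared splits (a sketch; it assumes the files land directly in `output_dir` and does not rely on their exact JSON schema):
import json
from pathlib import Path
out = Path("./output")
for name in ["train.jsonl", "train_exist.jsonl", "train_conj.jsonl"]:
    path = out / name
    if path.exists():
        # one JSON object per line -> one training example per line
        print(f"{name}: {sum(1 for _ in path.open())} training examples")
val_path = out / "val.json"
if val_path.exists():
    print(f"val.json: {len(json.loads(val_path.read_text()))} eval samples")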
Training Modes
Non-balanced (default)
Standard contrastive loss on taxonomy + existential axioms (a rough sketch of the clustering term follows this list):
- Clustering loss: pulls the child closer to its parent
- Centripetal loss: pushes the child away from non-ancestors
- Conjunction loss: C₁ ⊓ C₂ ⊑ D
- Existential loss: ∃r.C encoded via rotation
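To make the clustering term concrete, here is a hinge-style contrastive loss in the Poincaré ball. This is an illustration of the general idea, not the package's exact implementation; the margin value is made up:
import torch
import geoopt
ball = geoopt.PoincareBall()
def clustering_loss(child, parent, negative, margin=0.1):
    # The child should sit closer to its parent than to a negative concept
    d_pos = ball.dist(child, parent)
    d_neg = ball.dist(child, negative)
    return torch.relu(d_pos - d_neg + margin)
# The centripetal term is weighted against this one by the tuned lambda:
# total_loss = clustering_term + best_lambda * centripetal_term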
Balanced
Adds an extra contrastive loss with negative concept samples (C_neg) for existential restrictions:
model = OntologyTransformer.fit(
owl_path="ontology.owl",
balanced=True,
balanced_negatives=5, # number of negative samples
)
Architecture
- Base model: `SentenceTransformer` fine-tuned in the Poincaré ball (hyperbolic space)
- Role model: a linear layer mapping role embeddings to rotation angles (rotation or transition mode)
- Existential encoding: ∃r.C = rotate(embed(C), f_r(embed(r))) (sketched below)
- Best lambda: the centripetal weight λ is optimized on eval data and saved in `wrapper_config.json`
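To illustrate the existential encoding, here is a minimal sketch of a rotation-based rotate/f_r pair: a linear role model producing one angle per coordinate pair, applied as 2-D rotations. The dimension, angle parameterization, and pairing scheme are assumptions for illustration, not the package's actual code:
import torch
import torch.nn as nn
dim = 768                       # assumed embedding dimension of the base model
f_r = nn.Linear(dim, dim // 2)  # role model: role embedding -> one angle per coordinate pair
def rotate(concept_emb, angles):
    # Rotate consecutive coordinate pairs of embed(C) by the role-derived angles
    x = concept_emb.view(-1, 2)  # (dim/2, 2) coordinate pairs
    cos, sin = torch.cos(angles), torch.sin(angles)
    out = torch.stack([cos * x[:, 0] - sin * x[:, 1],
                       sin * x[:, 0] + cos * x[:, 1]], dim=-1)
    return out.view(-1)
role_emb = torch.randn(dim)      # stand-in for embed(r)
concept_emb = torch.randn(dim)   # stand-in for embed(C)
exist_emb = rotate(concept_emb, f_r(role_emb))  # ≈ embed(∃r.C)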
Model Saving & Loading
Models are saved with:
- Base sentence transformer weights
- Role model weights (`role_model.pt`)
- Configuration (`wrapper_config.json`), including `best_lambda`
- Concept/role vocabularies
# Save
model.save("./my_model")
# Load (best_lambda automatically restored)
loaded = OntologyTransformer.from_pretrained("./my_model")
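A round-trip sanity check (a sketch, assuming `encode` returns an array-like that `numpy` can compare):
import numpy as np
emb_before = model.encode("heart disease")
emb_after = loaded.encode("heart disease")  # reloaded from ./my_model
assert np.allclose(emb_before, emb_after, atol=1e-6)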
Running Tests
# Install with test dependencies
pip install -e ".[test]"
# Run all tests
pytest tests/ -v
# Skip integration tests (large ontologies)
pytest tests/ -v -m "not integration"
# Run specific test
pytest tests/test_pipeline.py::TestPipeline::test_fit_tiny_owl -v
Examples
See the `examples/` directory for:
- Training on FoodOn, SNOMED CT, and GALEN ontologies
- Evaluating embeddings for subsumption prediction (a rough sketch follows this list)
- Using external eval/test ontologies
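As a taste of the subsumption-prediction example, the sketch below ranks candidate parents of a concept by Poincaré distance (smaller = more plausible). It assumes a trained or loaded `model` as above; `rank_parents` is a hypothetical helper, not part of the package API:
import torch
import geoopt
ball = geoopt.PoincareBall()
def rank_parents(model, child, candidates):
    # Rank candidate parents of `child` by hyperbolic distance (ascending)
    c = torch.as_tensor(model.encode(child), dtype=torch.float32)
    scores = [(cand, ball.dist(c, torch.as_tensor(model.encode(cand),
                                                  dtype=torch.float32)).item())
              for cand in candidates]
    return sorted(scores, key=lambda t: t[1])
print(rank_parents(model, "sugar", ["food product", "disease", "cell membrane"]))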
Citation
If you use this package, please cite:
@inproceedings{yang2025language,
  title={Language Models as Ontology Encoder},
  author={Yang, Hui and Chen, Jiaoyan and Horrocks, Ian},
  booktitle={International Semantic Web Conference (ISWC)},
  year={2025},
  organization={Springer}
}
GitHub: https://github.com/HuiYang1997/OnT
License
Apache License 2.0 - see LICENSE file for details.