Weaviate ORM
A Pythonic ORM-style layer for Weaviate
weaviate-orm provides a structured, object-oriented interface to the Weaviate vector database. Inspired by traditional ORM patterns, it allows you to define collections as Python classes, automatically generate Weaviate schemas, and perform CRUD and similarity-based queries without leaving the object-oriented paradigm.
Requirements
- Python 3.11+
- weaviate-client >= 4.16.0
- Weaviate server >= 1.27.0
- For examples using Ollama: a reachable Ollama instance with the "snowflake-arctic-embed2" embedding model available
Server Requirements
Weaviate 1.27.0 or higher is required. The weaviate-client >= 4.16.0 enforces this constraint. During engine initialization, the ORM performs a runtime check:
engine = Weaviate_Engine()
engine.create_all_schemas() # Raises RuntimeError if server version < 1.27.0
Why 1.27.0?
- Unified vector configuration API (Configure.Vectors.* instead of the deprecated vectorizer_config)
- Robust named vector support
- Improved schema handling and stability
If your server doesn't meet this requirement, the engine will raise a clear error:
RuntimeError: Weaviate server version 1.25.4 is not supported.
Please use Weaviate >= 1.27.0 to match the client requirements.
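The check described above can be pictured roughly as follows. This is only a minimal sketch, not the engine's actual code: `parse_version`, `check_server_version`, and `MIN_SERVER_VERSION` are hypothetical names, and the sketch assumes the server reports its version as a dotted string.

```python
import re

# Minimum server version stated in this README (hypothetical constant name).
MIN_SERVER_VERSION = (1, 27, 0)


def parse_version(version: str) -> tuple[int, int, int]:
    """Extract (major, minor, patch) from strings such as '1.27.0' or 'v1.27.3-rc.0'."""
    match = re.match(r"v?(\d+)\.(\d+)\.(\d+)", version.strip())
    if match is None:
        raise ValueError(f"Unrecognized version string: {version!r}")
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch)


def check_server_version(version: str) -> None:
    """Raise RuntimeError when the reported version is below the minimum."""
    # Tuples compare element by element, so (1, 25, 4) < (1, 27, 0).
    if parse_version(version) < MIN_SERVER_VERSION:
        required = ".".join(str(part) for part in MIN_SERVER_VERSION)
        raise RuntimeError(
            f"Weaviate server version {version} is not supported.\n"
            f"Please use Weaviate >= {required} to match the client requirements."
        )
```

Comparing integer tuples rather than raw strings avoids the pitfall that "1.9.0" sorts after "1.27.0" lexicographically.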
Features
- Declarative Schema Definition: Define collections using Python classes and descriptors for properties and cross-references.
- Dynamic Schema Generation via Metaclass: A metaclass extracts property and reference definitions to auto-generate Weaviate-compatible schema structures.
- Polymorphic References ✨: Define references to base classes to accept objects from multiple subclass collections. Includes automatic type tracking and runtime type resolution.
- Object-Oriented CRUD Operations: Create, retrieve, update, and delete data using instance methods, with no raw queries needed.
- Recursive Save & Update: Handles deeply nested references and automatically saves or updates related objects when desired.
- Flexible Engine Binding: The engine takes any connection method from the weaviate library, allowing support for local, remote, or custom configurations.
- UUID Management: Built-in support for both uuid4 and uuid5 strategies, including UUID validation and immutability enforcement.
- Optional Auto-loading of References: Cross-references can be automatically resolved into full model instances when accessed.
- Support for Near-Vector and Near-Text Queries: Leverage Weaviate's native similarity search APIs directly from your model classes.
- Strict Typing and Validation: Type enforcement and optional validation logic on properties and references using Python descriptors.
- Fully Tested: Includes both unit and integration tests with Pytest and Docker-based Weaviate setups.
Design Patterns
- Descriptor Pattern: Custom Property and Reference descriptors manage validation, casting, and schema representation of scalar and relational fields.
- Metaclass-Based Schema Introspection: A metaclass (Weaviate_Meta) dynamically collects all descriptors from a class and builds the full Weaviate schema. It also generates a dynamic __init__ constructor based on the field signatures.
- Engine Abstraction Layer: The Weaviate_Engine encapsulates connection logic and schema management. It supports any connection method compatible with the weaviate Python client (e.g., local, remote, or cloud).
- Instance-Oriented CRUD: CRUD operations (including recursive reference handling and UUID safety checks) are exposed as instance methods on models that inherit from Base_Model.
- Reference Resolution Strategy: Reference descriptors can be configured to auto-load full objects (not just UUIDs), and support both one-way and two-way references, as well as single or list cardinalities.
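To make the descriptor and metaclass patterns more concrete, here is a small self-contained sketch of how the two can work together. It does not reproduce the package's Property, Reference, or Weaviate_Meta classes: `Field`, `ModelMeta`, and the shape of the `schema()` dictionary are hypothetical and chosen only for illustration.

```python
class Field:
    """Hypothetical descriptor: casts assigned values and carries schema info."""

    def __init__(self, cast_type, description="", required=False):
        self.cast_type = cast_type
        self.description = description
        self.required = required

    def __set_name__(self, owner, name):
        # Called once per attribute while the owning class is being created.
        self.name = name
        self.storage = f"_{name}_value"

    def __get__(self, instance, owner=None):
        if instance is None:
            return self  # class-level access returns the descriptor itself
        return getattr(instance, self.storage, None)

    def __set__(self, instance, value):
        if value is None:
            if self.required:
                raise ValueError(f"{self.name} is required")
        elif not isinstance(value, self.cast_type):
            value = self.cast_type(value)  # cast to the declared type
        setattr(instance, self.storage, value)


class ModelMeta(type):
    """Hypothetical metaclass: collects Field descriptors and builds __init__."""

    def __new__(mcls, name, bases, namespace):
        cls = super().__new__(mcls, name, bases, namespace)
        fields = {}
        for base in reversed(cls.__mro__):  # base classes first, subclass last
            for attr, value in vars(base).items():
                if isinstance(value, Field):
                    fields[attr] = value
        cls._fields = fields

        def __init__(self, **kwargs):
            unknown = set(kwargs) - set(fields)
            if unknown:
                raise TypeError(f"Unknown fields: {sorted(unknown)}")
            for field_name in fields:
                setattr(self, field_name, kwargs.get(field_name))

        cls.__init__ = __init__
        return cls

    def schema(cls):
        """Return a plain dict describing the collected fields."""
        return {
            "name": cls.__name__,
            "properties": [
                {"name": n, "type": f.cast_type.__name__, "description": f.description}
                for n, f in cls._fields.items()
            ],
        }


class Paper(metaclass=ModelMeta):
    title = Field(str, description="The title of the paper", required=True)
    year = Field(int, description="Publication year")
```

With these definitions, `Paper(title="A Study", year="2024")` stores `year` as an integer, and `Paper.schema()` lists both fields in declaration order.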
Installation
You can install weaviate-orm either from PyPI or directly from the source repository.
📦 From PyPI (recommended)
pip install weaviate-orm
🛠 From Git (development and testing)
git clone https://gitlab.opencode.de/bbsr_ida_public/weaviate_orm.git
cd weaviate_orm
pip install -e .
Note: Make sure your Python environment includes a compatible Weaviate client (>= 4.16.0):
pip install "weaviate-client>=4.16.0"
Quick Start
This example uses a local Weaviate instance with text2vec_ollama as the default vectorizer, backed by a local Ollama instance that serves "snowflake-arctic-embed2" as the embedding model.
Create a Model (vector_config)
Create two related models, Paper and Author, and specify the configuration for the Weaviate schema.
from __future__ import annotations
import os
from uuid import UUID, uuid5
import datetime
from weaviate_orm.weaviate_base import Base_Model
from weaviate_orm.weaviate_property import Property
from weaviate_orm.weavitae_reference import Reference, Reference_Type
from weaviate.classes.config import Configure, DataType
# Get the host and port from environment variables
llm_host = os.getenv("LLM_HOST", "llm")
llm_port = int(os.getenv("LLM_PORT", 11434))
class Paper(Base_Model):
    # Use _vector_config (new API), which replaces the deprecated vectorizer_config
    # Protected with underscore prefix to prevent accidental overrides
    _vector_config = Configure.Vectors.text2vec_ollama(
        api_endpoint=f"http://{llm_host}:{llm_port}",  # API endpoint of the local Ollama model
        model="snowflake-arctic-embed2",  # Embedding model to use
        vectorize_collection_name=False,
    )

    title = Property(cast_type=str, description="The title of the paper", required=True, weaviate_type=DataType.TEXT, vectorize_property_name=True)
    abstract = Property(cast_type=str, description="The abstract of the paper", required=True, weaviate_type=DataType.TEXT, vectorize_property_name=True)
    pub_date = Property(cast_type=datetime.date, description="The publication date of the paper", required=True, weaviate_type=DataType.DATE, skip_vectorization=True)
    doi = Property(cast_type=str, description="The doi of the paper", required=True, weaviate_type=DataType.TEXT, skip_vectorization=True)
    author = Reference(target_collection_name="Author", auto_loading=True, description="The author of the paper", reference_type=Reference_Type.SINGLE, way_type=Reference_Type.TWOWAY, required=False, skip_validation=True)
    co_authors = Reference(target_collection_name="Author", auto_loading=False, description="The co-authors of the paper", reference_type=Reference_Type.LIST, way_type=Reference_Type.ONEWAY, required=False, skip_validation=True)

    _namespace = UUID("eb8bc242-5f59-4a47-8230-0cea6fcc1028")

    def _get_uuid_name_string(self) -> str:
        return self.doi

class Author(Base_Model):
    first_name = Property(cast_type=str, description="The first name of the author", required=True, weaviate_type=DataType.TEXT, vectorize_property_name=True)
    last_name = Property(cast_type=str, description="The last name of the author", required=True, weaviate_type=DataType.TEXT, vectorize_property_name=True)
    orc_id = Property(cast_type=str, description="The orc_id of the author", required=False, weaviate_type=DataType.TEXT, skip_vectorization=True)
    papers = Reference(target_collection_name="Paper", auto_loading=False, description="The papers of the author", reference_type=Reference_Type.LIST, way_type=Reference_Type.TWOWAY, required=False, skip_validation=True)

    _namespace = UUID("eb8bc242-5f59-4a47-8230-0cea6fcc1028")

    def _get_uuid_name_string(self) -> str:
        return f"{self.first_name} {self.last_name}"
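The `_namespace` and `_get_uuid_name_string` members suggest a name-based (uuid5) identifier. The snippet below only illustrates how uuid5 behaves, using the standard library. That the ORM combines the two values in exactly this way is an assumption, and `derive_uuid` is a hypothetical helper.

```python
from uuid import UUID, uuid5

# Namespace copied from the Paper and Author example above.
NAMESPACE = UUID("eb8bc242-5f59-4a47-8230-0cea6fcc1028")


def derive_uuid(name: str, namespace: UUID = NAMESPACE) -> UUID:
    """Return a deterministic, name-based UUID (version 5)."""
    return uuid5(namespace, name)


paper_id = derive_uuid("10.1234/weaviate-orm")

# The same namespace and name always produce the same UUID ...
assert paper_id == derive_uuid("10.1234/weaviate-orm")
# ... while a different name produces a different one.
assert paper_id != derive_uuid("10.1234/another-paper")
```

Because the identifier is a pure function of the namespace and the name string, the same object can be recognized across processes without a lookup, which is one reason to prefer uuid5 over a random uuid4 for records with a natural key such as a DOI.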
Create a Weaviate Engine, Register Models, and Create Schema
from weaviate_orm.weaviate_engine import Weaviate_Engine
from weaviate import connect_to_local, connect_to_weaviate_cloud
# Get host and ports from environment variables
host = os.getenv("WEAVIATE_HOST", "vdatabase")
port = int(os.getenv("WEAVIATE_PORT", 8080))
grpc_port = int(os.getenv("WEAVIATE_GRPC_PORT", 50051))
# Initialize the Weaviate engine using the connect_to_local method and its parameters
engine = Weaviate_Engine(connect_to_local, host=host, port=port, grpc_port=grpc_port)
# Register the models with the engine
engine.register_all_models(Paper, Author)
# Create the schema in Weaviate
engine.create_all_schemas()
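Passing `connect_to_local` together with its keyword arguments implies that the engine keeps a connection callable and calls it when a client is needed. A rough sketch of that idea, under the assumption that this is how the binding works, is shown below. `EngineSketch` and `fake_connect` are hypothetical and do not touch a real Weaviate server.

```python
from typing import Any, Callable


class EngineSketch:
    """Hypothetical engine that defers connecting to a user-supplied callable."""

    def __init__(self, connect_fn: Callable[..., Any], **connect_kwargs: Any) -> None:
        self._connect_fn = connect_fn
        self._connect_kwargs = connect_kwargs
        self._models: list[type] = []

    def register_all_models(self, *models: type) -> None:
        # Keep each model once, in registration order.
        for model in models:
            if model not in self._models:
                self._models.append(model)

    def connect(self) -> Any:
        # Any callable works here, as long as it accepts the stored keyword arguments.
        return self._connect_fn(**self._connect_kwargs)


def fake_connect(host: str, port: int, grpc_port: int) -> dict:
    """Stand-in for a real connection function; returns the settings it received."""
    return {"host": host, "port": port, "grpc_port": grpc_port}


engine_sketch = EngineSketch(fake_connect, host="localhost", port=8080, grpc_port=50051)
client = engine_sketch.connect()
```

Storing the callable instead of a ready-made client keeps the engine independent of how the connection is established, so local, remote, and cloud setups differ only in the function and arguments passed in.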
Create and Save Instances
# Example data
paper_data = {
    "title": "A Study on Weaviate ORM",
    "abstract": "This paper discusses the Weaviate ORM and its features.",
    "pub_date": datetime.datetime.now(datetime.timezone.utc),
    "doi": "10.1234/weaviate-orm",
}
author_data = {
    "first_name": "Allen",
    "last_name": "Turing",
    "orc_id": "0000-0002-1234-5678",
}
# Create an instance of author and paper
author = Author(first_name=author_data["first_name"],
                last_name=author_data["last_name"],
                orc_id=author_data["orc_id"])
paper = Paper(title=paper_data["title"],
              abstract=paper_data["abstract"],
              pub_date=paper_data["pub_date"],
              doi=paper_data["doi"],
              author=author)
# Save the author and paper to Weaviate
paper.save(include_references=True, recursive=True)
Read an Instance
# Retrieve the paper by its UUID
paper_uuid = paper.get_uuid()
paper_instance = Paper.get(paper_uuid, include_references=True)
print(f"Paper Title: {paper_instance.title}")
print(f"Author: {paper_instance.author.first_name} {paper_instance.author.last_name}")
Update an Instance
# Update the paper's title
paper_instance.title = "An Updated Study on Weaviate ORM"
paper_instance.update()
# Check the updated instance
paper_instance = Paper.get(paper_uuid, include_references=True)
print(f"Paper Title: {paper_instance.title}")
Delete an Instance
# Delete the paper instance
paper_instance.delete()
Schema Configuration Access
The ORM uses protected class-level attributes for schema configuration to prevent accidental overrides. All schema configs are prefixed with an underscore and accessed via read-only class properties.
Configuring Collections
Define schema configurations using the underscored attributes:
from weaviate.classes.config import Configure
class Article(Base_Model):
    # Class-level schema configuration (protected with underscore)
    _vector_config = Configure.Vectors.text2vec_ollama(
        api_endpoint="http://llm:11434",
        model="snowflake-arctic-embed2",
        vectorize_collection_name=False,
    )
    _description = "A collection of articles with vector embeddings"
    _inverted_index_config = None  # Optional index tuning
    _generative_config = None  # Optional generative configuration
Accessing Configurations
You can read schema configurations at both class and instance levels:
# Class-level access (read-only)
print(Article.vector_config) # Returns the configured vector
print(Article.description) # Returns "A collection of articles..."
# Instance-level access
article = Article(...)
print(article.vector_config) # Proxies to class-level config
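Read-only access at both the class and the instance level can be achieved in plain Python by placing a property on the metaclass and a proxying property on the base class. The sketch below demonstrates that mechanism. It is not taken from the package, and `ConfigMeta`, `ModelSketch`, and `ArticleSketch` are hypothetical names.

```python
class ConfigMeta(type):
    """Hypothetical metaclass exposing a protected attribute as a read-only class property."""

    @property
    def description(cls):
        return getattr(cls, "_description", None)


class ModelSketch(metaclass=ConfigMeta):
    _description = None

    @property
    def description(self):
        # Instance-level access proxies to the class-level value.
        return type(self).description


class ArticleSketch(ModelSketch):
    _description = "A collection of articles with vector embeddings"


class_level = ArticleSketch.description
instance_level = ArticleSketch().description

# Assigning through the public name fails because the property has no setter.
try:
    ArticleSketch.description = "changed"
    assignment_blocked = False
except AttributeError:
    assignment_blocked = True
```

The class-level lookup resolves to the metaclass property because a property is a data descriptor, which takes precedence over same-named attributes in the class body.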
Deprecation Path
Older code using non-underscored names will still work but emit a deprecation warning:
class LegacyModel(Base_Model):
    vector_config = Configure.Vectors.text2vec_ollama(...)  # DeprecationWarning
    description = "Legacy"  # DeprecationWarning
Migrate to the underscored versions to avoid future incompatibility:
class ModernModel(Base_Model):
    _vector_config = Configure.Vectors.text2vec_ollama(...)
    _description = "Modern"
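One way such a deprecation path can be built is for a metaclass to inspect the class body, warn about legacy names, and move their values to the underscored attributes. The sketch below shows that approach with the standard `warnings` module. It is an assumption about the mechanism rather than the package's code, and `DeprecatingMeta` and `LEGACY_NAMES` are hypothetical.

```python
import warnings

# Legacy attribute names from this README (hypothetical constant name).
LEGACY_NAMES = ("vector_config", "description")


class DeprecatingMeta(type):
    """Hypothetical metaclass that remaps legacy config names and warns about them."""

    def __new__(mcls, name, bases, namespace):
        for legacy in LEGACY_NAMES:
            if legacy in namespace:
                warnings.warn(
                    f"{name}.{legacy} is deprecated; use _{legacy} instead.",
                    DeprecationWarning,
                    stacklevel=2,  # point the warning at the class definition
                )
                # Move the value to the underscored name unless one is already set.
                namespace.setdefault(f"_{legacy}", namespace.pop(legacy))
        return super().__new__(mcls, name, bases, namespace)


# Record warnings so the effect can be inspected without printing to stderr.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")

    class LegacySketch(metaclass=DeprecatingMeta):
        description = "Legacy"
```

After the class is created, `LegacySketch._description` holds the value and `caught` contains a single DeprecationWarning, so old code keeps working while users are nudged toward the new names.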
Project Structure
weaviate_orm/
│
├── __init__.py # Public interface and versioning
├── weaviate_base.py # Base_Model with full CRUD logic and query support
├── weaviate_engine.py # Manages Weaviate client and schema creation
├── weaviate_meta.py # Metaclass for schema extraction and dynamic __init__
├── weaviate_property.py # Descriptor for scalar fields
├── weavitae_reference.py # Descriptor for references (one-way, two-way, single, list)
├── weaviate_decorators.py # Client injection and async-to-sync conversion
└── weaviate_utility.py # Helpers for validation and reference comparison
License
This project is licensed under the GNU General Public License v3.0 (GPLv3).
You are free to use, modify, and distribute this software under the terms of the GPLv3 license. Any derivative work must also be distributed under the same license.
For full details, see the LICENSE file.
Contributing & Credits
This project is created and maintained by the BBSR - IDA (Tobias Heimig-Elschner). Feel free to open issues or submit pull requests if you encounter bugs, have ideas, or want to improve the package.
📚 Citation
The project is published on Zenodo and can be cited as: Heimig, T. (2025). Weaviate ORM - (0.1.0). Bundesinstitut für Bau-, Stadt- und Raumforschung (BBSR). https://doi.org/10.58007/x1wa-rt92
Roadmap & Open Development Topics
-
Batch Operations
Add support for batched inserts and updates for high-throughput use cases. -
Generalized Query Interface
Unify near-vector, near-text, and filter queries into a common, fluent interface. -
Nested Reference Updates
Extend update logic to fully support reference deletions and new nested reference creation during .update() calls.
Testing
Unit tests
Run the unit test suite:
pytest tests/unit -q
Integration tests
Integration tests require a running Weaviate (>= 1.27.0) and, for Ollama-based examples, a reachable LLM service. Using the included docker-compose setup:
# From repository root
docker-compose build vdatabase llm
docker-compose up -d vdatabase llm
# Run integration tests once services are healthy
pytest tests/integration -q -m integration
If you maintain your own Weaviate instance, set these environment variables so the engine can connect:
export WEAVIATE_HOST=vdatabase
export WEAVIATE_PORT=8080
export WEAVIATE_GRPC_PORT=50051
Migration note
The Weaviate Python client has deprecated vectorizer_config in favor of vector_config. This project now uses vector_config everywhere, including named vectors (e.g., Configure.Vectors.text2vec_ollama(...)). Ensure your server and client meet the versions above to avoid startup or schema warnings.