Ennoia
A framework for LLM-powered document pre-indexing and hybrid retrieval.
Ennoia treats indexing as a first-class problem. Instead of embedding raw text and hoping vector similarity recovers relevance, you declare extraction schemas (Pydantic models for structured metadata, marker classes for semantic summaries), and Ennoia runs an LLM-driven DAG over each document to produce rich, filterable indices. At query time, retrieval runs in two phases: structured filters narrow the candidate set, then vector search ranks within it.
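The two-phase retrieval described above can be illustrated with a small, library-free sketch. This is not Ennoia's implementation: `IndexedDoc`, `cosine`, and `hybrid_search` are hypothetical names, the filters are assumed to be exact-match only, and the embeddings are toy two-dimensional vectors.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class IndexedDoc:
    """Hypothetical stand-in for one indexed document."""
    source_id: str
    metadata: dict[str, str]   # structured fields extracted at index time
    embedding: list[float]     # vector for the document's semantic summary


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def hybrid_search(
    docs: list[IndexedDoc],
    query_vec: list[float],
    filters: dict[str, str],
    top_k: int = 5,
) -> list[str]:
    # Phase 1: structured filters narrow the candidate set (exact match assumed).
    candidates = [
        d for d in docs
        if all(d.metadata.get(key) == value for key, value in filters.items())
    ]
    # Phase 2: vector similarity ranks only the surviving candidates.
    ranked = sorted(
        candidates,
        key=lambda d: cosine(query_vec, d.embedding),
        reverse=True,
    )
    return [d.source_id for d in ranked[:top_k]]


docs = [
    IndexedDoc("doc_001", {"category": "legal"}, [0.9, 0.1]),
    IndexedDoc("doc_002", {"category": "medical"}, [1.0, 0.0]),
    IndexedDoc("doc_003", {"category": "legal"}, [0.2, 0.8]),
]
result = hybrid_search(docs, [1.0, 0.0], {"category": "legal"})
print(result)
```

Here `doc_002` is the closest vector to the query, but it never reaches the ranking step because the `category` filter removes it first. That is the point of filtering before ranking: similarity alone cannot override a structured constraint.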
Install
pip install "ennoia[ollama,sentence-transformers,cli]"
Available extras: ollama, openai, anthropic, sentence-transformers, filesystem (Parquet + NumPy stores), cli (ennoia CLI), all (everything above).
Quick start (SDK)
from datetime import date
from typing import Literal

from ennoia import BaseSemantic, BaseStructure, Pipeline, Store
from ennoia.adapters.embedding.sentence_transformers import SentenceTransformerEmbedding
from ennoia.adapters.llm.ollama import OllamaAdapter
from ennoia.store import InMemoryStructuredStore, InMemoryVectorStore


class DocMeta(BaseStructure):
    """Extract basic document metadata."""

    category: Literal["legal", "medical", "financial"]
    doc_date: date


class Summary(BaseSemantic):
    """What is the main topic of this document?"""


pipeline = Pipeline(
    schemas=[DocMeta, Summary],
    store=Store(vector=InMemoryVectorStore(), structured=InMemoryStructuredStore()),
    llm=OllamaAdapter(model="qwen3:0.6b"),
    embedding=SentenceTransformerEmbedding(model="all-MiniLM-L6-v2"),
)

pipeline.index(text="The court held that...", source_id="doc_001")

results = pipeline.search(
    query="court holdings on liability",
    filters={"category": "legal"},
    top_k=5,
)
See docs/quickstart.md for the full walkthrough.
Quick start (CLI)
# Iterate on a schema against a single document
ennoia try ./sample.txt --schema my_schemas.py

# Index a folder into a filesystem-backed store
ennoia index ./docs \
    --schema my_schemas.py \
    --store ./my_index \
    --llm ollama:qwen3:0.6b \
    --embedding sentence-transformers:all-MiniLM-L6-v2

# Hybrid search
ennoia search "employer duty to accommodate disability" \
    --schema my_schemas.py \
    --store ./my_index \
    --filter "jurisdiction=WA" \
    --filter "date_decided__gte=2020-01-01" \
    --top-k 5
See docs/cli.md.
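The `--filter` values in the search command use a `field=value` form and a `field__gte=value` form. The sketch below shows one way such expressions could be interpreted. It is an assumption, not Ennoia's parser: `parse_filter`, `matches`, and the `OPS` table are hypothetical, and only plain `=` and `__gte` appear in the example above; the other operator suffixes are guesses by analogy.

```python
import operator
from datetime import date

# Hypothetical operator table; only equality and "gte" are confirmed by the CLI example.
OPS = {
    "eq": operator.eq,
    "gt": operator.gt,
    "gte": operator.ge,
    "lt": operator.lt,
    "lte": operator.le,
}


def parse_filter(expr: str) -> tuple[str, str, str]:
    """Split 'field__op=value' into (field, op, raw_value); op defaults to 'eq'."""
    key, _, raw = expr.partition("=")
    field, sep, op = key.partition("__")
    return field, (op if sep else "eq"), raw


def matches(record: dict, expr: str) -> bool:
    """Check one filter expression against a record of extracted metadata."""
    field, op, raw = parse_filter(expr)
    stored = record[field]
    # Coerce the raw string to the stored value's type so comparisons are typed.
    if isinstance(stored, date):
        target = date.fromisoformat(raw)
    else:
        target = type(stored)(raw)
    return OPS[op](stored, target)


record = {"jurisdiction": "WA", "date_decided": date(2021, 3, 15)}
filters = ["jurisdiction=WA", "date_decided__gte=2020-01-01"]
print(all(matches(record, f) for f in filters))
```

Coercing the filter value to the stored field's type means the date comparison is done on `date` objects rather than strings, so it does not depend on the text format sorting correctly.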
License
Apache 2.0. See LICENSE.txt and NOTICE.