Ennoia

A framework for LLM-powered document pre-indexing and hybrid retrieval.

Ennoia treats indexing as a first-class problem. Instead of embedding raw text and hoping vector similarity recovers relevance, you declare extraction schemas (Pydantic models for structured metadata, marker classes for semantic summaries), and Ennoia runs an LLM-driven DAG over each document to produce rich, filterable indices. At query time, retrieval runs in two phases: structured filters narrow the candidate set, then vector search ranks within it.
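To make the "DAG over each document" idea concrete, here is a minimal sketch using only the standard library. It is not Ennoia's implementation: the step names, the `DEPENDENCIES` mapping, and the stub extractors (which stand in for LLM calls) are all hypothetical, and how Ennoia itself declares dependencies between schemas is not shown in this README.

```python
from graphlib import TopologicalSorter

# Hypothetical extraction steps; each stub stands in for an LLM call.
def extract_meta(text, upstream):
    return {"category": "legal" if "court" in text.lower() else "financial"}

def extract_summary(text, upstream):
    # Depends on the metadata step, so it can condition on the category.
    return f"{upstream['meta']['category']} document: {text[:40]}"

# Map each step to the set of steps it depends on (assumed layout).
DEPENDENCIES = {"meta": set(), "summary": {"meta"}}
STEPS = {"meta": extract_meta, "summary": extract_summary}

def run_dag(text):
    """Run every step after its dependencies and collect the outputs."""
    results = {}
    for name in TopologicalSorter(DEPENDENCIES).static_order():
        results[name] = STEPS[name](text, results)
    return results

index_entry = run_dag("The court held that the employer was liable.")
```

The topological order guarantees that `summary` only runs once `meta` has produced its output, which is the property a DAG-based indexer relies on.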

Install

pip install "ennoia[ollama,sentence-transformers,cli]"

Available extras: ollama, openai, anthropic, sentence-transformers, filesystem (Parquet + NumPy stores), cli (ennoia CLI), all (everything above).

Quick start (SDK)

from datetime import date
from typing import Literal

from ennoia import BaseSemantic, BaseStructure, Pipeline, Store
from ennoia.adapters.embedding.sentence_transformers import SentenceTransformerEmbedding
from ennoia.adapters.llm.ollama import OllamaAdapter
from ennoia.store import InMemoryStructuredStore, InMemoryVectorStore


class DocMeta(BaseStructure):
    """Extract basic document metadata."""

    category: Literal["legal", "medical", "financial"]
    doc_date: date


class Summary(BaseSemantic):
    """What is the main topic of this document?"""


pipeline = Pipeline(
    schemas=[DocMeta, Summary],
    store=Store(vector=InMemoryVectorStore(), structured=InMemoryStructuredStore()),
    llm=OllamaAdapter(model="qwen3:0.6b"),
    embedding=SentenceTransformerEmbedding(model="all-MiniLM-L6-v2"),
)

pipeline.index(text="The court held that...", source_id="doc_001")
results = pipeline.search(
    query="court holdings on liability",
    filters={"category": "legal"},
    top_k=5,
)

See docs/quickstart.md for the full walkthrough.
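The search call above passes both a structured filter and a free-text query. As a rough illustration of the two-phase idea (filter first, then rank by vector similarity), the following standard-library sketch uses a toy in-memory index with made-up 3-dimensional vectors. The `INDEX` layout and the `hybrid_search` helper are assumptions for illustration, not Ennoia's internals.

```python
import math

# Toy index: structured metadata plus a made-up embedding per document.
INDEX = [
    {"source_id": "doc_001", "meta": {"category": "legal"}, "vector": [0.9, 0.1, 0.0]},
    {"source_id": "doc_002", "meta": {"category": "medical"}, "vector": [0.8, 0.2, 0.1]},
    {"source_id": "doc_003", "meta": {"category": "legal"}, "vector": [0.1, 0.9, 0.2]},
]

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def hybrid_search(query_vector, filters, top_k):
    # Phase 1: keep only documents whose metadata matches every filter.
    candidates = [
        d for d in INDEX
        if all(d["meta"].get(k) == v for k, v in filters.items())
    ]
    # Phase 2: rank the survivors by similarity to the query vector.
    ranked = sorted(
        candidates,
        key=lambda d: cosine(query_vector, d["vector"]),
        reverse=True,
    )
    return [d["source_id"] for d in ranked[:top_k]]

hits = hybrid_search([1.0, 0.0, 0.0], {"category": "legal"}, top_k=5)
```

Note that `doc_002` is closer to the query vector than `doc_003`, yet it never reaches the ranking step because the filter removes it first.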

Quick start (CLI)

# Iterate on a schema against a single document
ennoia try ./sample.txt --schema my_schemas.py

# Index a folder into a filesystem-backed store
ennoia index ./docs \
  --schema my_schemas.py \
  --store ./my_index \
  --llm ollama:qwen3:0.6b \
  --embedding sentence-transformers:all-MiniLM-L6-v2

# Hybrid search
ennoia search "employer duty to accommodate disability" \
  --schema my_schemas.py \
  --store ./my_index \
  --filter "jurisdiction=WA" \
  --filter "date_decided__gte=2020-01-01" \
  --top-k 5

See docs/cli.md.
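The `--filter` values above follow a `field=value` or `field__op=value` shape. The sketch below shows one way such expressions could be parsed and applied to a record. It is an illustration only: the `parse_filter` and `matches` helpers are hypothetical, and of the operator suffixes only `gte` appears in this README (`eq` and `lte` are assumptions). See docs/cli.md for the operators Ennoia actually supports.

```python
import operator
from datetime import date

# Assumed operator suffixes; only `gte` appears in the CLI example.
OPERATORS = {"eq": operator.eq, "gte": operator.ge, "lte": operator.le}

def parse_filter(expression):
    """Split 'field__op=value' into (field, op, value); op defaults to 'eq'."""
    key, _, value = expression.partition("=")
    field, sep, op = key.rpartition("__")
    if not sep or op not in OPERATORS:
        # No recognised suffix: treat the whole key as the field name.
        field, op = key, "eq"
    return field, op, value

def matches(record, expression):
    field, op, raw = parse_filter(expression)
    actual = record[field]
    # Coerce the raw string to a date when the stored value is a date.
    expected = date.fromisoformat(raw) if isinstance(actual, date) else raw
    return OPERATORS[op](actual, expected)

record = {"jurisdiction": "WA", "date_decided": date(2021, 6, 1)}
ok = all(
    matches(record, f)
    for f in ["jurisdiction=WA", "date_decided__gte=2020-01-01"]
)
```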

Documentation

License

Apache 2.0. See LICENSE.txt and NOTICE.


Download files

Source Distribution

ennoia-0.1.0.tar.gz (171.3 kB view details)

Uploaded Source

Built Distribution

ennoia-0.1.0-py3-none-any.whl (51.9 kB view details)

Uploaded Python 3

File details

Details for the file ennoia-0.1.0.tar.gz.

File metadata

  • Download URL: ennoia-0.1.0.tar.gz
  • Upload date:
  • Size: 171.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for ennoia-0.1.0.tar.gz
Algorithm    Hash digest
SHA256       87777d43edb60aeb4928a79867fb02dad96d0f2c0cc994aa1452e68ebc38f0ed
MD5          96738cee5092698b87c960a90f7aac2a
BLAKE2b-256  f988d65899fa1711996c6954ad74fa63696f7f72253c4804f99d54de33be3386

Provenance

The following attestation bundles were made for ennoia-0.1.0.tar.gz:

Publisher: release.yml on vunone/ennoia

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file ennoia-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: ennoia-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 51.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for ennoia-0.1.0-py3-none-any.whl
Algorithm    Hash digest
SHA256       aab76c49b917233070ccb20a5ac743c346b70efae5884fd9ec3af8321cc0a32b
MD5          c224cc19067d88cbc4fcb679c5829d07
BLAKE2b-256  69ad511c839d13e22ee2288160cf88f257cb2f38e6fd42b4f94a5378bec2a0ab

Provenance

The following attestation bundles were made for ennoia-0.1.0-py3-none-any.whl:

Publisher: release.yml on vunone/ennoia

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
