GMS-powered document ingestion, query, and learning — DocGMS

knowlytix-knowledge

Geometric expert system with LLM-augmented learning. Ingest documents, query a geometric memory back-end, verify LLM answers against provable graph traversals, and grow the store over time.

knowlytix-knowledge is one of four packages in the Geometric Memory Systems family. It pairs a structured geometric knowledge store (knowlytix-core) with LLM reasoning, and writes verified outputs back into the store so the expert system improves with use.

  • Package: knowlytix-knowledge
  • License: Apache-2.0
  • Python: 3.12+
  • Status: alpha (v0.x)

Install

pip install knowlytix-knowledge

knowlytix-knowledge depends on knowlytix-core (pinned ~=0.1.0 under lockstep versioning) and routes every LLM call through LiteLLM — configure the provider of your choice (Anthropic, OpenAI, Bedrock, Azure, Ollama, …) via environment variables.

Quickstart

Ingest a document, train the geometric store, and run a query end-to-end. The snippet below uses the smoke-test fixture shipped in the wheel, so no external data is required.

import torch
from importlib.resources import files

from knowlytix.knowledge import (
    DocGMSConfig, GMSExpertStore, QueryEngine,
    ingest_document,
)
from knowlytix.knowledge.llm_backend import create_backend

# 1. Config + LLM backend (reads GMS_LLM_MODEL + provider key from env).
config = DocGMSConfig(store_path="./my_store")
llm = create_backend(config.convert)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# 2. Fresh store + ingest the bundled sample document.
store = GMSExpertStore(config, device=device)
sample = files("knowlytix.knowledge.fixtures.smoke") / "sample.md"
ingest_document(store, str(sample), llm, config, device)

# 3. Query the expert system.
engine = QueryEngine(store, llm, config)
result = engine.query("Which divisions report to the Chief Operations Officer?")
print(result.answer)
print(f"source={result.source} confidence={result.confidence:.2f}")

The first ingestion trains the geometric knowledge graph from the document's triples. Subsequent ingestions grow the store incrementally.

Configuration

All tuning knobs are read from environment variables (12-factor style). Copy .env.example from the repo and override only what you need.

DOCGMS_* — ingestion and verification

Variable                    Default          Meaning
DOCGMS_MAX_PAGES            1000             Per-document page ceiling for PDF ingestion.
DOCGMS_CHUNK_SIZE           2048             Token budget per chunk sent to the LLM.
DOCGMS_N_STEPS              8                Reasoning steps the verifier takes per query.
DOCGMS_FREEZE_EXISTING      false            If true, ingestion never overwrites existing nodes.
DOCGMS_CONTRADICTION_GATE   true             Reject LLM outputs that contradict the current store.
DOCGMS_AUTO_LEARN           true             Persist verified LLM outputs back into the store.
DOCGMS_STORE_PATH           ./docgms_store   Default on-disk store location.
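
As an illustration of how the table above maps onto typed settings, the following is a minimal sketch of an environment loader. It is not the package's own DocGMSSettings implementation: the IngestSettings class, the load_settings helper, and the accepted boolean spellings are all assumptions made for this example. Only the variable names and defaults come from the table.

```python
import os
from dataclasses import dataclass

# Assumed boolean spellings; the package may accept a different set.
_TRUE = {"1", "true", "yes", "on"}
_FALSE = {"0", "false", "no", "off"}


def _env_bool(env, name, default):
    """Parse a boolean variable, falling back to the default when unset."""
    raw = env.get(name)
    if raw is None or raw.strip() == "":
        return default
    value = raw.strip().lower()
    if value in _TRUE:
        return True
    if value in _FALSE:
        return False
    raise ValueError(f"{name} must be a boolean, got {raw!r}")


def _env_int(env, name, default):
    """Parse an integer variable, falling back to the default when unset."""
    raw = env.get(name)
    if raw is None or raw.strip() == "":
        return default
    return int(raw)


@dataclass(frozen=True)
class IngestSettings:
    """Hypothetical container mirroring the DOCGMS_* table."""
    max_pages: int
    chunk_size: int
    n_steps: int
    freeze_existing: bool
    contradiction_gate: bool
    auto_learn: bool
    store_path: str


def load_settings(env=None):
    """Build settings from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    return IngestSettings(
        max_pages=_env_int(env, "DOCGMS_MAX_PAGES", 1000),
        chunk_size=_env_int(env, "DOCGMS_CHUNK_SIZE", 2048),
        n_steps=_env_int(env, "DOCGMS_N_STEPS", 8),
        freeze_existing=_env_bool(env, "DOCGMS_FREEZE_EXISTING", False),
        contradiction_gate=_env_bool(env, "DOCGMS_CONTRADICTION_GATE", True),
        auto_learn=_env_bool(env, "DOCGMS_AUTO_LEARN", True),
        store_path=env.get("DOCGMS_STORE_PATH", "./docgms_store"),
    )
```

Passing a plain dict instead of os.environ, for example load_settings({"DOCGMS_CHUNK_SIZE": "512"}), makes an override easy to try without touching the process environment.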

GMS_LLM_* — LLM routing (from knowlytix-core)

Variable                  Meaning
GMS_LLM_MODEL             Base LiteLLM model string (e.g. anthropic/claude-opus-4-6, openai/gpt-4o-mini, ollama/llama3). Required unless overridden per-purpose.
GMS_LLM_MODEL_JUDGE       Override for verifier / judge calls.
GMS_LLM_MODEL_SCORER      Override for scoring.
GMS_LLM_TIMEOUT_SECONDS   Per-call timeout. Default 60.
GMS_LLM_MAX_RETRIES       Retry count on transient provider errors. Default 2.
GMS_LLM_TEMPERATURE       Sampling temperature. Default 0.0.
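
The per-purpose overrides can be read as a simple fallback chain: use GMS_LLM_MODEL_<PURPOSE> when it is set, otherwise GMS_LLM_MODEL. The sketch below shows that reading. The resolve_model function is hypothetical and the fallback order is an assumption inferred from the table, not the routing code in knowlytix-core.

```python
import os


def resolve_model(purpose=None, env=None):
    """Return the model string for a purpose such as "judge" or "scorer".

    Assumed rule: a non-empty GMS_LLM_MODEL_<PURPOSE> wins; otherwise the
    base GMS_LLM_MODEL is used. Raises if neither is available.
    """
    env = os.environ if env is None else env
    if purpose:
        override = env.get(f"GMS_LLM_MODEL_{purpose.upper()}")
        if override:
            return override
    base = env.get("GMS_LLM_MODEL")
    if not base:
        raise RuntimeError(
            "Set GMS_LLM_MODEL or a per-purpose override for this call."
        )
    return base
```

With GMS_LLM_MODEL=openai/gpt-4o-mini and GMS_LLM_MODEL_JUDGE=anthropic/claude-opus-4-6, this sketch would route judge calls to the Anthropic model and everything else to the OpenAI one.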

Provider API keys

Set the variables for exactly one provider, the one that matches your GMS_LLM_MODEL:

  • Anthropic: ANTHROPIC_API_KEY
  • OpenAI: OPENAI_API_KEY
  • Bedrock: AWS_ACCESS_KEY_ID + AWS_SECRET_ACCESS_KEY + AWS_REGION
  • Azure: AZURE_API_KEY + AZURE_API_BASE + AZURE_API_VERSION
  • Ollama: OLLAMA_BASE_URL (no key — runs locally)

Missing keys surface an actionable LLMConfigError naming the exact env vars required for your selected model.
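
To check a deployment before the first LLM call, the provider list above can be turned into a small preflight test. The sketch below is an illustration only: missing_provider_vars is a hypothetical helper, and deriving the provider from the prefix before the first "/" in the model string is an assumption. The package performs its own validation and raises LLMConfigError itself.

```python
import os

# Required variables per provider, copied from the list above.
REQUIRED_ENV = {
    "anthropic": ("ANTHROPIC_API_KEY",),
    "openai": ("OPENAI_API_KEY",),
    "bedrock": ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_REGION"),
    "azure": ("AZURE_API_KEY", "AZURE_API_BASE", "AZURE_API_VERSION"),
    "ollama": ("OLLAMA_BASE_URL",),
}


def missing_provider_vars(model, env=None):
    """List the required variables that are unset or empty for a model.

    Assumes the provider is the prefix before the first "/" in the model
    string, e.g. "openai" in "openai/gpt-4o-mini".
    """
    env = os.environ if env is None else env
    provider = model.split("/", 1)[0]
    required = REQUIRED_ENV.get(provider)
    if required is None:
        raise ValueError(f"No known variable set for provider {provider!r}")
    return [name for name in required if not env.get(name)]
```

An empty list means every variable for the selected provider is present. A non-empty list names what still needs to be exported.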

Public API

Import from the top-level package (see __all__):

from knowlytix.knowledge import (
    ConvertConfig, DocGMSConfig, DocGMSSettings,
    GMSExpertStore, IngestResult, QueryEngine, QueryResult,
    ingest_document,
)

Anything outside __all__ is internal and may change without notice. The knowlytix.knowledge.mcp_server, knowlytix.knowledge.web_agent, and knowlytix.knowledge.cli modules live in the source repo but are not shipped in the wheel.

Related packages

Package               Role
knowlytix-core        Geometric memory engine (required runtime dep)
knowlytix-benchmark   Benchmark harness for structured retrieval
knowlytix-harness     DOE-driven black-box testing + runtime governance

Links

  • Source: knowlytix/gms
  • Book: Geometric Memory Systems (forthcoming)
  • Paper: DocGMS: Geometric Expert Systems with LLM-Augmented Learning

Download files

Source distribution: none available for this release.

Built distribution: knowlytix_knowledge-0.0.2-py3-none-any.whl (4.0 kB, Python 3)

Hashes for knowlytix_knowledge-0.0.2-py3-none-any.whl

Algorithm     Hash digest
SHA256        600244a73bc92a3bb7f39e089649c8c10f2998957ab4f8472b3e6df65cb83ec1
MD5           f044939648286f9dacf6d7e7de120bc4
BLAKE2b-256   2c43926c5e8f7961cc3abd0e473575503c95b202058c4c25305f5406ebc98a71
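
The published SHA256 digest can be compared against a locally downloaded copy of the wheel. The sketch below uses only the standard library. The sha256_of and matches_published_hash helpers are hypothetical names for this example, and the file path passed in is whatever location the wheel was saved to.

```python
import hashlib
from pathlib import Path

# SHA256 digest published for knowlytix_knowledge-0.0.2-py3-none-any.whl.
EXPECTED_SHA256 = (
    "600244a73bc92a3bb7f39e089649c8c10f2998957ab4f8472b3e6df65cb83ec1"
)


def sha256_of(path, chunk_size=1 << 20):
    """Stream a file through SHA256 and return the hex digest."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected=EXPECTED_SHA256):
    """True when the file's SHA256 equals the expected digest."""
    return sha256_of(path) == expected.lower()
```

pip can enforce the same check at install time through its hash-checking mode, which avoids a separate verification step.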
