GMS-powered document ingestion, query, and learning — DocGMS
knowlytix-knowledge
Geometric expert system with LLM-augmented learning. Ingest documents, query a geometric memory back-end, verify LLM answers against provable graph traversals, and grow the store over time.
knowlytix-knowledge is one of four packages in the Geometric Memory Systems
family. It pairs a structured geometric knowledge store (knowlytix-core) with LLM
reasoning, and writes verified outputs back into the store so the expert
system improves with use.
- Package: knowlytix-knowledge
- License: Apache-2.0
- Python: 3.12+
- Status: alpha (v0.x)
Install
pip install knowlytix-knowledge
knowlytix-knowledge depends on knowlytix-core (pinned ~=0.1.0
under lockstep versioning) and routes every LLM call through
LiteLLM — configure the provider of your choice (Anthropic,
OpenAI, Bedrock, Azure, Ollama, …) via environment variables.
Quickstart
Ingest a document, train the geometric store, and run a query end-to-end. The snippet below uses the smoke-test fixture shipped in the wheel, so no external data is required.
import torch
from importlib.resources import files
from knowlytix.knowledge import (
    DocGMSConfig, GMSExpertStore, QueryEngine,
    ingest_document,
)
from knowlytix.knowledge.llm_backend import create_backend
# 1. Config + LLM backend (reads GMS_LLM_MODEL + provider key from env).
config = DocGMSConfig(store_path="./my_store")
llm = create_backend(config.convert)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
# 2. Fresh store + ingest the bundled sample document.
store = GMSExpertStore(config, device=device)
sample = files("knowlytix.knowledge.fixtures.smoke") / "sample.md"
ingest_document(store, str(sample), llm, config, device)
# 3. Query the expert system.
engine = QueryEngine(store, llm, config)
result = engine.query("Which divisions report to the Chief Operations Officer?")
print(result.answer)
print(f"source={result.source} confidence={result.confidence:.2f}")
The first ingestion trains the geometric knowledge graph from the document's triples. Subsequent ingestions grow the store incrementally.
Configuration
All tuning knobs are read from environment variables (12-factor style). Copy
.env.example from the repo and override only what you need.
DOCGMS_* — ingestion and verification
| Variable | Default | Meaning |
|---|---|---|
| DOCGMS_MAX_PAGES | 1000 | Per-document page ceiling for PDF ingestion. |
| DOCGMS_CHUNK_SIZE | 2048 | Token budget per chunk sent to the LLM. |
| DOCGMS_N_STEPS | 8 | Reasoning steps the verifier takes per query. |
| DOCGMS_FREEZE_EXISTING | false | If true, ingestion never overwrites existing nodes. |
| DOCGMS_CONTRADICTION_GATE | true | Reject LLM outputs that contradict the current store. |
| DOCGMS_AUTO_LEARN | true | Persist verified LLM outputs back into the store. |
| DOCGMS_STORE_PATH | ./docgms_store | Default on-disk store location. |
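As a rough illustration of how the variables in the table above could map onto typed settings, the sketch below parses them from a mapping and falls back to the documented defaults. `IngestSettings` and `load_settings` are hypothetical names made up for this example; this is not the package's own `DocGMSSettings` implementation, and the accepted boolean spellings are an assumption.

```python
import os
from dataclasses import dataclass
from typing import Callable, Mapping, TypeVar

T = TypeVar("T")

# Assumed boolean spellings; the real package may accept a different set.
_TRUE = {"1", "true", "yes", "on"}
_FALSE = {"0", "false", "no", "off"}


def _as_bool(raw: str) -> bool:
    value = raw.strip().lower()
    if value in _TRUE:
        return True
    if value in _FALSE:
        return False
    raise ValueError(f"not a boolean: {raw!r}")


@dataclass(frozen=True)
class IngestSettings:
    """Hypothetical container; defaults copied from the table above."""
    max_pages: int = 1000
    chunk_size: int = 2048
    n_steps: int = 8
    freeze_existing: bool = False
    contradiction_gate: bool = True
    auto_learn: bool = True
    store_path: str = "./docgms_store"


def load_settings(env: Mapping[str, str] = os.environ) -> IngestSettings:
    """Read DOCGMS_* values from env, using the default when a key is absent."""
    d = IngestSettings()

    def get(name: str, cast: Callable[[str], T], default: T) -> T:
        raw = env.get(f"DOCGMS_{name}")
        return default if raw is None else cast(raw)

    return IngestSettings(
        max_pages=get("MAX_PAGES", int, d.max_pages),
        chunk_size=get("CHUNK_SIZE", int, d.chunk_size),
        n_steps=get("N_STEPS", int, d.n_steps),
        freeze_existing=get("FREEZE_EXISTING", _as_bool, d.freeze_existing),
        contradiction_gate=get("CONTRADICTION_GATE", _as_bool, d.contradiction_gate),
        auto_learn=get("AUTO_LEARN", _as_bool, d.auto_learn),
        store_path=get("STORE_PATH", str, d.store_path),
    )
```

Passing a plain dict instead of `os.environ` keeps the parser easy to exercise in tests, for example `load_settings({"DOCGMS_CHUNK_SIZE": "512"})`.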
GMS_LLM_* — LLM routing (from knowlytix-core)
| Variable | Meaning |
|---|---|
| GMS_LLM_MODEL | Base LiteLLM model string (e.g. anthropic/claude-opus-4-6, openai/gpt-4o-mini, ollama/llama3). Required unless overridden per-purpose. |
| GMS_LLM_MODEL_JUDGE | Override for verifier / judge calls. |
| GMS_LLM_MODEL_SCORER | Override for scoring. |
| GMS_LLM_TIMEOUT_SECONDS | Per-call timeout. Default 60. |
| GMS_LLM_MAX_RETRIES | Retry count on transient provider errors. Default 2. |
| GMS_LLM_TEMPERATURE | Sampling temperature. Default 0.0. |
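The per-purpose overrides suggest a simple fallback order: use GMS_LLM_MODEL_<PURPOSE> when it is set, otherwise fall back to GMS_LLM_MODEL. The sketch below shows one way that lookup might work. `resolve_model` is a hypothetical helper written for this example, not a function exported by knowlytix-core, and the real resolution logic may differ.

```python
import os
from typing import Mapping, Optional


def resolve_model(
    purpose: Optional[str] = None,
    env: Mapping[str, str] = os.environ,
) -> str:
    """Return the model string for a purpose such as "judge" or "scorer".

    Assumed order: per-purpose override first, then the base model.
    """
    if purpose:
        override = env.get(f"GMS_LLM_MODEL_{purpose.upper()}")
        if override:
            return override
    base = env.get("GMS_LLM_MODEL")
    if not base:
        raise KeyError(
            "GMS_LLM_MODEL is not set and no per-purpose override was found"
        )
    return base
```

With `GMS_LLM_MODEL=ollama/llama3` and `GMS_LLM_MODEL_JUDGE=openai/gpt-4o-mini`, this sketch would route judge calls to the OpenAI model and everything else to the local Ollama model.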
Provider API keys
Set exactly one group of variables, the one matching your GMS_LLM_MODEL:
- Anthropic: ANTHROPIC_API_KEY
- OpenAI: OPENAI_API_KEY
- Bedrock: AWS_ACCESS_KEY_ID + AWS_SECRET_ACCESS_KEY + AWS_REGION
- Azure: AZURE_API_KEY + AZURE_API_BASE + AZURE_API_VERSION
- Ollama: OLLAMA_BASE_URL (no key; runs locally)
Missing keys surface an actionable LLMConfigError naming the exact env vars
required for your selected model.
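A pre-flight check along these lines can be sketched in a few lines of standard-library Python. The code below is an illustration only: `REQUIRED_KEYS`, `missing_keys`, `check_provider_keys`, and `MissingProviderKeys` are hypothetical names, the exception is a local stand-in rather than the package's LLMConfigError, and deriving the provider from the prefix before the first `/` is an assumption based on the model strings shown above.

```python
import os
from typing import Mapping

# Provider prefix -> env vars listed in this README for that provider.
REQUIRED_KEYS: dict[str, tuple[str, ...]] = {
    "anthropic": ("ANTHROPIC_API_KEY",),
    "openai": ("OPENAI_API_KEY",),
    "bedrock": ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_REGION"),
    "azure": ("AZURE_API_KEY", "AZURE_API_BASE", "AZURE_API_VERSION"),
    "ollama": ("OLLAMA_BASE_URL",),
}


class MissingProviderKeys(RuntimeError):
    """Local stand-in for an actionable configuration error."""


def missing_keys(model: str, env: Mapping[str, str]) -> list[str]:
    """List the required env vars that are unset or empty for this model."""
    provider = model.split("/", 1)[0]
    required = REQUIRED_KEYS.get(provider)
    if required is None:
        raise ValueError(f"unknown provider prefix: {provider!r}")
    return [name for name in required if not env.get(name)]


def check_provider_keys(model: str, env: Mapping[str, str] = os.environ) -> None:
    """Raise with the exact variable names if anything is missing."""
    missing = missing_keys(model, env)
    if missing:
        raise MissingProviderKeys(f"{model} requires: {', '.join(missing)}")
```

Calling `check_provider_keys(os.environ["GMS_LLM_MODEL"])` at startup would fail fast with the missing variable names instead of failing on the first LLM call.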
Public API
Import from the top-level package (see __all__):
from knowlytix.knowledge import (
    ConvertConfig, DocGMSConfig, DocGMSSettings,
    GMSExpertStore, IngestResult, QueryEngine, QueryResult,
    ingest_document,
)
Anything outside __all__ is internal and may change without notice. The
knowlytix.knowledge.mcp_server, knowlytix.knowledge.web_agent, and
knowlytix.knowledge.cli modules live in the source repo but are not
shipped in the wheel.
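Downstream code that wants to avoid depending on internals can check the names it uses against a module's `__all__`. The sketch below is a generic helper, not part of this package; `is_public` and `internal_names` are hypothetical names, and the standard-library `json` module stands in for `knowlytix.knowledge` so the example runs without the package installed.

```python
import json  # stand-in module; substitute knowlytix.knowledge in real use
from types import ModuleType


def is_public(module: ModuleType, name: str) -> bool:
    """True if name is listed in the module's __all__ (False if none is defined)."""
    return name in getattr(module, "__all__", ())


def internal_names(module: ModuleType, used: list[str]) -> list[str]:
    """Return the subset of used names that are not part of __all__."""
    return [name for name in used if not is_public(module, name)]
```

A test suite could assert that `internal_names(module, names_you_import)` is empty, so an upgrade that drops a name from the public surface is caught early.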
Related packages
| Package | Role |
|---|---|
| knowlytix-core | Geometric memory engine (required runtime dep) |
| knowlytix-benchmark | Benchmark harness for structured retrieval |
| knowlytix-harness | DOE-driven black-box testing + runtime governance |
Links
- Source: knowlytix/gms
- Book: Geometric Memory Systems (forthcoming)
- Paper: DocGMS: Geometric Expert Systems with LLM-Augmented Learning