Deterministic Knowledge Graph & Vector Engine with Bit-Exact Audit Trails

Valoricore 🛡️

The Official Python SDK for Valori-Kernel

Deterministic Vector Memory · Cryptographic Audit Trails · Hybrid Knowledge Graphs

valoricore is the official Python SDK for Valori-Kernel — a no_std Rust engine that unifies Vector Memory and Knowledge Graphs into a single, cryptographically auditable memory space.

Every insert, search, and graph edge is backed by fixed-point Q16.16 arithmetic, producing bit-identical results across x86, ARM, and RISC-V. The global state is always summarised in a single BLAKE3 Merkle root you can store, compare, and prove.
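
To illustrate why fixed-point arithmetic gives reproducible results, here is a minimal pure-Python sketch of Q16.16 quantization. It is not the kernel's code: the helper names (`to_q16_16`, `l2_squared_q`) and the round-to-nearest conversion are assumptions made for illustration only.

```python
from typing import List

SCALE = 1 << 16                                  # Q16.16 keeps 16 fractional bits
I32_MIN, I32_MAX = -(1 << 31), (1 << 31) - 1     # signed 32-bit storage range

def to_q16_16(x: float) -> int:
    """Quantize a float to a Q16.16 integer (round-to-nearest is assumed here)."""
    q = round(x * SCALE)
    if not I32_MIN <= q <= I32_MAX:
        raise ValueError(f"{x} is outside the Q16.16 range")
    return q

def l2_squared_q(a: List[int], b: List[int]) -> int:
    """Squared L2 distance computed purely on integers."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

# Float addition depends on evaluation order...
float_drift = (0.1 + 0.2) + 0.3 != 0.1 + (0.2 + 0.3)

# ...integer addition does not, so any summation order yields the same bits.
vec_a = [to_q16_16(v) for v in (0.1, 0.2, 0.3)]
vec_b = [to_q16_16(v) for v in (0.3, 0.1, 0.2)]
forward = l2_squared_q(vec_a, vec_b)
backward = l2_squared_q(vec_a[::-1], vec_b[::-1])
```

Once values are quantized, every later step is exact integer arithmetic, so results no longer depend on how a given CPU or compiler orders floating-point operations.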

✨ What Makes Valoricore Different?

Feature	Valoricore	Chroma / FAISS / Pinecone
Results across hardware	✅ Bit-identical (Q16.16 fixed-point)	❌ Float drift
Cryptographic state proof	✅ BLAKE3 Merkle root per operation	❌ None
Hybrid Vector + Graph	✅ Native, same memory space	⚠️ Graph is separate system
Offline proof verification	✅ No DB connection required	❌ N/A
Snapshot / replay	✅ Byte-exact restore	⚠️ Partial / format-specific
`no_std` embeddable core	✅ Zero heap allocation in kernel	❌ Heap-heavy
Air-gapped deployment	✅ Local FFI, no cloud required	⚠️ Varies

📦 Installation

Valoricore ships with pre-compiled Rust binaries for Linux (x86-64, arm64), macOS (x86-64, Apple Silicon), and Windows. A Rust compiler is only required when building from source.

Core (vector DB + knowledge graph)

pip install valoricore

With local / offline embeddings (no API key needed)

pip install "valoricore[local]"
# Uses sentence-transformers + PyTorch

With cloud embedding providers

pip install "valoricore[openai]"    # OpenAI text-embedding-3-*
pip install "valoricore[cohere]"    # Cohere embed-english-v3.0

Full installation (all providers + LangChain + LlamaIndex)

pip install "valoricore[all]"

Optional integrations

pip install "valoricore[langchain]"    # LangChain VectorStore + Retriever
pip install "valoricore[llamaindex]"   # LlamaIndex VectorStore
pip install "valoricore[pdf]"          # PDF document ingestion (pypdf)

🚀 Quick Start

1 · Embedded Local Engine (no server required)

from valoricore import MemoryClient
from valoricore.embeddings import SentenceTransformerEmbedder

# ① Load a local model (downloads once, runs fully offline after that)
embedder = SentenceTransformerEmbedder("all-MiniLM-L6-v2")   # dim=384

# ② Initialize the embedded Rust engine
client = MemoryClient(path="./my_valori_db")

# ③ Add a document — automatically chunks, embeds, and links in the Knowledge Graph
result = client.add_document(
    text  = "Valoricore is a deterministic, no_std Rust kernel "
            "that unifies vector memory and knowledge graphs.",
    embed = embedder,
    title = "Introduction",
)
print(f"Document Node ID : {result['document_node_id']}")
print(f"Chunk count      : {result['chunk_count']}")
print(f"Proof hashes     : {result['proof_hashes']}")

# ④ Semantic search
hits = client.semantic_search("What does Valoricore unify?", embed=embedder, k=3)
for h in hits:
    print(f"  id={h['id']}  l2_score={h['score']}")

# ⑤ Cryptographic state proof
print(f"\nDatabase state : {client.get_state_hash()}")

2 · Remote / Cluster Mode

Connect to a standalone valori-node HTTP server and use the exact same API:

from valoricore import MemoryClient
from valoricore.embeddings import OpenAIEmbedder

embedder = OpenAIEmbedder()          # reads OPENAI_API_KEY from env

# Simply pass a remote URL — everything else is identical
client = MemoryClient(remote="http://my-valori-node:3000")

result = client.add_document(
    text  = "Remote deployment with full audit trail.",
    embed = embedder,
)
print(result["document_node_id"])

# Snapshot the remote node state to local bytes
snap = client.snapshot()
with open("backup.snap", "wb") as f:
    f.write(snap)

3 · Async API (FastAPI / asyncio)

import asyncio
from valoricore import AsyncMemoryClient
from valoricore.embeddings import SentenceTransformerEmbedder

embedder = SentenceTransformerEmbedder("all-MiniLM-L6-v2")

async def main():
    # Async context manager – auto-closes on exit
    async with AsyncMemoryClient(path="./async_db") as client:

        result = await client.add_document(
            text  = "Non-blocking deterministic vector storage.",
            embed = embedder,
        )
        print(f"node_id={result['document_node_id']}")

        hits = await client.semantic_search(
            "Non-blocking search", embed=embedder, k=5
        )
        print(f"Found {len(hits)} results")

        # Snapshot + audit from async context
        snap  = await client.snapshot()
        state = await client.get_state_hash()
        print(f"State: {state}")

asyncio.run(main())

🔌 Embedding Providers

The valoricore.embeddings module provides adapters for the embedding providers listed below. Every adapter implements __call__, so an adapter instance can be passed directly wherever an EmbedFn is accepted.
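
The callable convention described above can be sketched without the SDK. The example assumes an EmbedFn is a function from a string to a list of floats; the real alias may differ. `ToyHashEmbedder` and `embed_all` are hypothetical names and are not part of valoricore.

```python
import hashlib
from typing import Callable, List

# Assumed shape of an embedding callable; the SDK's actual EmbedFn alias may differ.
EmbedFn = Callable[[str], List[float]]

class ToyHashEmbedder:
    """Hypothetical embedder: maps text to a fixed-length vector via SHA-256."""

    def __init__(self, dim: int = 8) -> None:
        self.dim = dim

    def __call__(self, text: str) -> List[float]:
        digest = hashlib.sha256(text.encode("utf-8")).digest()
        # Scale each byte from [0, 255] to [-1.0, 1.0]; reuse digest bytes if dim > 32.
        return [digest[i % len(digest)] / 127.5 - 1.0 for i in range(self.dim)]

def embed_all(texts: List[str], embed: EmbedFn) -> List[List[float]]:
    """Any callable with the assumed shape can be passed where `embed` is expected."""
    return [embed(t) for t in texts]

vectors = embed_all(["hello", "world"], ToyHashEmbedder(dim=8))
```

Because the object is just a callable, a plain function or a lambda with the same shape would work equally well in this sketch.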

Provider Overview

Provider	Class	Offline?	Extra install
SentenceTransformers	`SentenceTransformerEmbedder`	✅ Yes	`pip install "valoricore[local]"`
OpenAI	`OpenAIEmbedder`	❌ Cloud	`pip install "valoricore[openai]"`
Cohere	`CohereEmbedder`	❌ Cloud	`pip install "valoricore[cohere]"`
HuggingFace Inference	`HuggingFaceEmbedder`	❌ Cloud	(requests, built-in)
Ollama	`OllamaEmbedder`	✅ Local server	`ollama pull nomic-embed-text`
Dummy / CI	`DummyEmbedder`	✅ Yes	(built-in)
Hash / CI	`HashEmbedder`	✅ Yes	(built-in)

Local / Offline Production (Recommended for Air-Gapped Environments)

from valoricore.embeddings import SentenceTransformerEmbedder, CachedEmbedder

# High-quality model, fully offline after first download
raw_embedder = SentenceTransformerEmbedder(
    model_name = "BAAI/bge-small-en-v1.5",   # dim=384
    device     = "cpu",                        # or "cuda", "mps"
    normalize  = True,                         # cosine similarity friendly
)

# Optional: wrap with LRU cache to avoid re-embedding identical texts
embedder = CachedEmbedder(raw_embedder, max_size=5000)
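
For readers unfamiliar with LRU caching, the following sketch shows the general idea behind wrapping an embedder with a bounded cache. It is not the implementation of CachedEmbedder; `ToyLRUEmbedder` and its `misses` counter are hypothetical and exist only to make the eviction behavior visible.

```python
from collections import OrderedDict
from typing import Callable, List

class ToyLRUEmbedder:
    """Hypothetical LRU wrapper around any text -> vector callable."""

    def __init__(self, embed: Callable[[str], List[float]], max_size: int = 2) -> None:
        self._embed = embed
        self._max_size = max_size
        self._cache: "OrderedDict[str, List[float]]" = OrderedDict()
        self.misses = 0

    def __call__(self, text: str) -> List[float]:
        if text in self._cache:
            self._cache.move_to_end(text)      # mark as most recently used
            return self._cache[text]
        self.misses += 1
        vector = self._embed(text)
        self._cache[text] = vector
        if len(self._cache) > self._max_size:
            self._cache.popitem(last=False)    # evict the least recently used entry
        return vector

cached = ToyLRUEmbedder(lambda t: [float(len(t))], max_size=2)
cached("a")
cached("a")       # served from the cache, no new miss
cached("bb")
cached("ccc")     # a third distinct text evicts "a"
cached("a")       # "a" has to be embedded again
```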

OpenAI (Cloud)

import os
from valoricore.embeddings import OpenAIEmbedder

embedder = OpenAIEmbedder(
    api_key    = os.environ["OPENAI_API_KEY"],   # or pass directly
    model      = "text-embedding-3-small",        # dim=1536
    dimensions = 384,                            # optional truncation (3-* models only)
)

Ollama (Local Server — Zero Cloud Dependency)

# One-time setup (shell commands; Homebrew shown, so macOS is assumed)
brew install ollama
ollama serve &                      # keep the server running in the background
ollama pull nomic-embed-text        # dim=768

from valoricore.embeddings import OllamaEmbedder

embedder = OllamaEmbedder(
    model    = "nomic-embed-text",
    base_url = "http://localhost:11434",
)

Convenience Factory

from valoricore.embeddings import get_embedder

# Swap providers with one line change
embedder = get_embedder("local",       model_name="all-MiniLM-L6-v2")
embedder = get_embedder("openai",      api_key="sk-...")
embedder = get_embedder("ollama",      model="nomic-embed-text")
embedder = get_embedder("cohere",      api_key="...")
embedder = get_embedder("huggingface", api_key="hf_...", model="sentence-transformers/all-MiniLM-L6-v2")
embedder = get_embedder("dummy",       dim=384)   # CI / tests

Async Embedder (for asyncio pipelines)

from valoricore.embeddings import SentenceTransformerEmbedder, AsyncEmbedder

sync_embedder  = SentenceTransformerEmbedder("all-MiniLM-L6-v2")
async_embedder = AsyncEmbedder(sync_embedder)   # runs in thread-pool

async def pipeline():
    vec  = await async_embedder.embed("Hello")
    vecs = await async_embedder.embed_batch(["Hello", "World"])
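
The thread-pool approach mentioned in the comment above can be sketched with the standard library alone. `ToyAsyncEmbedder` and `slow_embed` are hypothetical stand-ins; how AsyncEmbedder is actually implemented is not stated here, so treat this as one plausible shape rather than the SDK's code.

```python
import asyncio
import time
from typing import Callable, List

def slow_embed(text: str) -> List[float]:
    """Stand-in for a blocking model call."""
    time.sleep(0.05)
    return [float(len(text))]

class ToyAsyncEmbedder:
    """Hypothetical async wrapper that offloads a blocking embedder to a worker thread."""

    def __init__(self, embed: Callable[[str], List[float]]) -> None:
        self._embed = embed

    async def embed(self, text: str) -> List[float]:
        # asyncio.to_thread (Python 3.9+) keeps the event loop free while the call blocks.
        return await asyncio.to_thread(self._embed, text)

    async def embed_batch(self, texts: List[str]) -> List[List[float]]:
        # gather returns results in the same order as its inputs.
        return await asyncio.gather(*(self.embed(t) for t in texts))

async def demo() -> List[List[float]]:
    return await ToyAsyncEmbedder(slow_embed).embed_batch(["Hello", "World!"])

result = asyncio.run(demo())
```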

🧠 Core Concepts

Records

A Record is a dense fixed-point vector stored in the kernel's RecordPool. Every insert returns an integer record_id and a BLAKE3 Merkle proof.
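
As a rough illustration of how many records can be summarised by one root hash, here is a small Merkle-root sketch. It uses BLAKE2b from the Python standard library as a stand-in because BLAKE3 is not in the standard library. The leaf encoding, the domain-separation prefixes, and the padding rule are all assumptions, not the kernel's actual tree layout.

```python
import hashlib
from typing import List

def _hash(data: bytes) -> bytes:
    # Stand-in hash: BLAKE2b from the standard library, NOT BLAKE3.
    return hashlib.blake2b(data, digest_size=32).digest()

def merkle_root(leaves: List[bytes]) -> bytes:
    """Fold leaf payloads into one 32-byte root (duplicate-last-node padding assumed)."""
    if not leaves:
        return _hash(b"")
    # Distinct prefixes keep leaf hashes and inner-node hashes from colliding.
    level = [_hash(b"\x00" + leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        level = [_hash(b"\x01" + level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]

records = [b"record-0", b"record-1", b"record-2"]
root_before = merkle_root(records)
root_after = merkle_root(records + [b"record-3"])
```

Any change to any leaf, or to the order of the leaves, propagates up to a different root, which is what makes a single stored root useful for later comparison.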

Nodes & Edges (Knowledge Graph)

A Node is a named entity that optionally points to a Record. An Edge is a directed relationship between two Nodes. The graph is stored in the same memory space as the vector pool — no separate database.

Node Kinds (built-in constants)

from valoricore import (
    NODE_RECORD,    # 0 – raw vector record
    NODE_CONCEPT,   # 1 – abstract concept
    NODE_AGENT,     # 2 – AI agent / process
    NODE_USER,      # 3 – human user
    NODE_TOOL,      # 4 – tool or function
    NODE_DOCUMENT,  # 5 – top-level document
    NODE_CHUNK,     # 6 – text chunk (child of document)
)

Edge Kinds (built-in constants)

from valoricore import (
    EDGE_RELATION,    # 0 – generic relation
    EDGE_FOLLOWS,     # 1 – sequential ordering
    EDGE_IN_EPISODE,  # 2 – membership in episode
    EDGE_BY_AGENT,    # 3 – created/sent by agent
    EDGE_MENTIONS,    # 4 – entity mention
    EDGE_REFERS_TO,   # 5 – cross-reference
    EDGE_PARENT_OF,   # 6 – hierarchical parent→child
)

📖 Step-by-Step Usage Guide

Step 1 — Install & Verify

pip install "valoricore[local]"
python -c "import valoricore; print(valoricore.__version__)"

Step 2 — Choose Your Embedding Provider

from valoricore.embeddings import get_embedder

# Local (no API key, no internet after first download)
embedder = get_embedder("local", model_name="all-MiniLM-L6-v2")

# OpenAI
# embedder = get_embedder("openai")   # reads OPENAI_API_KEY env var

# CI / testing (zero-cost, deterministic)
# embedder = get_embedder("dummy", dim=384)

Step 3 — Initialize the Client

from valoricore import MemoryClient

# Local embedded engine (no server needed)
client = MemoryClient(path="./my_db")

# OR connect to a remote cluster
# client = MemoryClient(remote="http://my-node:3000")

Step 4 — Ingest Documents

# From a string (here read from a text file)
from pathlib import Path

result = client.add_document(
    text       = Path("my_paper.txt").read_text(encoding="utf-8"),
    embed      = embedder,
    title      = "My Paper",
    chunk_size = 512,         # chars per chunk
)

# From a PDF file (requires: pip install "valoricore[pdf]")
from valoricore import load_text_from_file
text   = load_text_from_file("report.pdf")
result = client.add_document(text=text, embed=embedder)

# Insert a raw pre-computed vector
result = client.upsert_vector(vector=[0.1, 0.2, ...])  # len must match kernel dim
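
To make the chunk_size parameter concrete, this sketch splits text into fixed-size character chunks. It assumes consecutive, non-overlapping chunks; the SDK's chunker may behave differently (for example by overlapping chunks or respecting sentence boundaries), and `chunk_by_chars` is a hypothetical helper.

```python
from typing import List

def chunk_by_chars(text: str, chunk_size: int = 512) -> List[str]:
    """Split text into consecutive, non-overlapping chunks of at most chunk_size characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

sample = "abcdefghij" * 120            # 1,200 characters
chunks = chunk_by_chars(sample, chunk_size=512)
chunk_lengths = [len(c) for c in chunks]
```

The same input always produces the same chunks in the same order, which matters when each chunk later receives its own proof hash.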

Step 5 — Semantic Search

hits = client.semantic_search(
    query = "What is deterministic AI memory?",
    embed = embedder,
    k     = 10,
)

for hit in hits:
    print(f"Record ID : {hit['id']}")
    print(f"L2 Score  : {hit['score']}")   # lower = closer (L2 squared)

Step 6 — Knowledge Graph Operations

from valoricore import NODE_AGENT, NODE_DOCUMENT, EDGE_BY_AGENT

# Manual graph construction
# Note: _db is a private attribute of MemoryClient and may change between releases
record_id  = client._db.insert([0.5] * 384)
agent_node = client.create_node(kind=NODE_AGENT)
doc_node   = client.create_node(kind=NODE_DOCUMENT, record_id=record_id)

# Link agent → document
client.create_edge(from_id=agent_node, to_id=doc_node, kind=EDGE_BY_AGENT)

# Inspect
print(client.get_node(doc_node))       # {"kind": 5, "record_id": 0}
print(client.get_edges(agent_node))    # [{"edge_id": 0, "to_node": 1, "kind": 3}]

# Traversal: BFS up to depth 2
visited_nodes = client.walk(agent_node, max_depth=2)

# Collect all record_ids reachable from a starting node
record_ids = client.expand(agent_node, max_depth=2)
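
The depth-limited traversal used by walk and expand can be pictured with a toy adjacency list. The return shapes of the real client methods are not documented here, so this sketch simply returns plain lists; the `edges` and `records` dictionaries are made-up sample data.

```python
from collections import deque
from typing import Dict, List, Optional

# Toy graph: node_id -> outgoing neighbour ids, and node_id -> optional record_id.
edges: Dict[int, List[int]] = {0: [1, 2], 1: [3], 2: [3], 3: [4], 4: []}
records: Dict[int, Optional[int]] = {0: None, 1: 10, 2: 11, 3: 12, 4: 13}

def walk(start: int, max_depth: int) -> List[int]:
    """Breadth-first traversal that stops expanding at max_depth hops."""
    seen = {start}
    order = [start]
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue
        for nxt in edges[node]:          # list order keeps the visit order deterministic
            if nxt not in seen:
                seen.add(nxt)
                order.append(nxt)
                queue.append((nxt, depth + 1))
    return order

def expand(start: int, max_depth: int) -> List[int]:
    """Record ids attached to the nodes reachable within max_depth hops."""
    return [records[n] for n in walk(start, max_depth) if records[n] is not None]

visited = walk(0, max_depth=2)
reachable_records = expand(0, max_depth=2)
```

Node 4 sits three hops from node 0, so it is left out at max_depth=2.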

Step 7 — Lifecycle (Delete, Soft Delete)

# Permanently remove record from pool and search index
client.delete(record_id=0)

# Soft delete: deactivates record but preserves pool slot
client.soft_delete(record_id=1)

# Count active records
n = client.record_count()
print(f"Active records: {n}")

Step 8 — Snapshot, Restore, and Audit

# Snapshot the full kernel state to bytes
snap = client.snapshot()
with open("state.snap", "wb") as f:
    f.write(snap)

# Restore to a fresh engine (bit-exact)
fresh = MemoryClient(path="./restored_db")
fresh.restore(snap)

# The state hashes must be identical
assert fresh.get_state_hash() == client.get_state_hash()
print("✅ Bit-exact restore verified")

# View full event timeline
for event in client.get_timeline():
    print(event)
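
The reason a byte-exact restore leads to an identical state hash can be shown with a toy serializer. The byte layout below (little-endian int32 values with explicit lengths) and the BLAKE2b digest are assumptions chosen for illustration; the real snapshot format and the BLAKE3 hashing are internal to the kernel.

```python
import hashlib
import struct
from typing import List

def snapshot(vectors: List[List[int]]) -> bytes:
    """Serialize fixed-point vectors as little-endian int32 with explicit lengths."""
    out = [struct.pack("<I", len(vectors))]
    for vec in vectors:
        out.append(struct.pack("<I", len(vec)))
        out.append(struct.pack(f"<{len(vec)}i", *vec))
    return b"".join(out)

def restore(blob: bytes) -> List[List[int]]:
    """Inverse of snapshot()."""
    (count,) = struct.unpack_from("<I", blob, 0)
    offset = 4
    vectors = []
    for _ in range(count):
        (dim,) = struct.unpack_from("<I", blob, offset)
        offset += 4
        vectors.append(list(struct.unpack_from(f"<{dim}i", blob, offset)))
        offset += 4 * dim
    return vectors

def state_hash(vectors: List[List[int]]) -> str:
    # Stand-in digest (BLAKE2b, not BLAKE3) over the canonical byte layout.
    return hashlib.blake2b(snapshot(vectors), digest_size=32).hexdigest()

original = [[6554, 13107, -32768], [65536, 0, 1]]
blob = snapshot(original)
restored = restore(blob)
```

Fixing the byte order explicitly is what keeps the serialized bytes, and therefore the hash, the same on every platform.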

Step 9 — Cryptographic Proof Verification (Offline)

from valoricore import ingest_embedding, generate_proof, verify_embedding

my_vector = [0.1] * 384

# Generate a standalone proof for this vector (no DB connection required)
fixed_values = ingest_embedding(my_vector)   # float → Q16.16
proof_hex    = generate_proof(fixed_values)  # BLAKE3 Merkle node

# Verify offline — proves the vector has not been tampered with
is_valid = verify_embedding(floats=my_vector, claimed_hash=proof_hex)
print(f"Proof valid: {is_valid}")
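
Conceptually, offline verification means recomputing the hash from the claimed input and comparing it with the stored value. The sketch below mimics that flow with hypothetical helpers (`quantize`, `toy_proof`, `toy_verify`) and BLAKE2b as a stand-in digest; it does not reproduce the hashes produced by generate_proof.

```python
import hashlib
import hmac
import struct
from typing import List

def quantize(floats: List[float]) -> List[int]:
    # Q16.16 with round-to-nearest (the rounding mode is an assumption).
    return [round(x * 65536) for x in floats]

def toy_proof(fixed: List[int]) -> str:
    # Stand-in digest: BLAKE2b over little-endian int32 values, not the kernel's BLAKE3 node.
    payload = struct.pack(f"<{len(fixed)}i", *fixed)
    return hashlib.blake2b(payload, digest_size=32).hexdigest()

def toy_verify(floats: List[float], claimed_hash: str) -> bool:
    # Recompute from scratch and compare in constant time.
    return hmac.compare_digest(toy_proof(quantize(floats)), claimed_hash)

vector = [0.1] * 4
proof = toy_proof(quantize(vector))

ok = toy_verify(vector, proof)
tampered = toy_verify([0.2] + vector[1:], proof)
# A change below the Q16.16 resolution (1/65536) quantizes to the same integers.
sub_resolution = toy_verify([0.1 + 1e-9] * 4, proof)
```

In this sketch the hash covers the quantized values, so float inputs that differ by less than the fixed-point resolution can verify against the same proof.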

🔗 Framework Integrations

LangChain

pip install "valoricore[langchain]"

from langchain_openai import OpenAIEmbeddings
from valoricore.adapters import ValoricoreAdapter, LangChainVectorStore

adapter     = ValoricoreAdapter(base_url="http://localhost:3000")
embeddings  = OpenAIEmbeddings()

vectorstore = LangChainVectorStore(adapter=adapter, embedding=embeddings)

# Add documents
vectorstore.add_texts(
    texts     = ["Valoricore is deterministic.", "Fixed-point arithmetic rocks."],
    metadatas = [{"source": "intro"}, {"source": "math"}],
)

# Search
docs = vectorstore.similarity_search("What is deterministic AI?", k=3)
for doc in docs:
    print(doc.page_content)

# With scores
docs_scores = vectorstore.similarity_search_with_score("deterministic", k=3)
for doc, score in docs_scores:
    print(f"{doc.page_content[:60]}…  score={score:.4f}")

LangChain Retriever:

from langchain_openai import OpenAIEmbeddings
from valoricore.adapters import ValoricoreAdapter, LangChainRetriever

embeddings = OpenAIEmbeddings()      # reads OPENAI_API_KEY from env
adapter    = ValoricoreAdapter(base_url="http://localhost:3000")
retriever  = LangChainRetriever(
    adapter  = adapter,
    embed_fn = embeddings.embed_query,   # str -> list[float]
    k        = 5,
)

docs = retriever.get_relevant_documents("deterministic vector search")

LlamaIndex

pip install "valoricore[llamaindex]"

from llama_index.core import Document, VectorStoreIndex, StorageContext
from valoricore.adapters import ValoricoreAdapter, LlamaIndexVectorStore

# Placeholder input; replace with your own loaded documents
documents    = [Document(text="Valoricore is a deterministic vector and graph engine.")]

adapter      = ValoricoreAdapter(base_url="http://localhost:3000")
vector_store = LlamaIndexVectorStore(adapter=adapter)

storage_ctx  = StorageContext.from_defaults(vector_store=vector_store)
index        = VectorStoreIndex.from_documents(documents, storage_context=storage_ctx)

query_engine = index.as_query_engine()
response     = query_engine.query("What is Valoricore?")
print(response)

🔐 Error Handling

from valoricore import (
    MemoryClient,
    ValoricoreError,   # base – catch all SDK errors
    ValidationError,   # bad vector dimension / FXP out-of-range
    ConnectionError,   # remote node unreachable (shadows Python's built-in ConnectionError in this module)
    IntegrityError,    # BLAKE3 proof mismatch
    NotFoundError,     # record / node / edge doesn't exist
    KernelError,       # unrecoverable Rust kernel error
)

client = MemoryClient(path="./db")

try:
    client.delete(record_id=9999)
except NotFoundError:
    print("Record does not exist — safe to ignore")

try:
    client.upsert_vector([0.1] * 128)   # wrong dimension
except ValidationError as e:
    print(f"Bad embedding: {e}")

try:
    remote = MemoryClient(remote="http://offline-node:3000")
    remote.snapshot()
except ConnectionError as e:
    print(f"Node unreachable: {e}")

📊 Performance Characteristics

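Valoricore enforces deterministic L2 brute-force scanning to guarantee auditability. Because every query scans all records, search latency grows roughly linearly with the record count, as the 10k and 100k rows below show.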

Operation	Local FFI	Remote HTTP
Single insert	~20 µs	~0.5 ms
Batch insert (1k vectors)	~15 ms	~50 ms
L2 search (10k×384)	~8 ms	~10 ms
L2 search (100k×384)	~80 ms	~90 ms
Graph BFS (depth 2, 50 nodes)	~0.5 ms	~2 ms
State hash (BLAKE3)	<1 µs	~1 ms
Snapshot (10k records)	~5 ms	~20 ms

Benchmarked on Apple M2. The local FFI path calls Rust directly with zero serialization overhead.

Note: Valoricore uses Q16.16 fixed-point arithmetic. Safe input range for embedding values is [-32767.0, 32767.0]. Unit-normalized embeddings (OpenAI's by default, SentenceTransformers with normalization enabled) have every component in [-1.0, 1.0] and are therefore safe.
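
The exhaustive L2 scan described in this section can be sketched in a few lines. The example works on small integer vectors and breaks ties on the lower record id; that tie-breaking rule is an assumption, as the kernel's ordering for equal scores is not stated here.

```python
from typing import List, Tuple

def l2_squared(a: List[int], b: List[int]) -> int:
    """Squared L2 distance on integer (fixed-point) components."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def brute_force_search(query: List[int], pool: List[List[int]], k: int) -> List[Tuple[int, int]]:
    """Return the k nearest (record_id, score) pairs; ties break on the lower record_id."""
    scored = [(l2_squared(query, vec), record_id) for record_id, vec in enumerate(pool)]
    scored.sort()                        # sorts by score first, then by record_id
    return [(record_id, score) for score, record_id in scored[:k]]

pool = [[0, 0], [3, 4], [1, 0], [0, 1]]     # toy integer vectors
hits = brute_force_search([0, 0], pool, k=3)
```

Every record is scored on every query, so there is no index state that could make two runs disagree; the cost is one full pass over the pool per search.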

⚙️ Configuration Reference

`MemoryClient` / `AsyncMemoryClient`

Parameter	Type	Default	Description
`path`	`str`	`"./valori_db"`	Local database directory
`remote`	`str | None`	`None`	Remote node URL. When set, `path` is ignored
`index_kind`	`str`	`"bruteforce"`	Future: `"hnsw"` / `"ivf"`
`quantization`	`str`	`"none"`	Future: `"int8"` / `"binary"`

`Valoricore` / `AsyncValoricore` (factories)

from valoricore import Valoricore, AsyncValoricore

db       = Valoricore(path="./db")                        # local
db       = Valoricore(remote="http://node:3000")          # remote

async_db = AsyncValoricore(path="./db")                   # local async
async_db = AsyncValoricore(remote="http://node:3000")     # remote async

🛠 Forensic CLI

The valori CLI lets you inspect the append-only event log and reproduce the exact state of any historical snapshot.

# Install CLI (included with the package)
pip install valoricore

# Deep forensic inspection
valori inspect --dir ./my_valori_db --snapshot-path ./my_valori_db/state.snap

# View chronological event timeline
valori timeline ./my_valori_db/events.log

# Verify a snapshot's state hash
valori verify --snapshot ./my_valori_db/state.snap --expected-hash <64-char-hex>
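
The idea of reproducing a historical state from an append-only log can be sketched as a replay loop. The event schema, the `replay` helper, and the JSON-plus-BLAKE2b hashing below are hypothetical; the on-disk format of events.log and the internals of the valori CLI are not described here.

```python
import hashlib
import json
from typing import Dict, List

Event = Dict[str, object]

def apply(state: Dict[int, List[int]], event: Event) -> None:
    """Apply one event to the in-memory state (toy event schema, not the SDK's)."""
    if event["op"] == "insert":
        state[event["id"]] = event["vector"]
    elif event["op"] == "delete":
        state.pop(event["id"], None)

def replay(log: List[Event], upto: int) -> Dict[int, List[int]]:
    """Rebuild the state as it was after the first `upto` events."""
    state: Dict[int, List[int]] = {}
    for event in log[:upto]:
        apply(state, event)
    return state

def state_hash(state: Dict[int, List[int]]) -> str:
    # Canonical JSON (sorted keys) hashed with BLAKE2b as a stand-in for BLAKE3.
    canonical = json.dumps(state, sort_keys=True, separators=(",", ":")).encode()
    return hashlib.blake2b(canonical, digest_size=32).hexdigest()

log = [
    {"op": "insert", "id": 0, "vector": [6554, 13107]},
    {"op": "insert", "id": 1, "vector": [65536, 0]},
    {"op": "delete", "id": 0},
]
hash_after_two = state_hash(replay(log, 2))
hash_after_all = state_hash(replay(log, 3))
```

Replaying the same prefix of the log always yields the same hash, so a hash recorded at any point in the past can be checked again later from the log alone.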

🗂 Project Structure

valoricore/
├── __init__.py                  # Public API surface
├── embeddings.py                # 🆕 Embedding provider adapters
├── factory.py                   # Valoricore() / AsyncValoricore() factories
├── local.py                     # LocalClient (FFI)
├── remote.py                    # SyncRemoteClient / AsyncRemoteClient
├── memory.py                    # MemoryClient (high-level)
├── async_memory.py              # AsyncMemoryClient (full async mirror)
├── protocol.py                  # ProtocolClient (unified local+remote)
├── adapter.py                   # ValoricoreAdapter (proof overlay)
├── chunking.py                  # Deterministic text chunkers
├── ingest.py                    # File loaders (.txt, .md, .pdf)
├── kinds.py                     # Node / Edge kind constants
├── types.py                     # Type aliases (Vector, Proof, etc.)
├── exceptions.py                # Exception hierarchy
├── utils.py                     # Internal helpers
└── adapters/                    # Framework adapters (optional)
    ├── base.py                  # ValoricoreAdapter (retry + validation)
    ├── langchain.py             # LangChain Retriever
    ├── langchain_vectorstore.py # LangChain VectorStore
    ├── llamaindex.py            # LlamaIndex VectorStore
    └── sentence_transformers_adapter.py

📚 Documentation

Resource	Description
Getting Started Guide	First 5 minutes walkthrough
API Reference	Complete method signatures and return types
Architecture	Rust kernel internals and design decisions

🤝 Contributing

# Clone and install for development
git clone https://github.com/varshith-Git/Valori-Kernel
cd Valori-Kernel/python
pip install -e ".[dev]"

# Build the Rust FFI extension
cd ..
maturin develop

# Run tests
pytest python/tests/ -v

📄 License

Licensed under the GNU Affero General Public License v3.0 (AGPL-3.0).
See LICENSE for details.

Built with ❤️ by the Valoricore team
Integrity-First AI Infrastructure

Release history

0.1.2 (this version), Jun 5, 2026

0.1.0, Jun 5, 2026

Download files


Source Distribution

valoricore-0.1.2.tar.gz (1.5 MB view details)

Uploaded Jun 5, 2026 Source

Built Distribution


valoricore-0.1.2-cp39-abi3-macosx_11_0_arm64.whl (529.7 kB view details)

Uploaded Jun 5, 2026 CPython 3.9+macOS 11.0+ ARM64

File details

Details for the file valoricore-0.1.2.tar.gz.

File metadata

Download URL: valoricore-0.1.2.tar.gz
Upload date: Jun 5, 2026
Size: 1.5 MB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: maturin/1.10.2

File hashes

Hashes for valoricore-0.1.2.tar.gz
Algorithm	Hash digest
SHA256	`bc6c5d466c33be04f131ef5ad32ba80fe421ad277e10df237744015003978144`
MD5	`f4c17269acf4b12071463d54fc902ddb`
BLAKE2b-256	`3e6799ee21f95e3b61a0f4e09839226bfec24a84c830772ccd91990d5100905e`


File details

Details for the file valoricore-0.1.2-cp39-abi3-macosx_11_0_arm64.whl.

File metadata

Download URL: valoricore-0.1.2-cp39-abi3-macosx_11_0_arm64.whl
Upload date: Jun 5, 2026
Size: 529.7 kB
Tags: CPython 3.9+, macOS 11.0+ ARM64
Uploaded using Trusted Publishing? No
Uploaded via: maturin/1.10.2

File hashes

Hashes for valoricore-0.1.2-cp39-abi3-macosx_11_0_arm64.whl
Algorithm	Hash digest
SHA256	`fd429051d814fca2d722afd757ee2bfd062008fe71dae2d3386ee39d6d0d30f3`
MD5	`3441d112ce314c3c03633b6ec6bc6de3`
BLAKE2b-256	`5fa78e2e9439a07f578759b0dad791f6225f9ffb52525d38ac712eb69f530e05`

