kore-mind

Persistent memory + emergent identity engine for any LLM.

One file = one mind. SQLite-based. Zero config. Zero external dependencies. Runtime-agnostic.

Part of kore-stack, the complete cognitive middleware for LLMs. Run pip install kore-stack for the full stack, or install kore-mind on its own.

Install

pip install kore-mind          # just the memory engine
pip install kore-stack         # full stack: mind + bridge + SC routing

Usage

from kore_mind import Mind

mind = Mind("agent.db")

# Register experiences
mind.experience("User works on complexity theory proofs")
mind.experience("User prefers direct, concise answers")

# Recall relevant memories
memories = mind.recall("proof techniques")

# Reflect: decay old memories, consolidate, update identity
identity = mind.reflect()
print(identity.summary)

# Forget: explicit pruning
mind.forget(threshold=0.1)

Core concepts

  • Memory has a lifecycle: salience decays over time. Unused memories fade. Accessed memories strengthen.
  • Identity is emergent: not configured, but computed from accumulated memories.
  • reflect() is the key operation: decay + consolidation + identity update.
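
The lifecycle described above can be sketched in a few lines. This is only an illustration under stated assumptions: the exponential half-life, the access boost, and the names ToyMemory, decayed_salience, touch, and forget are hypothetical and are not taken from kore-mind's implementation.

```python
from dataclasses import dataclass


@dataclass
class ToyMemory:
    """Hypothetical stand-in for a memory record; not kore-mind's Memory model."""
    text: str
    salience: float = 1.0
    last_access: float = 0.0  # seconds since an arbitrary epoch


def decayed_salience(mem: ToyMemory, now: float, half_life: float = 86_400.0) -> float:
    """Salience after exponential decay: it halves every `half_life` seconds."""
    elapsed = max(0.0, now - mem.last_access)
    return mem.salience * 0.5 ** (elapsed / half_life)


def touch(mem: ToyMemory, now: float, boost: float = 0.2) -> None:
    """Accessing a memory applies the pending decay, then strengthens it (capped at 1.0)."""
    mem.salience = min(1.0, decayed_salience(mem, now) + boost)
    mem.last_access = now


def forget(memories: list[ToyMemory], now: float, threshold: float = 0.1) -> list[ToyMemory]:
    """Keep only memories whose decayed salience is still at or above the threshold."""
    return [m for m in memories if decayed_salience(m, now) >= threshold]
```

With a one-day half-life, a memory that is never accessed drops below a 0.1 threshold after four days, while one that was touched in the meantime survives the same pruning pass.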

API

Method              Description
experience(text)    Something happened. Record it.
recall(query)       What's relevant now?
reflect(fn)         Consolidate. Decay. Evolve.
identity()          Who am I now?
forget(threshold)   Explicit pruning.
scoped(source)      Filtered view per user. Same DB.
traces()            Query operation traces.

Semantic Search (v0.3)

Built-in embedding providers — semantic recall works with one line:

from kore_mind import Mind, numpy_embed

# Zero-dependency option (numpy only, no external service)
mind = Mind("agent.db", embed_fn=numpy_embed())

mind.experience("I like coffee in the morning")
mind.experience("Python is a programming language")

# Finds the coffee memory even when searching for "hot drinks"
results = mind.recall("hot drinks")

Three providers available:

from kore_mind import Mind
from kore_mind.embeddings import numpy_embed, ollama_embed, openai_embed

# 1. numpy_embed — zero dependencies, deterministic, fast
mind = Mind("agent.db", embed_fn=numpy_embed())

# 2. ollama_embed — local Ollama server (falls back to numpy if unavailable)
mind = Mind("agent.db", embed_fn=ollama_embed())

# 3. openai_embed — cloud, max quality (requires API key)
mind = Mind("agent.db", embed_fn=openai_embed(api_key="sk-..."))
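
To make the role of embed_fn concrete, here is a minimal sketch of how an embedding function can drive ranking. It assumes that embed_fn maps a string to a fixed-length list of floats; the exact signature kore-mind expects is not stated here. toy_embed, cosine, and rank are hypothetical names, and the hashed bag-of-words embedder is far weaker than any of the three providers.

```python
import hashlib
import math


def toy_embed(text: str, dim: int = 64) -> list[float]:
    """Hypothetical deterministic embedder: hash each word into one of `dim` buckets."""
    vec = [0.0] * dim
    for word in text.lower().split():
        bucket = int(hashlib.sha256(word.encode("utf-8")).hexdigest(), 16) % dim
        vec[bucket] += 1.0
    return vec


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank(query: str, texts: list[str], embed_fn=toy_embed) -> list[str]:
    """Return texts sorted from most to least similar to the query."""
    q = embed_fn(query)
    return sorted(texts, key=lambda t: cosine(q, embed_fn(t)), reverse=True)
```

Swapping embed_fn for a stronger model changes the quality of the ranking but not the shape of the code, which is presumably why the provider is a single constructor argument.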

Embedding improvements (v0.4.0)

ollama_embed() now supports a persistent cache, float16 quantization, async calls, and streaming batches:

from kore_mind import ollama_embed, ollama_embed_async
from kore_mind.storage import Storage

# Persistent L2 cache (survives restarts)
store = Storage("embeddings.db")
embed = ollama_embed(model="nomic-embed-text", storage=store)

# Single embeddings — L1 (memory) + L2 (SQLite) cache
vec = embed("some text")

# Batch embedding — one HTTP call for all uncached texts
vectors = embed.batch(["text one", "text two", "text three"])

# Float16 quantization (half the memory: 768*2=1.5KB vs 768*4=3KB)
embed_q = ollama_embed(quantize=True)

# Streaming batch: yield results as each mini-batch completes.
# `texts` is a list of strings (200 in this example); `process` stands for your own handler.
for chunk in embed.stream_batch(texts, chunk_size=8):
    process(chunk)  # don't wait for all 200 texts

# Async variant (the `await` lines below must run inside an async function)
embed_async = ollama_embed_async(model="nomic-embed-text")
result = await embed_async("hello world")
results = await embed_async.batch(["a", "b", "c"])

# Async streaming with concurrency control (here `process` is a coroutine function)
async for chunk in embed.astream_batch(texts, chunk_size=8, concurrency=2):
    await process(chunk)
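
The streaming pattern itself is easy to reproduce. The generator below is a sketch of the idea, not kore-mind's code: embed_many is a hypothetical callable that embeds one mini-batch per call, and fake_embed_many stands in for the HTTP round trip.

```python
from typing import Callable, Iterable, Iterator


def stream_batch(
    texts: Iterable[str],
    embed_many: Callable[[list[str]], list[list[float]]],
    chunk_size: int = 8,
) -> Iterator[list[list[float]]]:
    """Yield one list of vectors per mini-batch instead of waiting for the full input."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    chunk: list[str] = []
    for text in texts:
        chunk.append(text)
        if len(chunk) == chunk_size:
            yield embed_many(chunk)
            chunk = []
    if chunk:  # flush the final, possibly shorter, mini-batch
        yield embed_many(chunk)


def fake_embed_many(batch: list[str]) -> list[list[float]]:
    """Hypothetical stand-in for one HTTP call: a 1-d 'vector' holding the text length."""
    return [[float(len(t))] for t in batch]
```

Because the function is a generator, the caller can start processing the first chunk while later chunks have not been requested yet.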

cosine_similarity() now validates dimensions (raises ValueError on mismatch) and auto-detects float32/float16 encoding.
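
As an illustration of what dimension validation and encoding detection can look like, the sketch below stores vectors as packed bytes and infers float16 versus float32 from the blob size. This is an assumption about the approach, not the library's code; encode, decode, and cosine_checked are hypothetical names.

```python
import math
import struct


def encode(vec: list[float], quantize: bool = False) -> bytes:
    """Pack a vector as float16 (2 bytes per value) or float32 (4 bytes per value)."""
    return struct.pack(f"<{len(vec)}{'e' if quantize else 'f'}", *vec)


def decode(blob: bytes, dim: int) -> list[float]:
    """Infer the encoding from the blob size relative to the expected dimension."""
    if len(blob) == dim * 2:
        return list(struct.unpack(f"<{dim}e", blob))
    if len(blob) == dim * 4:
        return list(struct.unpack(f"<{dim}f", blob))
    raise ValueError(f"blob of {len(blob)} bytes does not match dimension {dim}")


def cosine_checked(a: list[float], b: list[float]) -> float:
    """Cosine similarity that refuses to compare vectors of different lengths."""
    if len(a) != len(b):
        raise ValueError(f"dimension mismatch: {len(a)} vs {len(b)}")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0
```

A 768-dimensional vector packs to 1536 bytes as float16 and 3072 bytes as float32, which matches the 1.5 KB versus 3 KB figures quoted for quantize=True.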

Ollama optimizations (v0.3.1)

Version 0.3.1 added connection reuse, an LRU cache, batch embedding, and fallback warnings for the Ollama provider.
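
The combination of an LRU cache with batch embedding can be sketched as follows. This is a minimal illustration, assuming the backend accepts a list of texts and returns one vector per text; CachedEmbedder and its calls counter are hypothetical and do not describe kore-mind's internals.

```python
from collections import OrderedDict
from typing import Callable


class CachedEmbedder:
    """Hypothetical wrapper: an LRU cache in front of a batch embedding call."""

    def __init__(self, embed_many: Callable[[list[str]], list[list[float]]], max_size: int = 1024):
        self._embed_many = embed_many
        self._cache = OrderedDict()  # text -> vector, oldest entry first
        self._max_size = max_size
        self.calls = 0  # number of backend round trips, kept for illustration

    def batch(self, texts: list[str]) -> list[list[float]]:
        # Deduplicate while preserving order, then fetch only what is not cached.
        missing = [t for t in dict.fromkeys(texts) if t not in self._cache]
        if missing:
            self.calls += 1
            for text, vec in zip(missing, self._embed_many(missing)):
                self._cache[text] = vec
        for text in texts:
            self._cache.move_to_end(text)  # mark as most recently used
        result = [self._cache[t] for t in texts]
        while len(self._cache) > self._max_size:
            self._cache.popitem(last=False)  # evict the least recently used entry
        return result

    def __call__(self, text: str) -> list[float]:
        return self.batch([text])[0]
```

Only the uncached texts reach the backend, so a batch that is fully cached costs no round trip at all.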

v0.2 Features

Per-user filtering

Each user gets their own "mind" — same database, different context.

# Option 1: default source
mind = Mind("agent.db", default_source="carlos")
mind.experience("Likes Python")  # automatically tagged to carlos
mind.recall("Python")            # only carlos's memories

# Option 2: scoped view
alice = mind.scoped("alice")
alice.experience("Prefers Rust")
alice.recall()  # only alice's memories
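
One way to picture per-user filtering over a shared database is a source column plus a thin view object that fills it in. The sketch below assumes exactly that layout; the table name, ToyStore, and ScopedView are hypothetical and may not match kore-mind's schema.

```python
import sqlite3


class ToyStore:
    """Hypothetical single-table store; each row is tagged with a source (user)."""

    def __init__(self, path: str = ":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute("CREATE TABLE IF NOT EXISTS memories (source TEXT, text TEXT)")

    def add(self, source: str, text: str) -> None:
        with self.db:  # commit on success, roll back on error
            self.db.execute("INSERT INTO memories VALUES (?, ?)", (source, text))

    def texts_for(self, source: str) -> list[str]:
        rows = self.db.execute(
            "SELECT text FROM memories WHERE source = ? ORDER BY rowid", (source,)
        )
        return [text for (text,) in rows]


class ScopedView:
    """Binds one source to a shared store, so callers never pass it explicitly."""

    def __init__(self, store: ToyStore, source: str):
        self._store, self._source = store, source

    def experience(self, text: str) -> None:
        self._store.add(self._source, text)

    def recall(self) -> list[str]:
        return self._store.texts_for(self._source)
```

Both views write to the same table, yet each one only ever reads back rows carrying its own source.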

Observability

Full tracing of every operation. Zero overhead when disabled (default).

mind = Mind("agent.db", enable_traces=True)

mind.experience("Something happened")
mind.recall("what happened")

# Query traces
traces = mind.traces(operation="recall")
for t in traces:
    print(f"{t.operation} took {t.duration_ms:.1f}ms")

# Filter by source
traces = mind.traces(source="carlos", limit=50)
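
The opt-in tracing pattern can be sketched with a context manager that records nothing when disabled. ToyTrace and Tracer are hypothetical names, and the fields only loosely follow the trace attributes used above.

```python
import time
from contextlib import contextmanager
from dataclasses import dataclass


@dataclass
class ToyTrace:
    """Hypothetical trace record; kore-mind's Trace model may differ."""
    operation: str
    duration_ms: float
    source: str = ""


class Tracer:
    def __init__(self, enabled: bool = False):
        self.enabled = enabled
        self.records: list[ToyTrace] = []

    @contextmanager
    def span(self, operation: str, source: str = ""):
        if not self.enabled:  # disabled: no timing, nothing stored
            yield
            return
        start = time.perf_counter()
        try:
            yield
        finally:
            elapsed_ms = (time.perf_counter() - start) * 1000.0
            self.records.append(ToyTrace(operation, elapsed_ms, source))

    def traces(self, operation=None, source=None, limit=100) -> list[ToyTrace]:
        """Return up to `limit` records matching the optional filters, oldest first."""
        hits = [
            t for t in self.records
            if (operation is None or t.operation == operation)
            and (source is None or t.source == source)
        ]
        return hits[:limit]
```

When the tracer is disabled, the span does no timing and stores nothing, so the only cost left is entering the context manager.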

Smart Cache (storage layer)

Hash-based cache with TTL, per-user isolation, and hit counting. Used by kore-bridge for token savings.

from kore_mind.models import CacheEntry

entry = CacheEntry(
    query="What is P vs NP?",
    response="It's an open problem...",
    query_hash="a1b2c3d4",
    source="carlos",
    ttl=3600.0,
)
mind._storage.save_cache_entry(entry)
found = mind._storage.find_cache_by_hash("a1b2c3d4", source="carlos")
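
How a hash-keyed cache with TTL, per-user isolation, and hit counting fits together can be sketched as below. The normalisation rule, the truncated SHA-256 key, and the names query_hash, ToyCacheEntry, and ToyCache are assumptions made for illustration; the source does not say how query_hash values such as "a1b2c3d4" are produced.

```python
import hashlib
from dataclasses import dataclass


def query_hash(query: str, length: int = 8) -> str:
    """Hash a whitespace- and case-normalised query (an assumed normalisation)."""
    normalised = " ".join(query.lower().split())
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()[:length]


@dataclass
class ToyCacheEntry:
    """Hypothetical entry; not kore-mind's CacheEntry model."""
    response: str
    created_at: float
    ttl: float
    hits: int = 0


class ToyCache:
    def __init__(self):
        self._entries: dict[tuple[str, str], ToyCacheEntry] = {}

    def save(self, query: str, response: str, source: str, ttl: float, now: float) -> None:
        # Keying on (source, hash) keeps each user's entries separate.
        self._entries[(source, query_hash(query))] = ToyCacheEntry(response, now, ttl)

    def find(self, query: str, source: str, now: float):
        key = (source, query_hash(query))
        entry = self._entries.get(key)
        if entry is None:
            return None
        if now - entry.created_at > entry.ttl:  # expired: drop it and report a miss
            del self._entries[key]
            return None
        entry.hits += 1
        return entry
```

Passing now explicitly keeps the expiry logic easy to test; a real store would more likely read the clock itself.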

Rate Limiting (storage layer)

Query logging with temporal window counting. Used by kore-bridge for cognitive rate limiting.
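
Temporal window counting can be illustrated with a small query log. The sketch assumes a sliding window over logged timestamps; the table layout and the names QueryLog, count_recent, and allow are hypothetical, not the storage layer's real API.

```python
import sqlite3


class QueryLog:
    """Hypothetical query log backing a sliding-window rate limit."""

    def __init__(self, path: str = ":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute("CREATE TABLE IF NOT EXISTS query_log (source TEXT, ts REAL)")

    def record(self, source: str, now: float) -> None:
        with self.db:  # commit on success, roll back on error
            self.db.execute("INSERT INTO query_log VALUES (?, ?)", (source, now))

    def count_recent(self, source: str, now: float, window: float) -> int:
        """Count queries from `source` in the half-open window (now - window, now]."""
        row = self.db.execute(
            "SELECT COUNT(*) FROM query_log WHERE source = ? AND ts > ? AND ts <= ?",
            (source, now - window, now),
        ).fetchone()
        return row[0]

    def allow(self, source: str, now: float, limit: int, window: float = 60.0) -> bool:
        """Log and allow the query only if the source is still under its limit."""
        if self.count_recent(source, now, window) >= limit:
            return False
        self.record(source, now)
        return True
```

Rejected queries are not logged in this sketch, so a client that keeps retrying does not extend its own lockout.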

Models

Model        Description
Memory       A memory with lifecycle (salience, decay, tags, embedding)
Identity     Emergent identity (traits, summary, relationships)
MemoryType   episodic, semantic, procedural
Trace        Operation trace (operation, duration, source, metadata)
CacheEntry   Cache entry (query, response, hash, TTL, hit count)

Backward compatibility

All new parameters have defaults that preserve v0.1 behavior:

# This works exactly the same as v0.1
mind = Mind("agent.db")
mind.experience("fact")
mind.recall("query")

Part of kore-stack

Package            What it does
kore-mind (this)   Memory, identity, traces, cache storage
kore-bridge        LLM integration, cache logic, rate limiting, A/B testing
sc-router          Query routing by Selector Complexity theory
kore-stack         All of the above, one install: pip install kore-stack

License

MIT
