HAEMA memory framework built on ChromaDB

HAEMA

HAEMA is an agent memory framework built on ChromaDB.

It provides three memory modes through a single write API:

  • core memory: durable high-impact identity/policy/user facts (get_core)
  • latest memory: recency slice by timestamp (get_latest)
  • long-term memory: semantic retrieval (search)

You only write through add(contents), and HAEMA updates all layers automatically.

Key Changes (Current)

  • add(contents) runs a single N:M reconstruction pass per call (N inputs, i.e. related memories plus new contents, are reconstructed into M memories).
  • Embedding is split into query/document interfaces:
    • embed_query(...)
    • embed_document(...)
  • The special-case path for inputs with no related memories is removed; a single reconstruction schema is used for every call.
  • reconstruction schema:
    • memories: list[str]
    • coverage: "complete" | "incomplete"

Installation

pip install haema

Development:

pip install -e ".[dev]"

Quick Start

from haema import EmbeddingClient, LLMClient, Memory


class MyEmbeddingClient(EmbeddingClient):
    ...


class MyLLMClient(LLMClient):
    ...

m = Memory(
    path="./haema_store",               # storage root
    output_dimensionality=1536,         # embedding vector width
    embedding_client=MyEmbeddingClient(),
    llm_client=MyLLMClient(),
    merge_top_k=3,                      # related candidates per input
    merge_distance_cutoff=0.25,         # related-memory distance threshold
)

m.add([
    "The user prefers concise and actionable responses.",
    "The user is building HAEMA on top of ChromaDB.",
])

print(m.get_core())                      # str
print(m.get_latest(begin=1, count=5))   # list[str]
print(m.search("user preference", n=3)) # list[str]

Real provider example:

  • examples/google_genai_example.py

Public API

Constructor

Memory(path, output_dimensionality, embedding_client, llm_client, merge_top_k=3, merge_distance_cutoff=0.25)

  • path: str | Path: storage root directory
  • output_dimensionality: int: required embedding dimension (> 0)
  • embedding_client: EmbeddingClient: query/document embedding adapter
  • llm_client: LLMClient: structured-output adapter
  • merge_top_k: int: related candidate count per new content (default 3, must be > 0)
  • merge_distance_cutoff: float: related-memory distance threshold (default 0.25, must be >= 0)

Validation:

  • output_dimensionality <= 0 -> ValueError
  • merge_top_k <= 0 -> ValueError
  • merge_distance_cutoff < 0 -> ValueError
  • missing chromadb -> ImportError
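
The argument checks above can be pictured with a small stand-alone helper. This is only a sketch: `validate_memory_args` is a hypothetical name, not part of HAEMA, and the error messages are illustrative.

```python
def validate_memory_args(
    output_dimensionality: int,
    merge_top_k: int = 3,
    merge_distance_cutoff: float = 0.25,
) -> None:
    """Hypothetical helper mirroring the documented constructor checks."""
    if output_dimensionality <= 0:
        raise ValueError("output_dimensionality must be > 0")
    if merge_top_k <= 0:
        raise ValueError("merge_top_k must be > 0")
    if merge_distance_cutoff < 0:
        raise ValueError("merge_distance_cutoff must be >= 0")


# Valid arguments pass silently; a cutoff of exactly 0 is allowed.
validate_memory_args(1536, merge_top_k=3, merge_distance_cutoff=0.0)
```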

Methods

  • get_core() -> str: returns the full text of <path>/core.md.
  • get_latest(begin: int, count: int) -> list[str]: 1-indexed latest slice sorted by descending timestamp.
  • search(content: str, n: int) -> list[str]: semantic search over long-term memory documents.
  • add(contents: str | list[str]) -> None: single write API that updates long-term/latest/core layers.

Method behavior:

  • get_latest(begin < 1) raises ValueError
  • get_latest(count <= 0) returns []
  • search(n <= 0) returns []
  • add(str) runs pre-memory split first; add(list[str]) uses normalized list items directly
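
The get_latest rules above (1-indexed begin, newest first, empty result for a non-positive count) can be illustrated on a plain list. This is a sketch of the documented semantics, not HAEMA's implementation: `latest_slice` and the `(timestamp_ms, text)` tuples are assumptions, as is checking `begin` before `count`.

```python
def latest_slice(entries: list[tuple[int, str]], begin: int, count: int) -> list[str]:
    """Return `count` texts starting at the 1-indexed position `begin`, newest first."""
    if begin < 1:
        raise ValueError("begin must be >= 1")
    if count <= 0:
        return []
    # Sort by timestamp, newest first, then take the requested window.
    ordered = sorted(entries, key=lambda entry: entry[0], reverse=True)
    start = begin - 1
    return [text for _, text in ordered[start:start + count]]


entries = [(1000, "oldest"), (3000, "newest"), (2000, "middle")]
print(latest_slice(entries, begin=1, count=2))  # ['newest', 'middle']
```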

How To Implement Adapters

EmbeddingClient

  • embed_query(texts, output_dimensionality) -> np.ndarray
  • embed_document(texts, output_dimensionality) -> np.ndarray

Checklist:

  • return a 2D numpy.ndarray
  • dtype must be float32
  • shape must be (len(texts), output_dimensionality)
  • keep query/document task settings separated when your provider supports it
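
A toy adapter that satisfies the checklist might look like the sketch below. It is deliberately not a real embedding model: vectors are derived from a hash of the text, so they carry no semantic meaning. The class name `HashEmbeddingClient` and the `task` argument are assumptions for illustration; in real use the class would subclass `haema.EmbeddingClient` and call a provider.

```python
import hashlib

import numpy as np


class HashEmbeddingClient:
    """Toy adapter for tests; subclass haema.EmbeddingClient in real code."""

    def embed_query(self, texts, output_dimensionality):
        return self._embed(texts, output_dimensionality, task="query")

    def embed_document(self, texts, output_dimensionality):
        return self._embed(texts, output_dimensionality, task="document")

    def _embed(self, texts, output_dimensionality, task):
        # `task` marks where a provider's query/document setting would go.
        # This toy ignores it so identical text always maps to the same vector.
        rows = []
        for text in texts:
            digest = hashlib.sha256(text.encode("utf-8")).digest()
            rng = np.random.default_rng(int.from_bytes(digest[:8], "big"))
            vec = rng.standard_normal(output_dimensionality).astype(np.float32)
            norm = np.linalg.norm(vec)
            rows.append(vec / norm if norm > 0 else vec)
        if not rows:
            return np.zeros((0, output_dimensionality), dtype=np.float32)
        # 2D float32 array of shape (len(texts), output_dimensionality).
        return np.stack(rows).astype(np.float32)
```

A deterministic adapter like this is mainly useful for exercising the storage path in tests without network calls.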

LLMClient

  • generate_structured(system_prompt, user_prompt, response_model) -> dict[str, Any]

Checklist:

  • return a dict[str, Any] parseable by response_model.model_validate(...)
  • propagate provider failures as exceptions
  • avoid returning unstructured free-form text
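
One way to meet this checklist is to wrap a text-completion callable and refuse anything that is not a JSON object. The sketch below makes several assumptions: `JsonLLMClient` and the `complete` callable are hypothetical, and `response_model` is treated as any pydantic-style class exposing `model_validate`. In real use the class would subclass `haema.LLMClient`.

```python
import json
from typing import Any, Callable


class JsonLLMClient:
    """Sketch of a structured-output adapter around a text-completion callable."""

    def __init__(self, complete: Callable[[str, str], str]) -> None:
        # `complete(system_prompt, user_prompt)` stands in for a provider call.
        self._complete = complete

    def generate_structured(
        self, system_prompt: str, user_prompt: str, response_model: Any
    ) -> dict[str, Any]:
        # Provider failures are not caught here, so they propagate to the caller.
        raw = self._complete(system_prompt, user_prompt)
        # Free-form text fails here with json.JSONDecodeError (a ValueError).
        data = json.loads(raw)
        if not isinstance(data, dict):
            raise ValueError("expected a JSON object from the provider")
        # Fail early if the payload does not match the requested schema.
        response_model.model_validate(data)
        return data
```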

Reconstruction Schema

HAEMA uses structured reconstruction output for long-term memory updates:

class MemoryReconstructionResponse(BaseModel):
    memories: list[str]
    coverage: Literal["complete", "incomplete"]

If the output is empty or coverage == "incomplete", HAEMA runs one refinement pass. If the refinement pass also fails, HAEMA falls back to the normalized input contents.
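
The refine-then-fall-back flow can be sketched as plain control flow. The function names and the `reconstruct`/`refine` callables below are hypothetical, and the sketch only treats an empty or incomplete result as a failure; how HAEMA handles exceptions at this stage is not covered here.

```python
from typing import Callable


def _usable(result: dict) -> bool:
    """A result counts as usable when it has memories and complete coverage."""
    return bool(result.get("memories")) and result.get("coverage") == "complete"


def reconstruct_with_fallback(
    contents: list[str],
    reconstruct: Callable[[list[str]], dict],
    refine: Callable[[list[str], dict], dict],
) -> list[str]:
    first = reconstruct(contents)
    if _usable(first):
        return first["memories"]
    # Exactly one refinement pass, given the first attempt as context.
    second = refine(contents, first)
    if _usable(second):
        return second["memories"]
    # Last resort: store the normalized inputs unchanged.
    return list(contents)
```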

Prompt Contracts (Layer Responsibility)

HAEMA uses three independent prompt stages with separate outputs:

  • pre-memory split:
    • input: one raw add string
    • output schema: PreMemorySplitResponse(contents)
    • responsibility: split factual units only (no core policy decision)
  • reconstruction:
    • input: related memories + new contents
    • output schema: MemoryReconstructionResponse(memories, coverage)
    • responsibility: generate long-term memories only
  • core update:
    • input: current core + reconstructed new memories
    • output schema: CoreUpdateResponse(should_update, core_markdown)
    • responsibility: conservative core update only

Prompt user blocks are boundary-labeled with tags such as:

  • <raw_input> ... </raw_input>
  • <related_memories> ... </related_memories>
  • <new_contents> ... </new_contents>
  • <current_core_markdown> ... </current_core_markdown>
  • <candidate_new_memories> ... </candidate_new_memories>

These tags are prompt-boundary markers for model clarity, not parser/runtime control logic.
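
As an illustration of boundary labeling, a user prompt for the reconstruction stage could be assembled as below. The tag names come from the list above; the helper names and the bullet layout inside each tag are assumptions, not HAEMA's actual prompt text.

```python
def tagged_block(tag: str, lines: list[str]) -> str:
    """Wrap lines in <tag> ... </tag> so the model can see where a section ends."""
    body = "\n".join(f"- {line}" for line in lines)
    return f"<{tag}>\n{body}\n</{tag}>"


def reconstruction_user_prompt(related: list[str], new: list[str]) -> str:
    # Two labeled sections separated by a blank line.
    return "\n\n".join(
        [
            tagged_block("related_memories", related),
            tagged_block("new_contents", new),
        ]
    )


print(reconstruction_user_prompt(["User likes short answers."], ["User prefers Python."]))
```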

Core Memory Policy

Core memory should keep only durable, high-impact, high-confidence information. By prompt policy, a candidate item should satisfy all three criteria:

  1. durability across sessions
  2. material impact on future agent behavior
  3. high confidence grounded in evidence

Core prompt policy also enforces:

  • strict section routing to one of SOUL/TOOLS/RULE/USER
  • exclusion of temporary/session-only/transient logs and noise
  • compact high-signal output with a soft target budget around 8 bullets total

Storage Layout

Given path="./haema_store":

  • long-term vector DB: ./haema_store/db
  • core markdown: ./haema_store/core.md
  • latest index DB: ./haema_store/latest.sqlite3

Long-term metadata fields:

  • timestamp (UTC ISO8601)
  • timestamp_ms (Unix epoch milliseconds)
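
Both fields can be derived from a single instant so that they never disagree. The helper below is a sketch: `timestamp_metadata` is a hypothetical name, and the exact ISO 8601 rendering HAEMA uses (for example `+00:00` versus `Z`) is an assumption.

```python
from datetime import datetime, timezone
from typing import Optional


def timestamp_metadata(now: Optional[datetime] = None) -> dict:
    """Build both metadata fields from one timezone-aware instant."""
    moment = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    return {
        "timestamp": moment.isoformat(),                # UTC ISO 8601 string
        "timestamp_ms": int(moment.timestamp() * 1000),  # Unix epoch milliseconds
    }


print(timestamp_metadata(datetime(2024, 1, 1, tzinfo=timezone.utc)))
# {'timestamp': '2024-01-01T00:00:00+00:00', 'timestamp_ms': 1704067200000}
```

The integer field is the convenient one for sorting and range filters; the string field is for humans.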

How add() Works

  1. Normalize input strings.
    • if contents is a single str, HAEMA first expands it into multiple pre-memory items via structured LLM output
  2. Batch query-embed all contents.
  3. For each query, fetch the top merge_top_k candidates and keep those within merge_distance_cutoff.
  4. Union related memories by id.
  5. Run one reconstruction call with:
    • related memory documents (may be empty)
    • all new contents
  6. Upsert reconstructed memories with document embeddings.
  7. Delete replaced related IDs only after upsert succeeds.
  8. Update core once per add() call.
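
Steps 3, 4, 6 and 7 can be sketched on in-memory data. Nothing below calls ChromaDB or HAEMA: the function names, the `(id, distance, document)` hit tuples, and the dict used as a store are all assumptions, and treating the cutoff as inclusive is a guess.

```python
Hit = tuple[str, float, str]  # (memory id, distance, document text)


def collect_related(
    per_query_hits: list[list[Hit]],
    top_k: int = 3,
    distance_cutoff: float = 0.25,
) -> dict[str, str]:
    """Steps 3-4: per query keep the nearest top_k within the cutoff, then union by id."""
    related: dict[str, str] = {}
    for hits in per_query_hits:
        nearest = sorted(hits, key=lambda hit: hit[1])[:top_k]
        for mem_id, distance, document in nearest:
            if distance <= distance_cutoff:
                related.setdefault(mem_id, document)
    return related


def replace_memories(store: dict[str, str], related_ids: list[str], new_memories: dict[str, str]) -> None:
    """Steps 6-7: write the new memories first, then drop the ones they replace."""
    store.update(new_memories)
    # If the update above raised, this loop never runs and no old memory is lost.
    for mem_id in related_ids:
        if mem_id not in new_memories:
            store.pop(mem_id, None)


hits = [
    [("a", 0.10, "A"), ("b", 0.30, "B"), ("c", 0.20, "C")],
    [("c", 0.05, "C"), ("d", 0.24, "D")],
]
print(collect_related(hits, top_k=2))  # {'a': 'A', 'c': 'C', 'd': 'D'}
```

Deleting only after the upsert succeeds means a failed write leaves stale memories behind rather than losing them.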

Breaking Changes

Compared to older builds:

  1. EmbeddingClient.embed(...) is removed.
  2. NoRelatedMemoryResponse is removed.
  3. MemorySynthesisResponse(update: list[str]) is replaced by MemoryReconstructionResponse.
  4. merge_top_k default changed from 5 to 3.
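
For code written against the removed embed(...) method, a thin wrapper is one possible migration path. This is a sketch under a loud assumption: the legacy method is taken to accept `(texts, output_dimensionality)`, which this page does not confirm. `SplitEmbeddingAdapter` is a hypothetical name; in real use it would subclass `haema.EmbeddingClient`.

```python
class SplitEmbeddingAdapter:
    """Expose embed_query/embed_document on top of a legacy single-method client."""

    def __init__(self, legacy_client) -> None:
        self._legacy = legacy_client

    def embed_query(self, texts, output_dimensionality):
        # Assumed legacy signature: embed(texts, output_dimensionality).
        return self._legacy.embed(texts, output_dimensionality)

    def embed_document(self, texts, output_dimensionality):
        # Same call for both roles; switch to provider-specific task settings
        # here once the underlying client supports them.
        return self._legacy.embed(texts, output_dimensionality)
```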

Documentation

  • docs/index.md
  • docs/usage.md
  • docs/api.md
  • docs/architecture.md
  • docs/release.md

License

MIT
