HAEMA memory framework built on ChromaDB
HAEMA
HAEMA is an agent memory framework built on ChromaDB.
It provides three memory modes through a single write API:
- core memory: durable, high-impact identity/policy/user facts (get_core)
- latest memory: recency slice by timestamp (get_latest)
- long-term memory: semantic retrieval (search)
You only write through add(contents), and HAEMA updates all layers automatically.
Key Changes (Current)
- add(contents) runs a single N:M reconstruction pass per call.
- Embedding is split into query/document interfaces:
  - embed_query(...)
  - embed_document(...)
- The no-related special path is removed; one reconstruction schema is used.
- Reconstruction schema:
  - memories: list[str]
  - coverage: "complete" | "incomplete"
Installation
pip install haema
Development:
pip install -e ".[dev]"
Quick Start
from haema import EmbeddingClient, LLMClient, Memory

class MyEmbeddingClient(EmbeddingClient):
    ...

class MyLLMClient(LLMClient):
    ...

m = Memory(
    path="./haema_store",  # storage root
    output_dimensionality=1536,  # embedding vector width
    embedding_client=MyEmbeddingClient(),
    llm_client=MyLLMClient(),
    merge_top_k=3,  # related candidates per input
    merge_distance_cutoff=0.25,  # related-memory distance threshold
)

m.add([
    "The user prefers concise and actionable responses.",
    "The user is building HAEMA on top of ChromaDB.",
])

print(m.get_core())  # str
print(m.get_latest(begin=1, count=5))  # list[str]
print(m.search("user preference", n=3))  # list[str]
Real provider example:
examples/google_genai_example.py
Public API
Constructor
Memory(path, output_dimensionality, embedding_client, llm_client, merge_top_k=3, merge_distance_cutoff=0.25)
- path: str | Path: storage root directory
- output_dimensionality: int: required embedding dimension (> 0)
- embedding_client: EmbeddingClient: query/document embedding adapter
- llm_client: LLMClient: structured-output adapter
- merge_top_k: int: related candidate count per new content (default 3, must be > 0)
- merge_distance_cutoff: float: related-memory distance threshold (default 0.25, must be >= 0)
Validation:
- output_dimensionality <= 0 -> ValueError
- merge_top_k <= 0 -> ValueError
- merge_distance_cutoff < 0 -> ValueError
- missing chromadb -> ImportError
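The argument checks above can be sketched as a standalone function. This is only an illustration of the documented rules: `validate_memory_args` is a hypothetical name, not part of HAEMA, and the error messages are invented.

```python
def validate_memory_args(
    output_dimensionality: int,
    merge_top_k: int = 3,
    merge_distance_cutoff: float = 0.25,
) -> None:
    """Hypothetical helper mirroring the documented constructor checks."""
    if output_dimensionality <= 0:
        raise ValueError("output_dimensionality must be > 0")
    if merge_top_k <= 0:
        raise ValueError("merge_top_k must be > 0")
    if merge_distance_cutoff < 0:
        raise ValueError("merge_distance_cutoff must be >= 0")


validate_memory_args(1536)  # valid arguments pass silently

try:
    validate_memory_args(0)
except ValueError as exc:
    print(f"rejected: {exc}")
```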
Methods
- get_core() -> str: returns the full <path>/core.md text.
- get_latest(begin: int, count: int) -> list[str]: 1-indexed latest slice sorted by descending timestamp.
- search(content: str, n: int) -> list[str]: semantic search over long-term memory documents.
- add(contents: str | list[str]) -> None: single write API that updates long-term/latest/core layers.
Method behavior:
- get_latest(begin < 1) raises ValueError
- get_latest(count <= 0) returns []
- search(n <= 0) returns []
- add(str) runs pre-memory split first; add(list[str]) uses normalized list items directly
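To make the 1-indexed slice semantics of get_latest concrete, here is a minimal in-memory sketch. `latest_slice` and its list of (timestamp_ms, document) pairs are hypothetical stand-ins for the latest index, not HAEMA's internals.

```python
def latest_slice(items: list[tuple[int, str]], begin: int, count: int) -> list[str]:
    """Return `count` documents starting at 1-indexed position `begin`, newest first.

    `items` holds (timestamp_ms, document) pairs; this is an assumed
    stand-in for the latest index, used only to show the edge cases.
    """
    if begin < 1:
        raise ValueError("begin must be >= 1")
    if count <= 0:
        return []
    newest_first = sorted(items, key=lambda pair: pair[0], reverse=True)
    start = begin - 1  # convert the 1-indexed position to a list offset
    return [doc for _, doc in newest_first[start:start + count]]


items = [(1000, "oldest"), (3000, "newest"), (2000, "middle")]
print(latest_slice(items, begin=1, count=2))  # ['newest', 'middle']
print(latest_slice(items, begin=2, count=5))  # ['middle', 'oldest']
```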
How To Implement Adapters
EmbeddingClient
- embed_query(texts, output_dimensionality) -> np.ndarray
- embed_document(texts, output_dimensionality) -> np.ndarray
Checklist:
- return a 2D numpy.ndarray
- dtype must be float32
- shape must be (len(texts), output_dimensionality)
- keep query/document task settings separated when your provider supports it
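A toy adapter that satisfies the checklist might look like the sketch below. It does not call a real provider: `ToyEmbeddingClient` is a hypothetical class that derives deterministic pseudo-random vectors from a hash, and it is written standalone so it runs without HAEMA installed. A real adapter would subclass EmbeddingClient and call your embedding service.

```python
import hashlib

import numpy as np


class ToyEmbeddingClient:
    """Hypothetical stand-in adapter; vectors carry no semantic meaning."""

    def _embed(self, texts: list[str], output_dimensionality: int, task: str) -> np.ndarray:
        rows = []
        for text in texts:
            # Seed a generator from task + text so results are deterministic
            # and query/document embeddings of the same text differ.
            digest = hashlib.sha256(f"{task}:{text}".encode("utf-8")).digest()
            rng = np.random.default_rng(int.from_bytes(digest[:8], "big"))
            rows.append(rng.standard_normal(output_dimensionality))
        # reshape keeps the result 2D even when `texts` is empty
        return np.asarray(rows, dtype=np.float32).reshape(len(texts), output_dimensionality)

    def embed_query(self, texts: list[str], output_dimensionality: int) -> np.ndarray:
        return self._embed(texts, output_dimensionality, task="query")

    def embed_document(self, texts: list[str], output_dimensionality: int) -> np.ndarray:
        return self._embed(texts, output_dimensionality, task="document")


vectors = ToyEmbeddingClient().embed_document(["first note", "second note"], 8)
print(vectors.shape, vectors.dtype)  # (2, 8) float32
```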
LLMClient
generate_structured(system_prompt, user_prompt, response_model) -> dict[str, Any]
Checklist:
- return a dict[str, Any] parseable by response_model.model_validate(...)
- propagate provider failures as exceptions
- avoid returning unstructured free-form text
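One way to meet this checklist is to parse the provider's JSON text and validate it before returning. The sketch below assumes a hypothetical `call_provider` function that returns raw JSON text, and it uses a minimal `FakeSplitResponse` class in place of a real response model so the example runs standalone. A real adapter would subclass LLMClient.

```python
import json
from typing import Any, Callable


class JsonLLMClient:
    """Hypothetical stand-in adapter built around a JSON-returning provider call."""

    def __init__(self, call_provider: Callable[[str, str], str]) -> None:
        # call_provider(system_prompt, user_prompt) -> raw JSON text (assumed shape)
        self._call_provider = call_provider

    def generate_structured(
        self, system_prompt: str, user_prompt: str, response_model: Any
    ) -> dict[str, Any]:
        raw = self._call_provider(system_prompt, user_prompt)  # provider errors propagate
        data = json.loads(raw)  # free-form text raises json.JSONDecodeError
        if not isinstance(data, dict):
            raise ValueError("expected a JSON object at the top level")
        response_model.model_validate(data)  # raises if the dict does not fit the schema
        return data


class FakeSplitResponse:
    """Minimal stand-in for a response model exposing model_validate."""

    @classmethod
    def model_validate(cls, data: dict[str, Any]) -> dict[str, Any]:
        if not isinstance(data.get("contents"), list):
            raise ValueError("contents must be a list")
        return data


client = JsonLLMClient(lambda system, user: '{"contents": ["fact one", "fact two"]}')
print(client.generate_structured("system", "user", FakeSplitResponse))
```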
Reconstruction Schema
HAEMA uses structured reconstruction output for long-term memory updates:
from typing import Literal
from pydantic import BaseModel

class MemoryReconstructionResponse(BaseModel):
    memories: list[str]
    coverage: Literal["complete", "incomplete"]
If the output is empty or coverage == "incomplete", HAEMA runs one refinement pass.
If that pass also fails, HAEMA safely falls back to the normalized contents.
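The retry-then-fallback rule can be sketched as follows. This is an illustration under assumptions, not HAEMA's code: `reconstruct` is a hypothetical callable wrapping the LLM call, and "fails" is taken to mean the output is still empty or incomplete.

```python
from typing import Any, Callable


def reconstruct_with_fallback(
    contents: list[str],
    reconstruct: Callable[[list[str], bool], dict[str, Any]],
) -> list[str]:
    """Try reconstruction, then one refinement pass, then fall back.

    `reconstruct(contents, refine)` is an assumed interface returning a dict
    shaped like MemoryReconstructionResponse.
    """
    for refine in (False, True):  # first pass, then a single refinement pass
        result = reconstruct(contents, refine)
        memories = [m for m in result.get("memories", []) if m.strip()]
        if memories and result.get("coverage") == "complete":
            return memories
    return list(contents)  # fallback: keep the normalized contents as-is


def fake_reconstruct(contents: list[str], refine: bool) -> dict[str, Any]:
    # Simulates a model that only succeeds on the refinement pass.
    if not refine:
        return {"memories": [], "coverage": "incomplete"}
    return {"memories": ["User prefers concise answers."], "coverage": "complete"}


print(reconstruct_with_fallback(["user likes short answers"], fake_reconstruct))
```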
Prompt Contracts (Layer Responsibility)
HAEMA uses three independent prompt stages with separate outputs:
- pre-memory split:
  - input: one raw add string
  - output schema: PreMemorySplitResponse(contents)
  - responsibility: split factual units only (no core policy decision)
- reconstruction:
  - input: related memories + new contents
  - output schema: MemoryReconstructionResponse(memories, coverage)
  - responsibility: generate long-term memories only
- core update:
  - input: current core + reconstructed new memories
  - output schema: CoreUpdateResponse(should_update, core_markdown)
  - responsibility: conservative core update only
Prompt user blocks are boundary-labeled with tags such as:
- <raw_input> ... </raw_input>
- <related_memories> ... </related_memories>
- <new_contents> ... </new_contents>
- <current_core_markdown> ... </current_core_markdown>
- <candidate_new_memories> ... </candidate_new_memories>
These tags are prompt-boundary markers for model clarity, not parser/runtime control logic.
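As a rough illustration of boundary-labeled blocks, the sketch below assembles a reconstruction user prompt from the two tags named above. The helper names, the bullet formatting, and the "(none)" placeholder are assumptions; HAEMA's actual prompt layout may differ.

```python
def tagged_block(tag: str, body: str) -> str:
    """Wrap `body` in an opening and closing boundary tag."""
    return f"<{tag}>\n{body}\n</{tag}>"


def build_reconstruction_prompt(related_memories: list[str], new_contents: list[str]) -> str:
    """Hypothetical prompt builder; the tags only mark boundaries for the model."""
    related = "\n".join(f"- {m}" for m in related_memories) or "(none)"
    new = "\n".join(f"- {c}" for c in new_contents)
    return "\n\n".join(
        [
            tagged_block("related_memories", related),
            tagged_block("new_contents", new),
        ]
    )


print(build_reconstruction_prompt([], ["The user prefers concise responses."]))
```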
Core Memory Policy
Core memory should keep only durable, high-impact, high-confidence information. By prompt policy, candidate items should pass:
- durability across sessions
- material impact on future agent behavior
- high confidence grounded in evidence
Core prompt policy also enforces:
- strict section routing to one of SOUL/TOOLS/RULE/USER
- exclusion of temporary/session-only/transient logs and noise
- compact high-signal output with a soft target budget around 8 bullets total
Storage Layout
Given path="./haema_store":
- long-term vector DB: ./haema_store/db
- core markdown: ./haema_store/core.md
- latest index DB: ./haema_store/latest.sqlite3
Long-term metadata fields:
- timestamp (UTC ISO8601)
- timestamp_ms (Unix epoch milliseconds)
How add() Works
- Normalize input strings.
  - if contents is a single str, HAEMA first expands it into multiple pre-memory items via structured LLM output
- Batch query-embed all contents.
- For each query, fetch top-k and keep matches within the distance cutoff.
- Union related memories by id.
- Run one reconstruction call with:
  - related memory documents (may be empty)
  - all new contents
- Upsert reconstructed memories with document embeddings.
- Delete replaced related IDs only after upsert succeeds.
- Update core once per add() call.
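The related-memory selection in the middle of this pipeline (top-k per query, distance cutoff, union by id) can be sketched in plain Python. `select_related` and its tuple-based input are hypothetical, and treating the cutoff as inclusive is an assumption.

```python
def select_related(
    hits_per_query: list[list[tuple[str, str, float]]],
    top_k: int = 3,
    distance_cutoff: float = 0.25,
) -> dict[str, str]:
    """Union the related memories across all queries, keyed by id.

    hits_per_query[i] holds (id, document, distance) tuples for query i,
    nearest first; this shape is assumed for illustration.
    """
    related: dict[str, str] = {}
    for hits in hits_per_query:
        for memory_id, document, distance in hits[:top_k]:
            if distance <= distance_cutoff:  # inclusive cutoff (assumption)
                related.setdefault(memory_id, document)  # dedupe by id
    return related


hits = [
    [("a", "User prefers concise answers.", 0.10), ("b", "Unrelated note.", 0.30)],
    [("a", "User prefers concise answers.", 0.20), ("c", "User builds on ChromaDB.", 0.25)],
]
print(select_related(hits))
```

The ids collected here are the ones that would be deleted after the upsert succeeds, which is why the union is keyed by id rather than by document text.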
Breaking Changes
Compared to older builds:
- EmbeddingClient.embed(...) is removed.
- NoRelatedMemoryResponse is removed.
- MemorySynthesisResponse(update: list[str]) is replaced by MemoryReconstructionResponse.
- merge_top_k default changed from 5 to 3.
Documentation
- docs/index.md
- docs/usage.md
- docs/api.md
- docs/architecture.md
- docs/release.md
License
MIT