HAEMA memory framework built on ChromaDB

HAEMA

HAEMA is an agent memory framework built on ChromaDB.

It provides three memory modes through a single write API:

core memory: durable high-impact identity/policy/user facts (get_core)
latest memory: recency slice by timestamp (get_latest)
long-term memory: semantic retrieval (search)

All writes go through add(contents); HAEMA then updates all three layers automatically.

Key Changes (Current)

add(contents) runs a single N:M reconstruction pass per call.
Embedding is split into query/document interfaces:
- embed_query(...)
- embed_document(...)
The special path for inputs with no related memories is removed; a single reconstruction schema is used in every case.
reconstruction schema:
- memories: list[str]
- coverage: "complete" | "incomplete"

Installation

pip install haema

Development:

pip install -e ".[dev]"

Quick Start

from haema import Memory

m = Memory(
    path="./haema_store",
    output_dimensionality=1536,
    embedding_client=...,   # your EmbeddingClient implementation
    llm_client=...,         # your LLMClient implementation
    merge_top_k=3,
    merge_distance_cutoff=0.25,
)

m.add([
    "The user prefers concise and actionable responses.",
    "The user is building HAEMA on top of ChromaDB.",
])

print(m.get_core())                    # str
print(m.get_latest(begin=1, count=5)) # list[str]
print(m.search("user preference", 3))  # list[str]

Real provider example:

examples/google_genai_example.py

Public API

Constructor

Memory(path, output_dimensionality, embedding_client, llm_client, merge_top_k=3, merge_distance_cutoff=0.25)

path: storage root directory
output_dimensionality: embedding output dimension
embedding_client: user embedding adapter
llm_client: user structured-output LLM adapter
merge_top_k: related candidate count per new content (default 3)
merge_distance_cutoff: related-memory distance threshold (default 0.25)

Methods

get_core() -> str
get_latest(begin: int, count: int) -> list[str]
search(content: str, n: int) -> list[str]
add(contents: str | list[str]) -> None

Client Interfaces

`EmbeddingClient`

embed_query(texts, output_dimensionality) -> np.ndarray
embed_document(texts, output_dimensionality) -> np.ndarray

Both must return:

2D numpy.ndarray
dtype float32
shape (len(texts), output_dimensionality)
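
As an illustration of the return contract above, here is a minimal sketch of an adapter that produces deterministic pseudo-embeddings. The class name HashEmbeddingClient and the hashing scheme are hypothetical and carry no semantic meaning; only the two method names and the shape/dtype requirements come from this document. A real adapter would call an embedding provider instead.

```python
import hashlib

import numpy as np


class HashEmbeddingClient:
    """Toy adapter (hypothetical): stable pseudo-embeddings, not semantic ones."""

    def _embed(self, texts, output_dimensionality):
        rows = []
        for text in texts:
            # Seed a generator from a stable hash so equal texts give equal vectors.
            digest = hashlib.sha256(text.encode("utf-8")).digest()
            rng = np.random.default_rng(int.from_bytes(digest[:8], "big"))
            vec = rng.standard_normal(output_dimensionality)
            rows.append(vec / np.linalg.norm(vec))
        # Contract: 2D float32 array of shape (len(texts), output_dimensionality).
        out = np.asarray(rows, dtype=np.float32)
        return out.reshape(len(texts), output_dimensionality)

    def embed_query(self, texts, output_dimensionality):
        return self._embed(texts, output_dimensionality)

    def embed_document(self, texts, output_dimensionality):
        return self._embed(texts, output_dimensionality)
```

Both methods share one implementation here so that identical query and document text map to the same vector. Providers that distinguish query and document embeddings would differ at this point.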

`LLMClient`

generate_structured(system_prompt, user_prompt, response_model) -> dict[str, Any]

Must return a dict parseable by the provided Pydantic model.
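
For tests or offline development, a stub adapter can stand in for a real model. The sketch below is an assumption-laden illustration: LLMClientLike, CannedLLMClient, and the lookup by class name are hypothetical, and the empty MemoryReconstructionResponse class is only a stand-in for the Pydantic model of the same name.

```python
from typing import Any, Protocol


class LLMClientLike(Protocol):
    """Structural type for the adapter described above (a sketch, not HAEMA's definition)."""

    def generate_structured(
        self, system_prompt: str, user_prompt: str, response_model: type
    ) -> dict[str, Any]: ...


class CannedLLMClient:
    """Hypothetical test double: returns a fixed dict per response-model name."""

    def __init__(self, responses: dict[str, dict[str, Any]]) -> None:
        self._responses = responses

    def generate_structured(self, system_prompt, user_prompt, response_model):
        # Look up by class name so the stub needs no real model call.
        name = response_model.__name__
        if name not in self._responses:
            raise KeyError(f"no canned response for {name}")
        # Return a shallow copy so callers cannot replace the canned entries.
        return dict(self._responses[name])


class MemoryReconstructionResponse:  # stand-in for the Pydantic model
    pass


client: LLMClientLike = CannedLLMClient(
    {
        "MemoryReconstructionResponse": {
            "memories": ["The user prefers concise responses."],
            "coverage": "complete",
        }
    }
)
result = client.generate_structured("system", "user", MemoryReconstructionResponse)
```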

Reconstruction Schema

HAEMA uses structured reconstruction output for long-term memory updates:

class MemoryReconstructionResponse(BaseModel):
    memories: list[str]
    coverage: Literal["complete", "incomplete"]

If the output is empty or coverage == "incomplete", HAEMA runs one refinement pass. If the refined output still fails, HAEMA falls back to the normalized input contents.
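
The retry-then-fallback rule can be sketched as follows. This is an illustration under assumptions, not HAEMA's internal code: the function names, the reconstruct callable with its refine flag, and the reading of "fails" as "still empty or incomplete" are all hypothetical.

```python
from typing import Callable

Reconstruction = dict  # expected keys: "memories" (list[str]), "coverage" (str)


def needs_refinement(result: Reconstruction) -> bool:
    """True when the output is empty or the model reports incomplete coverage."""
    return not result.get("memories") or result.get("coverage") == "incomplete"


def reconstruct_with_fallback(
    reconstruct: Callable[[list[str], bool], Reconstruction],
    normalized_contents: list[str],
) -> list[str]:
    """Try once, refine once, then fall back to the normalized contents."""
    result = reconstruct(normalized_contents, False)
    if needs_refinement(result):
        # Exactly one refinement pass, signalled here by refine=True.
        result = reconstruct(normalized_contents, True)
    if needs_refinement(result):
        # Assumed fallback: keep the inputs themselves as the memories.
        return list(normalized_contents)
    return list(result["memories"])
```

Capping the loop at one refinement keeps the number of LLM calls per add() bounded, at the cost of occasionally storing unmerged contents.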

Prompt Contracts (Layer Responsibility)

HAEMA uses three independent prompt stages with separate outputs:

pre-memory split:
- input: one raw add string
- output schema: PreMemorySplitResponse(contents)
- responsibility: split factual units only (no core policy decision)
reconstruction:
- input: related memories + new contents
- output schema: MemoryReconstructionResponse(memories, coverage)
- responsibility: generate long-term memories only
core update:
- input: current core + reconstructed new memories
- output schema: CoreUpdateResponse(should_update, core_markdown)
- responsibility: conservative core update only

User-prompt blocks are delimited with boundary tags such as:

<raw_input> ... </raw_input>
<related_memories> ... </related_memories>
<new_contents> ... </new_contents>
<current_core_markdown> ... </current_core_markdown>
<candidate_new_memories> ... </candidate_new_memories>

These tags are prompt-boundary markers for model clarity, not parser/runtime control logic.
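
To make the tagging concrete, the sketch below assembles a user prompt for the reconstruction stage from the two tags listed above. The helper names, the bullet formatting inside each block, and the "(none)" placeholder for an empty related set are assumptions for illustration; HAEMA's actual prompt text may differ.

```python
def tagged_block(tag: str, body: str) -> str:
    """Wrap one prompt section in an opening and closing boundary tag."""
    return f"<{tag}>\n{body.strip()}\n</{tag}>"


def build_reconstruction_prompt(related_memories: list[str], new_contents: list[str]) -> str:
    """Assemble a user prompt with one tagged block per input group."""
    # The related set may be empty, so emit an explicit placeholder (assumed).
    related = "\n".join(f"- {m}" for m in related_memories) or "(none)"
    new = "\n".join(f"- {c}" for c in new_contents)
    return "\n\n".join(
        [tagged_block("related_memories", related), tagged_block("new_contents", new)]
    )


prompt = build_reconstruction_prompt(
    [], ["The user is building HAEMA on top of ChromaDB."]
)
```

Because the tags only mark boundaries for the model, nothing downstream needs to parse them back out; the structured response schema carries the result.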

Core Memory Policy

Core memory should keep only durable, high-impact, high-confidence information. By prompt policy, candidate items should pass:

durability across sessions
material impact on future agent behavior
high confidence grounded in evidence

Core prompt policy also enforces:

strict section routing to one of SOUL/TOOLS/RULE/USER
exclusion of temporary/session-only/transient logs and noise
compact high-signal output with a soft target budget around 8 bullets total

Storage Layout

Given path="./haema_store":

long-term vector DB: ./haema_store/db
core markdown: ./haema_store/core.md
latest index DB: ./haema_store/latest.sqlite3

Long-term metadata fields:

timestamp (UTC ISO8601)
timestamp_ms (Unix epoch milliseconds)
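
The layout and the two metadata fields can be derived as in the sketch below. The function names are hypothetical, and the exact ISO 8601 rendering (here the "+00:00" offset form produced by datetime.isoformat) is an assumption; the document only states that the timestamp is UTC ISO8601.

```python
from datetime import datetime, timedelta, timezone
from pathlib import Path
from typing import Optional

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def storage_paths(root: str) -> dict:
    """Map the storage root to the three locations listed above."""
    base = Path(root)
    return {
        "db": base / "db",
        "core": base / "core.md",
        "latest": base / "latest.sqlite3",
    }


def timestamp_metadata(now: Optional[datetime] = None) -> dict:
    """Build both timestamp fields from a single instant so they always agree."""
    moment = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    return {
        "timestamp": moment.isoformat(),
        # Integer division of timedeltas avoids float rounding in the millisecond value.
        "timestamp_ms": (moment - EPOCH) // timedelta(milliseconds=1),
    }
```

Keeping a numeric timestamp_ms next to the readable string is convenient for range filters and sorting, since metadata filters generally compare numbers more reliably than formatted dates.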

How `add()` Works

Normalize input strings.
- if contents is a single str, HAEMA first expands it into multiple pre-memory items via structured LLM output
Batch query-embed all contents.
For each query, fetch the top merge_top_k matches and keep only those within merge_distance_cutoff.
Union related memories by id.
Run one reconstruction call with:
- related memory documents (may be empty)
- all new contents
Upsert reconstructed memories with document embeddings.
Delete replaced related IDs only after upsert succeeds.
Update core once per add() call.
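
The candidate-selection steps (top-k fetch, distance cutoff, union by id) can be sketched in isolation. The Match type, the select_related function, and the inclusive comparison against the cutoff are assumptions for illustration; in HAEMA the matches come from the ChromaDB query rather than from in-memory lists.

```python
from typing import NamedTuple


class Match(NamedTuple):
    memory_id: str
    document: str
    distance: float


def select_related(
    per_query_matches: list,
    merge_top_k: int = 3,
    merge_distance_cutoff: float = 0.25,
) -> dict:
    """Return {memory_id: document} for the union of related memories."""
    related = {}
    for matches in per_query_matches:
        # Keep the k nearest candidates for this query.
        nearest = sorted(matches, key=lambda m: m.distance)[:merge_top_k]
        for match in nearest:
            # Assumed inclusive cutoff: distance <= merge_distance_cutoff.
            if match.distance <= merge_distance_cutoff:
                # Union by id: a memory matched by several queries appears once.
                related.setdefault(match.memory_id, match.document)
    return related


q1 = [Match("a", "A", 0.10), Match("b", "B", 0.40)]
q2 = [Match("a", "A", 0.20), Match("c", "C", 0.25)]
related = select_related([q1, q2])
```

The resulting ids are the ones eligible for deletion after the upsert succeeds, which is why the union is keyed by id rather than by document text.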

Breaking Changes

Compared to older builds:

EmbeddingClient.embed(...) is removed.
NoRelatedMemoryResponse is removed.
MemorySynthesisResponse(update: list[str]) is replaced by MemoryReconstructionResponse.
merge_top_k default changed from 5 to 3.

Documentation

docs/index.md
docs/usage.md
docs/api.md
docs/architecture.md
docs/release.md

License

MIT

Project details

Release history

0.4.0: Feb 23, 2026
0.3.0: Feb 22, 2026 (this version)
0.2.0: Feb 21, 2026
0.1.0: Feb 21, 2026

Download files

Source Distribution

haema-0.3.0.tar.gz (22.5 kB view details)

Uploaded Feb 22, 2026 Source

Built Distribution

haema-0.3.0-py3-none-any.whl (13.0 kB view details)

Uploaded Feb 22, 2026 Python 3

File details

Details for the file haema-0.3.0.tar.gz.

File metadata

Download URL: haema-0.3.0.tar.gz
Upload date: Feb 22, 2026
Size: 22.5 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for haema-0.3.0.tar.gz
Algorithm	Hash digest
SHA256	`91a1eb6f4b40800103b5bb9a30b587da530178a3ef0874a1c0102c3d0af213ef`
MD5	`5f100b1647666113ac87148fd95d6d9e`
BLAKE2b-256	`6d0f255d2f73d52a06254078a548c694ffc5719f37e5ffe9a2d14231f902415a`

Provenance

The following attestation bundles were made for haema-0.3.0.tar.gz:

Publisher: publish-pypi.yml on smturtle2/haema

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: haema-0.3.0.tar.gz
- Subject digest: 91a1eb6f4b40800103b5bb9a30b587da530178a3ef0874a1c0102c3d0af213ef
- Sigstore transparency entry: 976202084
- Sigstore integration time: Feb 22, 2026
Source repository:
- Permalink: smturtle2/haema@5ab6279624b887fa91c895593dd089d74d22c837
- Branch / Tag: refs/tags/v0.3.0
- Owner: https://github.com/smturtle2
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish-pypi.yml@5ab6279624b887fa91c895593dd089d74d22c837
- Trigger Event: push

File details

Details for the file haema-0.3.0-py3-none-any.whl.

File metadata

Download URL: haema-0.3.0-py3-none-any.whl
Upload date: Feb 22, 2026
Size: 13.0 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for haema-0.3.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`c855770a8fbe4cda0835c2428baf80a980857da7faf4cfc2d0547b97eaa177b8`
MD5	`a99f2c2e60fd913fc9cbb1b628cf89d8`
BLAKE2b-256	`38088d1499aa0f6577a19687138a1e71bb0255130d6437858b07954b54f8a7c5`

Provenance

The following attestation bundles were made for haema-0.3.0-py3-none-any.whl:

Publisher: publish-pypi.yml on smturtle2/haema

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: haema-0.3.0-py3-none-any.whl
- Subject digest: c855770a8fbe4cda0835c2428baf80a980857da7faf4cfc2d0547b97eaa177b8
- Sigstore transparency entry: 976202089
- Sigstore integration time: Feb 22, 2026
Source repository:
- Permalink: smturtle2/haema@5ab6279624b887fa91c895593dd089d74d22c837
- Branch / Tag: refs/tags/v0.3.0
- Owner: https://github.com/smturtle2
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish-pypi.yml@5ab6279624b887fa91c895593dd089d74d22c837
- Trigger Event: push
