EverAlgo boundary: MemCell extractors (chat / workspace / agent).

everalgo-boundary

Chat boundary detection for EverAlgo — segments a flat list of ChatMessage objects into coherent MemCell slices using an LLM-based batch algorithm.

See the umbrella project: EverAlgo monorepo and the architecture document at docs/concepts/architecture.md.

Install

pip install everalgo-boundary

For the user-scenario class facade, install everalgo-user-memory instead — it re-exports BoundaryDetector which wraps this package.

What this distribution provides

| Symbol | Role |
| --- | --- |
| detect_boundaries | Low-level async function: (list[ChatMessage], *, llm, is_final, ...) → DetectionResult |
| DetectionResult | NamedTuple(cells: list[MemCell], tail: list[ChatMessage]) |

The class-style facades (BoundaryDetector for user-scenario chat, AgentBoundaryDetector for agent trajectories with tool calls) live in everalgo-user-memory and everalgo-agent-memory respectively.

Quick start

import asyncio
import json

from everalgo.boundary import detect_boundaries
from everalgo.llm.types import ChatMessage as LLMChatMessage, ChatResponse
from everalgo.testing.fake_llm import FakeLLMClient
from everalgo.types import ChatMessage

_BOUNDARY_JSON = json.dumps(
    {"reasoning": "single topic", "boundaries": [], "should_wait": False}
)

async def main() -> None:
    fake = FakeLLMClient(responses=[ChatResponse(content=_BOUNDARY_JSON, model="fake")])
    messages = [
        ChatMessage(id="m1", role="user", content="Let's talk about deployment.", timestamp=1_700_000_000_000, sender_id="u_alice"),
        ChatMessage(id="m2", role="assistant", content="Sure — what's the target env?", timestamp=1_700_000_001_000, sender_id="assistant"),
        ChatMessage(id="m3", role="user", content="K8s. Switching topic: lunch?", timestamp=1_700_000_002_000, sender_id="u_alice"),
    ]

    # Streaming: hold `tail` between calls; pass prior tail + new messages each time.
    result = await detect_boundaries(messages, llm=fake)
    cells, tail = result  # NamedTuple unpacking

    # End-of-session: tail is forced into the last cell.
    result = await detect_boundaries(messages, llm=fake, is_final=True)
    assert result.tail == []
    for mc in result.cells:
        print(mc.timestamp, len(mc.items))


asyncio.run(main())

The streaming state machine

detect_boundaries deliberately holds back trailing messages as tail — the LLM cannot know whether a conversation continues beyond the last seen message. The caller maintains state:

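from everalgo.boundary import detect_boundaries
from everalgo.types import ChatMessage


async def segment_stream(incoming_batches, client, persist) -> None:
    # Messages the detector has not yet assigned to a cell.
    tail: list[ChatMessage] = []

    for batch in incoming_batches:
        # Prepend the held-back tail so the detector sees the unresolved context.
        result = await detect_boundaries(tail + batch, llm=client)
        await persist(result.cells)
        tail = result.tail

    # Session ends: flush everything.
    final = await detect_boundaries(tail, llm=client, is_final=True)
    await persist(final.cells)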
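
The tail-carrying loop can be tried without an LLM or the everalgo packages. The sketch below is only an illustration of the cells/tail contract: `Msg`, `Result`, `stub_detect`, and `run` are hypothetical stand-ins, and the "a message starting with `#` opens a new cell" rule is a toy substitute for the LLM-based detection, not how detect_boundaries decides.

```python
import asyncio
from dataclasses import dataclass
from typing import NamedTuple


@dataclass
class Msg:
    """Hypothetical stand-in for ChatMessage."""
    id: str
    content: str


class Result(NamedTuple):
    """Mirrors the documented shape: finished cells plus a held-back tail."""
    cells: list[list[Msg]]
    tail: list[Msg]


async def stub_detect(messages: list[Msg], *, is_final: bool = False) -> Result:
    """Toy detector (assumption): a message starting with '#' opens a new cell."""
    cells: list[list[Msg]] = []
    current: list[Msg] = []
    for msg in messages:
        if msg.content.startswith("#") and current:
            cells.append(current)
            current = []
        current.append(msg)
    if is_final and current:
        # End of session: the unresolved remainder is forced into a last cell.
        cells.append(current)
        current = []
    return Result(cells=cells, tail=current)


async def run(batches: list[list[Msg]]) -> list[list[str]]:
    persisted: list[list[str]] = []
    tail: list[Msg] = []
    for batch in batches:
        result = await stub_detect(tail + batch)
        persisted.extend([m.id for m in cell] for cell in result.cells)
        tail = result.tail  # carry unresolved messages into the next call
    final = await stub_detect(tail, is_final=True)
    persisted.extend([m.id for m in cell] for cell in final.cells)
    return persisted


batches = [
    [Msg("m1", "# deployment"), Msg("m2", "target env?")],
    [Msg("m3", "K8s"), Msg("m4", "# lunch")],
    [Msg("m5", "pizza?")],
]
cells = asyncio.run(run(batches))
print(cells)
```

In this run the first call emits no cells, because nothing yet proves the deployment topic has ended. The cell is only closed once the second batch reveals a topic change, and the lunch messages stay in the tail until the final flush.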

Tokenizer utilities

everalgo._tokenize (in everalgo-core) provides two module-private utilities used by the boundary algorithms. They are not part of the public API:

  • count_tokens(text: str) → int — token count under OpenAI o200k_base encoding via tiktoken.
  • force_split(text: str, *, max_tokens: int) → list[str] — last-resort token-bounded chunking; no semantic awareness.
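
As a rough illustration of what token-bounded chunking without semantic awareness means, here is a minimal sketch. It is not the everalgo implementation: `sketch_count_tokens` and `sketch_force_split` are hypothetical names, and the `encode` / `decode` parameters are an assumption added so the example runs without tiktoken. A whitespace tokenizer stands in for o200k_base.

```python
from typing import Callable, Sequence


def sketch_count_tokens(text: str, *, encode: Callable[[str], Sequence]) -> int:
    """Count tokens under whatever encoding `encode` implements."""
    return len(encode(text))


def sketch_force_split(
    text: str,
    *,
    max_tokens: int,
    encode: Callable[[str], Sequence],
    decode: Callable[[Sequence], str],
) -> list[str]:
    """Cut the token sequence into fixed-size windows and decode each one.

    Cuts fall wherever the window ends, regardless of sentence or topic.
    """
    if max_tokens < 1:
        raise ValueError("max_tokens must be at least 1")
    tokens = list(encode(text))
    return [
        decode(tokens[start:start + max_tokens])
        for start in range(0, len(tokens), max_tokens)
    ]


# Dependency-free stand-in tokenizer (assumption). With tiktoken installed,
# one could instead pass enc.encode and enc.decode, where
# enc = tiktoken.get_encoding("o200k_base").
def ws_encode(text: str) -> list[str]:
    return text.split()


def ws_decode(tokens: Sequence) -> str:
    return " ".join(tokens)


text = "segment a flat list of chat messages"
chunks = sketch_force_split(text, max_tokens=3, encode=ws_encode, decode=ws_decode)
print(sketch_count_tokens(text, encode=ws_encode))
print(chunks)
```

With a byte-level encoding such as o200k_base, a window edge can land inside a multi-byte character, so decoded chunks may not always concatenate back to the exact input. That is one reason to treat this kind of split as a last resort.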

Stubs

WorkspaceMemCellExtractor (Jira / Email / Confluence) is a placeholder in v0.x — all methods raise NotImplementedError. It is deliberately excluded from everalgo.boundary.__all__; import it from everalgo.boundary.workspace directly if you need the reserved name. Implementation lands in a future minor bump when the RawData contract is finalised.

Download files

Source Distribution: everalgo_boundary-0.2.1.tar.gz (21.9 kB)
Built Distribution: everalgo_boundary-0.2.1-py3-none-any.whl (17.2 kB, Python 3)
