EverAlgo boundary: MemCell extractors (chat / workspace / agent).

everalgo-boundary

Chat boundary detection for EverAlgo — segments a flat list of ChatMessage objects into coherent MemCell slices using an LLM-based batch algorithm.

See the umbrella project: EverAlgo monorepo and the architecture document at docs/concepts/architecture.md.

Install

pip install everalgo-boundary

For the user-scenario class facade, install everalgo-user-memory instead — it re-exports BoundaryDetector which wraps this package.

What this distribution provides

Symbol             Role
-----------------  ------------------------------------------------------------
detect_boundaries  Low-level async function: (list[ChatMessage], *, llm, is_final, ...) → DetectionResult
DetectionResult    NamedTuple(cells: list[MemCell], tail: list[ChatMessage])

The class-style facades (BoundaryDetector for user-scenario chat, AgentBoundaryDetector for agent trajectories with tool calls) live in everalgo-user-memory and everalgo-agent-memory respectively.

Quick start

import asyncio
import json

from everalgo.boundary import detect_boundaries
from everalgo.llm.types import ChatResponse
from everalgo.testing.fake_llm import FakeLLMClient
from everalgo.types import ChatMessage

_BOUNDARY_JSON = json.dumps(
    {"reasoning": "single topic", "boundaries": [], "should_wait": False}
)

async def main() -> None:
    fake = FakeLLMClient(responses=[ChatResponse(content=_BOUNDARY_JSON, model="fake")])
    messages = [
        ChatMessage(id="m1", role="user", content="Let's talk about deployment.", timestamp=1_700_000_000_000, sender_id="u_alice"),
        ChatMessage(id="m2", role="assistant", content="Sure — what's the target env?", timestamp=1_700_000_001_000, sender_id="assistant"),
        ChatMessage(id="m3", role="user", content="K8s. Switching topic: lunch?", timestamp=1_700_000_002_000, sender_id="u_alice"),
    ]

    # Streaming: hold `tail` between calls; pass prior tail + new messages each time.
    result = await detect_boundaries(messages, llm=fake)
    cells, tail = result  # NamedTuple unpacking

    # End-of-session: tail is forced into the last cell.
    result = await detect_boundaries(messages, llm=fake, is_final=True)
    assert result.tail == []
    for mc in result.cells:
        print(mc.timestamp, len(mc.items))


asyncio.run(main())

The streaming state machine

detect_boundaries deliberately holds back trailing messages as tail — the LLM cannot know whether a conversation continues beyond the last seen message. The caller maintains state:

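from everalgo.boundary import detect_boundaries
from everalgo.types import ChatMessage


async def run_session(incoming_batches, client, persist) -> None:
    # `incoming_batches`, `client`, and `persist` are supplied by the caller.
    tail: list[ChatMessage] = []

    for batch in incoming_batches:
        # Prepend the held-back tail so it is re-evaluated with the new messages.
        result = await detect_boundaries(tail + batch, llm=client)
        await persist(result.cells)
        tail = result.tail

    # Session ends: flush everything.
    final = await detect_boundaries(tail, llm=client, is_final=True)
    await persist(final.cells)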
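
To make the hold-back behaviour concrete without an LLM, the sketch below replays the same caller loop against a toy detector. Everything in it is an assumption for illustration: `SketchResult`, `sketch_detect`, and the rule "a message starting with `#` opens a new cell" are hypothetical stand-ins, not the package's algorithm, and plain strings replace ChatMessage objects.

```python
import asyncio
from typing import NamedTuple


class SketchResult(NamedTuple):
    """Hypothetical stand-in for DetectionResult: closed cells plus held-back tail."""
    cells: list[list[str]]
    tail: list[str]


async def sketch_detect(messages: list[str], *, is_final: bool = False) -> SketchResult:
    """Toy rule (assumption): a message starting with '#' opens a new cell."""
    cells: list[list[str]] = []
    current: list[str] = []
    for msg in messages:
        if msg.startswith("#") and current:
            cells.append(current)  # a new topic closes the previous group
            current = []
        current.append(msg)
    if is_final and current:
        cells.append(current)  # end of session: force the open group into a cell
        current = []
    # Without is_final, the still-open group is returned as tail, not as a cell.
    return SketchResult(cells=cells, tail=current)


async def run_session(batches: list[list[str]]) -> list[list[str]]:
    persisted: list[list[str]] = []
    tail: list[str] = []
    for batch in batches:
        result = await sketch_detect(tail + batch)
        persisted.extend(result.cells)
        tail = result.tail
    final = await sketch_detect(tail, is_final=True)
    persisted.extend(final.cells)
    return persisted


batches = [["#deploy", "which env?"], ["k8s", "#lunch"], ["pizza?"]]
cells = asyncio.run(run_session(batches))
print(cells)
```

In this toy run the first batch yields no cells at all: its messages stay in the tail until the second batch shows where the topic ends, which is the reason the caller has to carry `tail` forward.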

Tokenizer utilities

everalgo._tokenize (in everalgo-core) exposes two module-private utilities used by the boundary algorithms. They are not part of the public surface:

  • count_tokens(text: str) → int — token count under OpenAI o200k_base encoding via tiktoken.
  • force_split(text: str, *, max_tokens: int) → list[str] — last-resort token-bounded chunking; no semantic awareness.
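
As a rough illustration of what token counting and token-bounded chunking involve, here is a minimal sketch. It is not the code in everalgo._tokenize: the `_sketch` function names are hypothetical, and a whitespace tokenizer stands in for o200k_base so the example runs with the standard library only.

```python
from typing import Callable, Sequence

# Stand-in tokenizer (assumption): whitespace-separated words. With tiktoken
# installed, enc = tiktoken.get_encoding("o200k_base") provides enc.encode and
# enc.decode, which could be passed in place of these two helpers.
def ws_encode(text: str) -> list:
    return text.split()


def ws_decode(tokens: Sequence) -> str:
    return " ".join(tokens)


def count_tokens_sketch(text: str, encode: Callable[[str], list] = ws_encode) -> int:
    """Number of tokens the given encoder produces for `text`."""
    return len(encode(text))


def force_split_sketch(
    text: str,
    *,
    max_tokens: int,
    encode: Callable[[str], list] = ws_encode,
    decode: Callable[[Sequence], str] = ws_decode,
) -> list[str]:
    """Cut `text` into chunks of at most `max_tokens` tokens, ignoring semantics."""
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    tokens = encode(text)
    # Fixed-size windows over the token sequence; the last chunk may be shorter.
    return [
        decode(tokens[start:start + max_tokens])
        for start in range(0, len(tokens), max_tokens)
    ]


chunks = force_split_sketch("one two three four five", max_tokens=2)
print(chunks)
```

Because the cut points depend only on token positions, a chunk can end mid-sentence, which is why this kind of splitting is a last resort.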

Stubs

WorkspaceMemCellExtractor (Jira / Email / Confluence) is a placeholder in v0.x — all methods raise NotImplementedError. It is deliberately excluded from everalgo.boundary.__all__; import it from everalgo.boundary.workspace directly if you need the reserved name. Implementation lands in a future minor bump when the RawData contract is finalised.
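
Callers that wire the placeholder in ahead of time may want to tolerate the NotImplementedError. The sketch below shows one way to do that under stated assumptions: `WorkspaceExtractorSketch` and its `extract` method are hypothetical stand-ins, since the real class's method names are not listed here.

```python
class WorkspaceExtractorSketch:
    """Hypothetical stand-in for a reserved placeholder class."""

    def extract(self, raw_items: list[dict]) -> list[dict]:
        # Mirrors the documented behaviour: every method raises in v0.x.
        raise NotImplementedError("workspace extraction is not implemented in v0.x")


def extract_or_skip(extractor, raw_items: list[dict]) -> list[dict]:
    """Return extracted cells, or an empty list while the extractor is a stub."""
    try:
        return extractor.extract(raw_items)
    except NotImplementedError:
        return []


result = extract_or_skip(WorkspaceExtractorSketch(), [{"source": "jira"}])
print(result)
```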
