A local memory + orchestration framework for token-efficient LLM agents: planner/executor split, retrieved tool descriptions, batch tool calls, Reflexion-style success-pattern mining, tactical/strategic memory over execution traces, polyglot subprocess tools, agent-as-tool composition, Anthropic prompt caching, per-run usage events, and TOON I/O.

These details have been verified by PyPI

Project links

GitHub Statistics

Maintainers

Rushour0

These details have not been verified by PyPI

Project description

fabri

A local, open-source memory + orchestration layer for a custom LLM agent. Built around two ideas:

SCOPE (arXiv:2512.15374): treat the agent's prompt context as something that evolves automatically from execution traces, via a tactical/strategic memory split with conflict resolution before promotion.
Context engineering over prompt engineering: keep retrieved context compact and just-in-time, give each tool a single clear job, and let tools be polyglot processes behind a uniform JSON contract.

Quickstart

pip install fabri                    # the `fabri` console command lands on your PATH
docker run -p 6333:6333 qdrant/qdrant   # vector store for the agent's memory
export ANTHROPIC_API_KEY=...

fabri init demo && cd demo           # scaffold a runnable starter project
fabri --config agent.yaml run "greet Ada with the hello tool"

fabri init writes an agent.yaml, an example tool under tools/agent_tools/, and a docker-compose.yml — edit those, not the library. For OpenAI models, pip install "fabri[openai]" and set llm.provider: openai.

Setup (from a checkout)

docker compose up -d                 # starts Qdrant on :6333
python3 -m venv .venv && .venv/bin/pip install -e ".[dev]"
export ANTHROPIC_API_KEY=...         # only needed for AnthropicLLMBackend

Installing puts a fabri console command on your PATH — that's the entry point everything below uses. Embeddings run fully locally via sentence-transformers/all-MiniLM-L6-v2 — no embedding API calls, no cloud dependency beyond the LLM itself.

Usage

fabri run "some task description"
fabri --config agent.yaml run "some task description"   # config-driven agent, see docs/creating-an-agent.md
fabri --verbose run "some task description"   # DEBUG logging to console too
fabri inspect-memory "a query to test retrieval"
fabri ingest-traces <session-id>

(python cli.py ... still works from a source checkout — it's a thin shim over the same fabri.cli:main the console script points at.)

See docs/creating-an-agent.md for using this as a library/framework in another project: installing it, writing an agent.yaml, adding tools, and a worked example.

run executes the agent loop, then immediately mines the resulting trace for failures and synthesizes any new guidelines. ingest-traces re-runs that synthesis step against a past session's trace on its own. run and ingest-traces both require ANTHROPIC_API_KEY (they fail fast with a clear message if it's unset, rather than a raw stack trace) — inspect-memory does not, since it only talks to Qdrant.

Every run produces two records, both keyed by session_id: a structured JSONL trace (.fabri/traces/<session_id>.jsonl, the machine-readable record used by the pipeline) and a human-readable log (.fabri/logs/<session_id>.log, always DEBUG-level regardless of --verbose — that flag only controls what's also echoed to the console) with LLM call latency/token usage, tool dispatch latency, and every promotion/dedup decision the pipeline makes.

Both land under .fabri/ in the directory you run from (override with $FABRI_HOME), so each consuming project keeps its own traces and logs rather than scattering them into wherever the package happens to be installed. Add .fabri/ to your project's .gitignore.

Each run returns an outcome: success (clean), success_with_recovery (finished, but at least one tool call failed along the way), or incomplete (hit the step limit with no final answer).

Architecture

src/fabri/        # src/ layout: repo is "fabri", package is "fabri"
  memory/        embeddings, schema, Qdrant store, compression, pruning/promotion
  orchestrator/  retrieval, trace logging, the trace -> guideline pipeline
  tools/         manifest schema, subprocess runner, example tools, agent-as-tool adapter
  core/          the ReAct agent loop, decompose, and the pluggable LLM backend
  runtime.py     build_llm/build_tools/build_tool_defs -- shared by cli and agent_runner_tool
  admin.py       admin-only config inspection + dashboard (stub auth seam, see below)
  config.py      YAML agent config loader (DEFAULT_CONFIG + load_config)
  toon.py        TOON codec: compact JSON-shaped encoding for tool results (token savings)
  paths.py       project-local .fabri/ resolution for traces + logs
  cli.py         the `fabri` console command (composition over the public API)
  __init__.py    public API: run_agent, load_config, ToolRegistry, QdrantMemoryStore, ...
cli.py           thin shim: `python cli.py ...` -> fabri.cli:main

Agents as tools

A tools.agents entry in a config exposes a different agent.yaml as a tool on this one (fabri/tools/agent_tool.py builds the manifest, agent_runner_tool.py runs the sub-agent as an ordinary subprocess tool — stdin {"task": ...}, stdout {"final_text", "outcome"}). This is composition, not a new orchestrator: each sub-agent is just another tool call in the parent's normal ReAct loop. See ludexel/.agent/game_content_agent.yaml for a worked example wiring four domain agents (character/tiles/map/story) into one top-level agent.

A tools.agents entry may carry optional model / max_tokens keys that override the sub-agent's llm.model / llm.max_tokens at spawn time, so the parent can run on Sonnet while a cheap classifier sub-agent runs on Haiku without duplicating the full agent.yaml:

tools:
  agents:
    - name: classify
      description: Classify a snippet into one of N labels.
      config: tools/agent_tools/classifier.yaml
      model: claude-haiku-4-5      # cheap override; sub-agent's own yaml keeps the default
      max_tokens: 256

Admin CLI / dashboard

cli.py admin config --config agent.yaml prints the merged config plus the resolved tool registry as JSON; cli.py admin dashboard --config agent.yaml prints a human-readable summary (agent/llm/tools/memory counts). Both are gated by require_admin() (fabri/admin.py): if FABRI_ADMIN_TOKEN is unset the gate is open (no auth backend exists yet), but once it's set, --admin-token must match it. This is a placeholder seam for real auth (SSO/API gateway/etc.), not a security boundary by itself.

Tool contract

A tool is a JSON manifest (name, description, command, schemas, timeout) next to an executable in any language. runner.py invokes command as a subprocess, writes the call's args as JSON to stdin, and parses stdout as JSON. The result is always normalized to {ok, error?, result?, stderr?} before the agent sees it — malformed output, timeouts, and tool error exits are distinct, well-defined failure modes rather than crashes. See tools/examples/ for a Python tool (echo), a Go tool (sum), and a deliberately misbehaving tool (broken) used only by the test suite.

Token efficiency (TOON)

Tool results are encoded into the model's context as TOON (Token-Oriented Object Notation, toon.py) rather than JSON by default — a compact indentation-based encoding that drops braces and collapses uniform arrays to a single header row plus data rows, typically ~30–40% fewer characters on tabular results. The framework encodes this itself (no model reliability risk); the trace/logs keep raw JSON. Pipeline: json → toon → llm inbound, and JSON outbound by default (tools.result_format, agent.output_format — see docs/creating-an-agent.md). Native tool-call arguments stay provider JSON. The codec round-trips any JSON-shaped value and fabri.toon.encode/.decode are public.

Memory lifecycle

Every agent run logs a JSONL trace (traces/<session_id>.jsonl).
orchestrator/pipeline.py scans a trace for tool-call failures and asks the LLM to compress each into a short, generalized guideline (memory/compress.py enforces a hard token cap regardless of what the LLM returns).
memory/pruning.py checks the new guideline against existing tactical entries by cosine similarity. A near-duplicate increments that entry's recurrence count instead of inserting a copy; once it has recurred across 3 distinct sessions (PROMOTION_THRESHOLD_SESSIONS), it's promoted to strategic.
orchestrator/retrieval.py embeds the next task, pulls the top-k most relevant guidelines (tactical + strategic) from Qdrant, and formats them as a compact bullet list injected into the agent's system prompt. If any available tool's name appears in the task text, a second query filters by that tool's tag (MemoryEntry.tools, populated in step 2 from whichever tool actually failed) and those hits are guaranteed inclusion — this surfaces tool-specific guidelines even when their wording is too dissimilar from the query for cosine similarity alone to rank them in the top-k. (A graph-DB-backed version of this was considered and rejected: the only capability needed was a single-hop "guidelines for this tool" lookup, which a payload filter gives you for free — a second database would have added a dependency, a migration risk, and a way for the graph and the vector store to drift out of sync, for no real capability gain.)

This is what closes the loop: a failure in session N becomes retrievable context in session N+1, without a human re-writing the prompt by hand.

Known v1 limitations (intentional, not bugs)

No embedding-model migration path. Each memory point stores a model_version payload field, but swapping the embedding model means recreating the Qdrant collection from scratch — old vectors are not re-embedded automatically.
Single-writer assumption. Concurrent distinct agent processes ingesting guidelines at the same time are not locked against each other. This is made safe-by-construction for the common case (same guideline text) via deterministic point IDs (hash of the compressed text), but two different writers racing on genuinely different new guidelines is untested.
No automatic eviction. memory/pruning.py:evict_stale exists but is not wired into a scheduled job — run it manually if the strategic store grows large with low-value entries.

Tests

.venv/bin/python -m pytest tests/ -q

Covers store round-trip/idempotency, the dedup + promotion rule, the tool runner's normalized failure contract (Python + Go + malformed output + timeout), the token-cap enforcement in compress.py, the TOON codec (round-trip + fuzz), and the fabri init scaffold.

Releasing

Tag a version (git tag v0.1.0 && git push origin v0.1.0) and GitHub Actions publishes to PyPI via Trusted Publishing — see RELEASING.md for the one-time setup.

License

Apache-2.0 © Rushikesh Patade.

Project details

These details have been verified by PyPI

Project links

GitHub Statistics

Maintainers

Rushour0

These details have not been verified by PyPI

Release history Release notifications | RSS feed

0.4.4

Jun 21, 2026

0.4.3

Jun 21, 2026

This version

0.4.2

Jun 20, 2026

0.4.1

Jun 20, 2026

0.4.0

Jun 20, 2026

0.3.0

Jun 20, 2026

0.2.1

Jun 20, 2026

0.2.0

Jun 20, 2026

0.1.0

Jun 20, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

fabri-0.4.2.tar.gz (106.9 kB view details)

Uploaded Jun 20, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

fabri-0.4.2-py3-none-any.whl (95.9 kB view details)

Uploaded Jun 20, 2026 Python 3

File details

Details for the file fabri-0.4.2.tar.gz.

File metadata

Download URL: fabri-0.4.2.tar.gz
Upload date: Jun 20, 2026
Size: 106.9 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for fabri-0.4.2.tar.gz
Algorithm	Hash digest
SHA256	`fe0a1c2cd130e040a62f305aee88647e9d6949d1db00798db6333f7a2916056e`
MD5	`004003c4e37b166cb1996d9939658cc8`
BLAKE2b-256	`46a14a591105cda04c8a2265907d6e90ae3a3a5d9babaf84eaa22dbcc2085751`

See more details on using hashes here.

Provenance

The following attestation bundles were made for fabri-0.4.2.tar.gz:

Publisher: release.yml on Rushour0/fabri

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: fabri-0.4.2.tar.gz
- Subject digest: fe0a1c2cd130e040a62f305aee88647e9d6949d1db00798db6333f7a2916056e
- Sigstore transparency entry: 1888043073
- Sigstore integration time: Jun 20, 2026
Source repository:
- Permalink: Rushour0/fabri@b6398503c56c165f9222e8ed04189901bb549f8d
- Branch / Tag: refs/tags/v0.4.2
- Owner: https://github.com/Rushour0
- Access: private
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@b6398503c56c165f9222e8ed04189901bb549f8d
- Trigger Event: push

File details

Details for the file fabri-0.4.2-py3-none-any.whl.

File metadata

Download URL: fabri-0.4.2-py3-none-any.whl
Upload date: Jun 20, 2026
Size: 95.9 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for fabri-0.4.2-py3-none-any.whl
Algorithm	Hash digest
SHA256	`bbd3f72d9fcd90f2cbded4638e566685e1f431556f6bc81e4f7a6f234d1884f9`
MD5	`10806d399dda64d841aaefe1f8b0b2b9`
BLAKE2b-256	`d5c27fcc073c9c1caba7aa08e31c33fc7975d63e39a7e7074e6ae2c01f1f4945`

See more details on using hashes here.

Provenance

The following attestation bundles were made for fabri-0.4.2-py3-none-any.whl:

Publisher: release.yml on Rushour0/fabri

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: fabri-0.4.2-py3-none-any.whl
- Subject digest: bbd3f72d9fcd90f2cbded4638e566685e1f431556f6bc81e4f7a6f234d1884f9
- Sigstore transparency entry: 1888043216
- Sigstore integration time: Jun 20, 2026
Source repository:
- Permalink: Rushour0/fabri@b6398503c56c165f9222e8ed04189901bb549f8d
- Branch / Tag: refs/tags/v0.4.2
- Owner: https://github.com/Rushour0
- Access: private
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@b6398503c56c165f9222e8ed04189901bb549f8d
- Trigger Event: push

fabri 0.4.2

Navigation

Verified details

Project links

GitHub Statistics

Maintainers

Unverified details

Meta

Classifiers

Project description

fabri

Quickstart

Setup (from a checkout)

Usage

Architecture

Agents as tools

Admin CLI / dashboard

Tool contract

Token efficiency (TOON)

Memory lifecycle

Known v1 limitations (intentional, not bugs)

Tests

Releasing

License

Project details

Verified details

Project links

GitHub Statistics

Maintainers

Unverified details

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

Provenance

File details

File metadata

File hashes

Provenance