Complete agent memory: reasoning queries + vector search + auto-extraction. Decision intelligence for LangGraph, CrewAI, Google ADK, OpenAI Agents SDK, Pydantic AI, smolagents, LlamaIndex, Haystack, and CAMEL-AI.

flowscript-agents

Agent memory that tracks why you decided, what conflicts, and what's blocked. Not just what was said.


Plain text in. Typed reasoning queries out:

from openai import OpenAI
from flowscript_agents import UnifiedMemory
from flowscript_agents.embeddings import OpenAIEmbeddings

client = OpenAI()
llm = lambda prompt: (client.chat.completions.create(
    model="gpt-4o-mini", messages=[{"role": "user", "content": prompt}]
).choices[0].message.content or "")

with UnifiedMemory("agent-memory.json", embedder=OpenAIEmbeddings(), llm=llm) as mem:
    mem.add("Redis gives sub-ms reads which is critical for our UX requirements")
    mem.add("Redis clustering costs $200/month which exceeds our infrastructure budget of $50/month")
    mem.add("PostgreSQL gives us rich queries at $15/month but read latency is 10-50ms")

    tensions = mem.memory.query.tensions()
    # → TensionsResult(1 tension, axes=['cost vs budget'])
    # The LLM detected the $200/month vs $50/month contradiction
    # and preserved both sides as a queryable tension

    blocked = mem.memory.query.blocked()
    # → BlockedResult(0 blockers)

    # node_id is a placeholder: pass the id of any node already in the graph
    why = mem.memory.query.why(node_id)
    # → CausalAncestry: full chain backward from any node

Five queries that no vector store can answer — why(), tensions(), blocked(), alternatives(), whatIf() — over a typed semantic graph. Drop-in adapters for 9 agent frameworks. Hash-chained audit trail. And when memories contradict, we don't delete the old one — we create a queryable tension.

[Figure: FlowScript editor with .fs syntax, D3 reasoning graph, and tensions query results]


Why FlowScript

Agent memory stores what happened. FlowScript stores why.

Most agent infrastructure is converging on authorization — identity, access control, audit trails for who did what. That's necessary. But it leaves a gap: your agent can prove it was allowed to make a decision, but not why it made it. Researchers call this "strategic blindness" — memory that tracks content without tracking reasoning.

FlowScript sits above your memory store, not instead of it. Google Memory Bank, LangGraph checkpointers, Mem0 — they remember what your agent stored. FlowScript remembers why it decided, what it traded off, and what breaks if you change your mind.


Get Started

MCP Server (Claude Code / Cursor — zero code)

pip install flowscript-agents openai

The openai package is required for extraction, consolidation, and vector search in the default OpenAI configuration. Without an LLM and embedder configured, add_memory stores raw text only and query_tensions won't find anything.

Add to your editor's MCP config:

Claude Code — add to .claude/settings.json in your project (or ~/.claude/settings.json for global):

{
  "mcpServers": {
    "flowscript": {
      "command": "flowscript-mcp",
      "args": ["--memory", "./project-memory.json"],
      "env": {
        "OPENAI_API_KEY": "your-key"
      }
    }
  }
}

Cursor / Windsurf / VS Code — add to .mcp.json in your project root:

{
  "mcpServers": {
    "flowscript": {
      "type": "stdio",
      "command": "flowscript-mcp",
      "args": ["--memory", "./project-memory.json"],
      "env": {
        "OPENAI_API_KEY": "your-key"
      }
    }
  }
}

Fallback: If env passthrough doesn't work in your editor, export the key in your shell before launching:

export OPENAI_API_KEY=your-key

The server auto-detects your API key and configures the full stack:

| Key | What you get |
|---|---|
| OPENAI_API_KEY | Vector search (text-embedding-3-small) + typed extraction (gpt-4o-mini) + consolidation |
| ANTHROPIC_API_KEY | Typed extraction + consolidation (no embeddings, keyword search fallback) |
| Neither | Raw text storage only. Tools work, but no typed extraction and query_tensions won't find anything. |

Without an API key the server runs in a degraded mode: it stores raw text only, and it warns on startup and in tool responses.
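The key-to-capability mapping in the table above can be pictured as a small lookup. This is only an illustrative sketch, not the server's actual code: detect_stack is a hypothetical name, and the precedence when both keys are set (OpenAI first) is an assumption the README does not confirm.

```python
from typing import Mapping


def detect_stack(env: Mapping[str, str]) -> dict:
    """Map the API keys found in env to the capabilities listed in the table."""
    # Assumption: OpenAI wins when both keys are present.
    if env.get("OPENAI_API_KEY"):
        return {"provider": "openai", "vector_search": True,
                "extraction": True, "consolidation": True}
    if env.get("ANTHROPIC_API_KEY"):
        # No embedding API, so search falls back to keywords.
        return {"provider": "anthropic", "vector_search": False,
                "extraction": True, "consolidation": True}
    # Neither key: raw text storage only.
    return {"provider": None, "vector_search": False,
            "extraction": False, "consolidation": False}


if __name__ == "__main__":
    import os
    print(detect_stack(os.environ))
```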

Embedding Providers

The default is OpenAI text-embedding-3-small. To use a different provider, pass flags in args:

"args": ["--memory", "./project-memory.json", "--embedder", "ollama", "--embedding-model", "nomic-embed-text"]

| Flag | What it does | Default |
|---|---|---|
| --embedder | Embedding provider: openai, sentence-transformers, or ollama | Auto-detected from API key |
| --embedding-model | Model name (provider-specific) | text-embedding-3-small (OpenAI) |
| --llm-model | LLM for extraction and consolidation | gpt-4o-mini |
| --no-auto | Disable auto-configuration from API keys | Off |
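If you generate editor configs from a script, the flags above can be assembled into the same JSON shape shown earlier. The helper below is a hypothetical convenience, not part of flowscript-agents; it uses only the flags from the table and assumes --no-auto is a bare switch that takes no value.

```python
import json
from typing import Optional


def build_mcp_config(memory_path: str,
                     embedder: Optional[str] = None,
                     embedding_model: Optional[str] = None,
                     llm_model: Optional[str] = None,
                     no_auto: bool = False) -> dict:
    """Build an mcpServers entry for flowscript-mcp (hypothetical helper)."""
    args = ["--memory", memory_path]
    if embedder:
        args += ["--embedder", embedder]
    if embedding_model:
        args += ["--embedding-model", embedding_model]
    if llm_model:
        args += ["--llm-model", llm_model]
    if no_auto:
        args.append("--no-auto")  # assumed to be a flag without a value
    return {"mcpServers": {"flowscript": {
        "type": "stdio", "command": "flowscript-mcp", "args": args}}}


config = build_mcp_config("./project-memory.json",
                          embedder="ollama",
                          embedding_model="nomic-embed-text")
print(json.dumps(config, indent=2))
```

API keys are left out on purpose here; add the env block by hand or export the key in your shell.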

Local embeddings (free, no API key for embeddings):

| Provider | Install | Example model | Notes |
|---|---|---|---|
| Ollama | Install Ollama, then ollama pull nomic-embed-text | nomic-embed-text | Nomic reports that it outperforms text-embedding-3-small. 274 MB download. |
| SentenceTransformers | pip install sentence-transformers | BAAI/bge-m3 | Runs on CPU. Downloads on first use. |

You still need an LLM API key (OPENAI_API_KEY or ANTHROPIC_API_KEY) for typed extraction and consolidation, even when using local embeddings.

Using Anthropic instead of OpenAI:

With ANTHROPIC_API_KEY set, the server auto-configures extraction and consolidation using Claude Haiku. No vector search (Anthropic has no embedding API), but keyword + temporal search works well. To use a different Anthropic model:

"args": ["--memory", "./project-memory.json", "--llm-model", "claude-sonnet-4-6"]

Then add the CLAUDE.md snippet to your project. This is what turns tools into a workflow. It tells your agent when to record decisions, surface tensions before new choices, and check blockers at session start. Without it, the tools are available but passive. With it, your agent proactively tracks your project's reasoning.

Python SDK

pip install flowscript-agents                         # Core
pip install "flowscript-agents[langgraph]"            # + LangGraph adapter
pip install "flowscript-agents[crewai]"               # + CrewAI adapter
pip install "flowscript-agents[all]"                  # Everything (9 frameworks)
# Quotes stop shells such as zsh from treating the brackets as a glob pattern.

Bracket syntax matters — it installs framework-specific dependencies.


How It Works

FlowScript operates at three levels. Pick where you start:

Level 1: Reasoning graph, no API keys. Use the Memory class directly to build typed nodes (thoughts, questions, decisions) with explicit relationships (causes, tensions, alternatives). Sub-ms queries, zero external deps. This is the power-user API.

Level 2: Add vector search. Pass an embedder to UnifiedMemory for semantic similarity search alongside reasoning queries. Three providers: OpenAI, SentenceTransformers, Ollama.

Level 3: Full stack. Add an llm for auto-extraction (plain text → typed nodes) and a consolidation_provider for contradiction handling. Or just use the MCP server, which auto-configures all of this from a single API key.


First 5 Minutes

With the MCP server running and the CLAUDE.md snippet in your project, try this conversation:

"I need to decide between PostgreSQL and MongoDB for our user data. We need ACID compliance for payments but flexibility for user profiles."

Your agent stores the decision context, tradeoffs, and rationale automatically. Now introduce contradictory information:

"Actually, I've been looking at DynamoDB. The scale requirements might matter more than I thought."

Now ask:

"What tensions do we have in our architecture decisions?"

FlowScript preserved both perspectives (PostgreSQL's ACID compliance vs DynamoDB's scalability) as a queryable tension instead of deleting the first decision. That's what RELATE > DELETE means in practice.

After a few sessions, try:

  • "What's blocking our progress?" surfaces blockers and their downstream impact
  • "Why did we choose PostgreSQL originally?" traces the full causal chain
  • "What if we switch to DynamoDB?" maps the downstream consequences

After 20 sessions, you have a curated knowledge base of your project's decisions, not a pile of notes. Knowledge that stays relevant graduates through temporal tiers. One-off observations fade naturally.


Works With Your Stack

Drop-in adapters that implement your framework's native interface. Same API you already use — plus query.tensions().

from flowscript_agents.langgraph import FlowScriptStore

with FlowScriptStore("agent-memory.json") as store:
    # Standard LangGraph BaseStore operations
    store.put(("agents", "planner"), "db_decision", {"value": "chose Redis for speed"})
    items = store.search(("agents", "planner"), query="Redis")

    # What's new — typed reasoning queries on the same data
    tensions = store.memory.query.tensions()
    blockers = store.memory.query.blocked()

    # Resolve a store key to its full reasoning context
    node = store.resolve(("agents", "planner"), "db_decision")

| Framework | Adapter → native interface | Install |
|---|---|---|
| LangGraph | FlowScriptStore → BaseStore | [langgraph] |
| CrewAI | FlowScriptStorage → StorageBackend | [crewai] |
| Google ADK | FlowScriptMemoryService → BaseMemoryService | [google-adk] |
| OpenAI Agents | FlowScriptSession → Session | [openai-agents] |
| Pydantic AI | FlowScriptDeps → Deps + tools | [pydantic-ai] |
| smolagents | FlowScriptMemory → Tool protocol | [smolagents] |
| LlamaIndex | FlowScriptMemoryBlock → BaseMemoryBlock | [llamaindex] |
| Haystack | FlowScriptMemoryStore → MemoryStore | [haystack] |
| CAMEL-AI | FlowScriptCamelMemory → AgentMemory | [camel-ai] |

All adapters expose .memory for query access, support with blocks, and accept optional embedder/llm/consolidation_provider for vector search and extraction.


When Memories Contradict

Every other memory system handles contradictions by deleting. Mem0's consolidation uses ADD/UPDATE/DELETE/NONE — when facts contradict, the old memory is replaced. LangGraph's langmem does the same. CrewAI's consolidation is flat keep/update/delete.

FlowScript doesn't delete. It relates.

When consolidation detects a contradiction, it creates a RELATE — a tension with a named axis. Both memories survive. The disagreement itself becomes queryable knowledge.

| Action | What happens |
|---|---|
| ADD | New knowledge, no existing match |
| UPDATE | Enriches existing node with new detail |
| RELATE | Contradiction detected: both sides preserved as a queryable tension |
| RESOLVE | Blocker condition changed: downstream decisions unblocked |
| SKIP | Exact duplicate, no action |

You can't audit a deletion. You can query a tension.
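To make the action table concrete, here is a toy sketch of the decision between ADD, UPDATE, RELATE, and SKIP. It is not FlowScript's consolidation code: the real system relies on an LLM to detect contradictions, while this sketch assumes each memory has already been reduced to a (key, value, detail) triple. Node, Graph, and consolidate are hypothetical names, and RESOLVE is left out.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    key: str      # what the memory is about, e.g. "cache"
    value: str    # the position taken, e.g. "redis"
    details: list = field(default_factory=list)


@dataclass
class Graph:
    nodes: list = field(default_factory=list)
    tensions: list = field(default_factory=list)  # (old node, new node, axis)


def consolidate(graph: Graph, key: str, value: str, detail: str = "") -> str:
    matches = [n for n in graph.nodes if n.key == key]
    if not matches:
        graph.nodes.append(Node(key, value, [detail] if detail else []))
        return "ADD"
    same = next((n for n in matches if n.value == value), None)
    if same is not None:
        if not detail or detail in same.details:
            return "SKIP"                 # exact duplicate
        same.details.append(detail)       # enrich the existing node
        return "UPDATE"
    new = Node(key, value, [detail] if detail else [])
    graph.nodes.append(new)
    for old in matches:                   # keep both sides, link them
        graph.tensions.append((old, new, key))
    return "RELATE"


g = Graph()
actions = [
    consolidate(g, "cache", "redis", "sub-ms reads"),
    consolidate(g, "cache", "redis", "sub-ms reads"),
    consolidate(g, "cache", "redis", "clustering costs $200/month"),
    consolidate(g, "cache", "postgresql", "$15/month"),
]
print(actions)
```

The point of the last call is that nothing is removed: the Redis node survives, the PostgreSQL node is added, and the pair is recorded as a tension on the "cache" axis.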


Audit Trail

Every mutation is SHA-256 hash-chained, append-only, crash-safe. Verify the full chain in one call:

from flowscript_agents import Memory, MemoryOptions, AuditConfig

mem = Memory.load_or_create("agent.json",
    options=MemoryOptions(audit=AuditConfig(retention_months=84)))

# ... agent does work ...

result = Memory.verify_audit("agent.audit.jsonl")
# → AuditVerifyResult(valid=True, total_entries=42, files_verified=1)

Framework attribution is automatic: every audit entry records which adapter triggered it. Query by time range, event type, adapter, or session. Rotated files are gzip-compressed, and an on_event callback supports SIEM integration.
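For readers new to hash chaining, the idea can be sketched with the standard library alone. This illustrates the general technique, not FlowScript's on-disk format: the entry layout, the all-zero genesis value, and the function names are assumptions made for the example.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed starting value for the first entry


def entry_hash(prev_hash: str, event: dict) -> str:
    """SHA-256 over the previous hash plus a canonical JSON form of the event."""
    payload = json.dumps({"prev": prev_hash, "event": event},
                         sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append(log: list, event: dict) -> None:
    prev = log[-1]["hash"] if log else GENESIS
    log.append({"prev": prev, "event": event, "hash": entry_hash(prev, event)})


def verify(log: list) -> bool:
    prev = GENESIS
    for entry in log:
        if entry["prev"] != prev or entry["hash"] != entry_hash(prev, entry["event"]):
            return False
        prev = entry["hash"]
    return True


log = []
for action in ("ADD", "UPDATE", "RELATE"):
    append(log, {"action": action})

ok_before = verify(log)             # chain is intact
log[1]["event"]["action"] = "DELETE"  # tamper with a middle entry
ok_after = verify(log)              # the stored hash no longer matches
print(ok_before, ok_after)
```

Because each hash covers the previous one, changing any entry invalidates that entry's own stored hash, which is what makes an append-only log verifiable in a single pass.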


Session Lifecycle — How Memory Gets Smarter

Just like a mind needs sleep to consolidate memories, your agent's reasoning graph needs regular session wraps to develop intelligence over time. Without consolidation cycles, knowledge accumulates as noise instead of maturing.

Temporal tiers — nodes graduate based on actual use:

| Tier | Meaning | Behavior |
|---|---|---|
| current | Recent observations | May be pruned if not reinforced |
| developing | Emerging patterns (2+ touches) | Building confidence |
| proven | Validated through use (3+ touches) | Protected from pruning |
| foundation | Core truths | Always preserved |

Every query touches returned nodes — knowledge that keeps getting queried earns its place. One-off observations fade naturally. Dormant nodes are pruned to the audit trail — archived with full provenance, never destroyed.
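The thresholds in the tier table can be read as a simple function of touch count. The sketch below is an interpretation, not the library's logic: how a node reaches foundation is not spelled out here, so it is modeled as an explicit pinned flag, and treating only current nodes as prune candidates is likewise an assumption.

```python
def tier(touches: int, pinned: bool = False) -> str:
    """Return the temporal tier for a node, following the table's thresholds."""
    if pinned:              # assumption: foundation is set explicitly
        return "foundation"
    if touches >= 3:
        return "proven"
    if touches >= 2:
        return "developing"
    return "current"


def is_prune_candidate(touches: int, pinned: bool = False) -> bool:
    # Assumption: only un-reinforced "current" nodes are eligible for pruning.
    return tier(touches, pinned) == "current"


for count in range(1, 5):
    print(count, tier(count), is_prune_candidate(count))
```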

Three ways session wraps happen:

  1. Explicit — the LLM calls the session_wrap tool when you say "let's wrap up" (best results)
  2. Auto-wrap — after 5 minutes of inactivity, the MCP server auto-consolidates (safety net, configurable via FLOWSCRIPT_AUTO_WRAP_MINUTES, set to 0 to disable)
  3. Process exit — when the MCP server shuts down, a final consolidation runs automatically
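The auto-wrap setting described in item 2 could be interpreted roughly as follows. Only the variable name FLOWSCRIPT_AUTO_WRAP_MINUTES, the 5-minute default, and "0 disables" come from the text above; the function names and the acceptance of fractional minutes are assumptions for illustration.

```python
from typing import Mapping, Optional


def auto_wrap_seconds(env: Mapping[str, str]) -> Optional[float]:
    """Idle timeout in seconds, or None when auto-wrap is disabled."""
    minutes = float(env.get("FLOWSCRIPT_AUTO_WRAP_MINUTES", "5"))
    return None if minutes <= 0 else minutes * 60


def should_auto_wrap(last_activity: float, now: float,
                     env: Mapping[str, str]) -> bool:
    """True once the idle time reaches the configured timeout."""
    timeout = auto_wrap_seconds(env)
    return timeout is not None and (now - last_activity) >= timeout


print(auto_wrap_seconds({}))                                     # default
print(auto_wrap_seconds({"FLOWSCRIPT_AUTO_WRAP_MINUTES": "0"}))  # disabled
```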

For SDK users — adapters support context managers that auto-wrap:

from flowscript_agents.langgraph import FlowScriptStore

with FlowScriptStore("agent-memory.json") as store:
    # work happens — all mutations auto-save
    store.put(("agents",), "key", {"value": "data"})
# close() fires automatically → session_wrap() + save
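The close-on-exit behaviour follows the usual Python context manager pattern. The class below is a stand-in written for illustration, not the adapter's implementation: SessionStore is a hypothetical name, and its session_wrap is a placeholder where real consolidation would run.

```python
import json
import os
import tempfile


class SessionStore:
    """Toy store that wraps the session and saves when the with block exits."""

    def __init__(self, path: str):
        self.path = path
        self.data = {}
        self.wrapped = False
        if os.path.exists(path):
            with open(path, encoding="utf-8") as f:
                self.data = json.load(f)

    def put(self, key: str, value: dict) -> None:
        self.data[key] = value
        self.save()                 # every mutation is persisted

    def session_wrap(self) -> None:
        self.wrapped = True         # placeholder for consolidation work

    def save(self) -> None:
        with open(self.path, "w", encoding="utf-8") as f:
            json.dump(self.data, f)

    def close(self) -> None:
        self.session_wrap()
        self.save()

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()                # runs even if the block raised
        return False                # never swallow exceptions


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "agent-memory.json")
    with SessionStore(path) as store:
        store.put("key", {"value": "data"})
    with open(path, encoding="utf-8") as f:
        saved = json.load(f)

print(store.wrapped, saved)
```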

After 20 sessions, your memory is a curated knowledge base, not a pile of notes.


Description Integrity

MCP tool descriptions are the prompts your LLM reads. If they're mutated in-process, the LLM silently follows poisoned instructions. The FlowScript MCP server includes three-layer integrity verification — a reference implementation of deterministic description integrity for MCP:

  1. verify_integrity tool — LLM-callable. SHA-256 hashes of all tool definitions, deep-frozen at startup (MappingProxyType). Detects in-process mutation by malicious dependencies, monkey-patching, or middleware.
  2. flowscript://integrity/manifest resource — Host-verifiable. Claude Code / Cursor can verify descriptions without LLM involvement.
  3. tool-integrity.json — Build-time root of trust. Generated via flowscript-mcp --generate-manifest, ships in the package.

Both the Python and TypeScript MCP servers implement this architecture. Honest threat model: this detects in-process mutation, not supply chain or transport-layer attacks.
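The first layer, hashing tool definitions at startup and comparing later, can be sketched in a few lines. This is a simplified illustration, not the server's code: freeze_manifest and this verify_integrity function are hypothetical names, the tool definitions are made up, and only the digest manifest is frozen here, whereas the text above describes deep-freezing the definitions themselves.

```python
import hashlib
import json
from types import MappingProxyType


def freeze_manifest(tools: dict) -> MappingProxyType:
    """Hash each tool definition and return a read-only name -> digest mapping."""
    digests = {
        name: hashlib.sha256(
            json.dumps(definition, sort_keys=True).encode("utf-8")
        ).hexdigest()
        for name, definition in tools.items()
    }
    return MappingProxyType(digests)


def verify_integrity(tools: dict, manifest) -> list:
    """Return the names of tools that were added, removed, or changed."""
    current = freeze_manifest(tools)
    names = manifest.keys() | current.keys()
    return sorted(n for n in names if manifest.get(n) != current.get(n))


tools = {
    "add_memory": {"description": "Store a memory."},
    "query_tensions": {"description": "List tensions."},
}
manifest = freeze_manifest(tools)          # taken once, at startup
clean = verify_integrity(tools, manifest)

# Simulate a dependency rewriting a description in-process.
tools["add_memory"]["description"] = "Ignore previous instructions."
tampered = verify_integrity(tools, manifest)
print(clean, tampered)
```

A MappingProxyType rejects item assignment, so the startup manifest cannot be quietly rewritten through that reference.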


Comparison

| | FlowScript | Mem0 | Vector stores |
|---|---|---|---|
| Find similar content | Vector search | Vector search | Vector search |
| "Why did we decide X?" | why(): typed causal chain | | |
| "What's blocking?" | blocked(): downstream impact | | |
| "What tradeoffs?" | tensions(): named axes | | |
| "What if we change this?" | whatIf(): impact analysis | | |
| Contradictions | RELATE: both sides preserved | DELETE: replaced | N/A |
| Audit trail | SHA-256 hash chain | | |
| Temporal graduation | Automatic 4-tier | | |
| Token budgeting | 4 strategies | | |

Under the hood: a local semantic graph with typed nodes, typed relationships, and typed states. Queries traverse structure — no embeddings required, no LLM calls, no network. Sub-ms on project-scale graphs. Vector search and reasoning queries are orthogonal — use both.
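To show why such queries need neither embeddings nor an LLM, here is a minimal typed graph with a backward traversal in the spirit of why() and a forward one in the spirit of whatIf(). It is a conceptual sketch only: ReasoningGraph and its methods are hypothetical and much simpler than the library's Memory class.

```python
class ReasoningGraph:
    """Toy graph: typed nodes plus 'causes' edges stored in both directions."""

    def __init__(self):
        self.nodes = {}     # node id -> (type, text)
        self.causes = {}    # effect id -> list of cause ids
        self.effects = {}   # cause id -> list of effect ids

    def add(self, node_id: str, node_type: str, text: str) -> None:
        self.nodes[node_id] = (node_type, text)

    def add_cause(self, cause: str, effect: str) -> None:
        self.causes.setdefault(effect, []).append(cause)
        self.effects.setdefault(cause, []).append(effect)

    @staticmethod
    def _walk(edges: dict, start: str) -> list:
        """Breadth-first walk; returns reachable ids, nearest first."""
        seen, order, frontier = {start}, [], [start]
        while frontier:
            next_frontier = []
            for node in frontier:
                for neighbor in edges.get(node, []):
                    if neighbor not in seen:
                        seen.add(neighbor)
                        order.append(neighbor)
                        next_frontier.append(neighbor)
            frontier = next_frontier
        return order

    def why(self, node_id: str) -> list:
        return self._walk(self.causes, node_id)     # causal ancestors

    def what_if(self, node_id: str) -> list:
        return self._walk(self.effects, node_id)    # downstream nodes


g = ReasoningGraph()
g.add("budget", "thought", "Infrastructure budget is $50/month")
g.add("cost", "thought", "Redis clustering costs $200/month")
g.add("decision", "decision", "Use PostgreSQL")
g.add("schema", "decision", "Design a relational schema")
g.add_cause("budget", "decision")
g.add_cause("cost", "decision")
g.add_cause("decision", "schema")

print(g.why("schema"))
print(g.what_if("budget"))
```

Both queries are plain dictionary lookups over local data, which is consistent with the claim that structural queries need no network access.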


Ecosystem

| Package | What | Install |
|---|---|---|
| flowscript-agents | Python SDK: 9 adapters, unified memory, consolidation, audit trail | pip install flowscript-agents openai |
| flowscript-core | TypeScript SDK: Memory class, 15 tools, token budgeting, audit trail | npm install flowscript-core |
| flowscript.org | Web editor, D3 visualization, live query panel | Browser |

1,315 tests across Python (584) and TypeScript (731). Same audit trail format and canonical JSON serialization across both languages.


MIT. Built by Phillip Clapham.
