Skip to main content

Shared reasoning memory layer for multi-agent systems

Project description

hivememory

Shared reasoning memory for multi-agent systems.

When multiple AI agents research the same problem independently, they waste tokens re-deriving the same knowledge and produce contradictory conclusions no one catches. hivememory gives agents a shared memory layer where they store structured reasoning artifacts, reuse each other's work, and surface contradictions automatically.

Project page (coming soon)

Results

Benchmark: 3 agents research "Competitive Landscape of AI Code Editors in 2026" using gpt-4o-mini, with and without shared memory. Each agent researches 3 sub-topics. In the shared configuration, agents query hivememory before each LLM call — when prior findings exist, the agent receives a focused prompt that avoids redundant research.

Metric Baseline (no shared memory) hivememory
Total tokens consumed 11,896 9,810 (-17.5%)
Memory-augmented queries 0 / 9 5 / 9
Output quality (LLM-as-judge, avg 3 runs) 8.8 9.0
Contradiction-free score 9.0 9.3
Reuse rate 0% 56%
Wall clock time 113.5s 101.9s

Token savings come from agents 2 and 3 receiving memory context that produces shorter, non-redundant LLM responses. Quality is equal or slightly better because memory-augmented agents build on verified findings rather than re-deriving from scratch.

Token consumption per agent Agents 2 and 3 use fewer tokens when prior findings are available in memory.

Total token consumption

Quality comparison LLM-as-judge scores across 4 dimensions, averaged over 3 evaluation runs.

Architecture

  agent-1 ──┐                          ┌── conflict detection
  agent-2 ──┼── hivememory API ────────┼── embedding search (FAISS)
  agent-3 ──┘    write / query /       └── provenance DAG
                 resolve / export
                       │
                 ┌─────┴─────┐
                 │  sqlite   │
                 │  + FAISS  │
                 │   index   │
                 └───────────┘

Artifact reuse flow How artifacts flow between agents. Agent 1 writes findings; agents 2 and 3 query memory, reuse relevant work, and focus on gaps.

Provenance DAG Dependency graph of artifacts. Colors indicate source agent. Edges show "built on" relationships.

Quickstart

pip install hivememory
from hivememory import HiveMemory, Evidence

hive = HiveMemory()

# store a finding
art = hive.write(
    claim="Voice AI market projected to reach $50B by 2028",
    evidence=[Evidence(source="industry report", content="35% CAGR", reliability=0.9)],
    confidence=0.85,
    agent_id="researcher-1",
)

# query shared memory before doing new research
existing = hive.query("voice AI market size", top_k=3)

# check for contradictions
open_conflicts = hive.get_conflicts()

# resolve
if open_conflicts:
    hive.resolve_conflict(open_conflicts[0].id, winner_id=art.id,
                          reason="stronger evidence", resolved_by="supervisor")

How it works

Reasoning artifacts

Agents store structured claims with evidence, confidence scores, and provenance links — not raw text. Each artifact records who produced it, what evidence supports it, and which prior artifacts it builds on. This structure makes artifacts queryable, comparable, and auditable.

Conflict detection

When a new artifact is stored, hivememory computes its embedding and searches FAISS for similar existing claims. If two artifacts are semantically close but have divergent confidence scores, a conflict is flagged. This first stage can be followed by an LLM contradiction check (OpenAI or Anthropic) for higher-precision detection.

Provenance tracking

Every artifact records its dependencies as a list of artifact IDs, forming a directed acyclic graph. This DAG answers "which agent's work did this conclusion build on?" and enables cascading invalidation — if an upstream artifact is superseded, downstream consumers can be notified.

Repo structure

hivememory/
  __init__.py          # public API exports
  artifact.py          # ReasoningArtifact, Evidence, Conflict dataclasses
  core.py              # HiveMemory main class (FAISS + sqlite)
  store.py             # low-level persistence layer
  conflicts.py         # ConflictDetector with LLM client support
  provenance.py        # ProvenanceTracker DAG
  wiki.py              # WikiExporter — markdown knowledge base export
examples/
  basic_usage.py       # store, query, conflict detect, resolve, export
  research_task.py     # 3-agent research demo with full pipeline
benchmarks/
  real_benchmark.py    # real LLM benchmark (gpt-4o-mini)
  generate_charts.py   # generate all charts from results.json
  results.json         # raw benchmark data
  results_summary.md   # human-readable summary
tests/
  test_artifact.py     # artifact serialization and ID generation
  test_store.py        # persistence layer tests
  test_conflicts.py    # conflict detection tests
  test_provenance.py   # provenance DAG tests

Examples

  • python examples/basic_usage.py — store artifacts, query memory, detect and resolve conflicts, export a wiki. Good first run to verify installation.
  • python examples/research_task.py — three agents research AI code editors, sharing findings through hivememory. Shows artifact reuse, conflict detection, provenance tracking, and wiki export end-to-end.

Token breakdown Where tokens go: baseline is all original research. hivememory splits tokens between original research, focused (memory-augmented) queries, and extraction.

Setup

  • Python 3.10+
  • pip install hivememory
  • Set OPENAI_API_KEY for LLM-based conflict detection (optional -- embedding-based detection works without it)
  • Run python examples/basic_usage.py to verify

Related work

  • Yu et al., "Multi-Agent Memory from a Computer Architecture Perspective: Visions and Challenges Ahead," Architecture 2.0 Workshop (UCSD/CMU), March 2026. Frames multi-agent memory as a systems problem and proposes structured memory hierarchies over flat context passing.
  • Karpathy, "LLM Knowledge Bases" (blog post, 2025). Demonstrates single-agent knowledge accumulation with structured retrieval. hivememory extends this pattern to multi-agent systems, adding conflict detection and provenance tracking across agents.

Single-agent knowledge bases work. hivememory makes them multi-agent.


MIT License

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

hivememory-0.1.0.tar.gz (20.5 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

hivememory-0.1.0-py3-none-any.whl (16.0 kB view details)

Uploaded Python 3

File details

Details for the file hivememory-0.1.0.tar.gz.

File metadata

  • Download URL: hivememory-0.1.0.tar.gz
  • Upload date:
  • Size: 20.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for hivememory-0.1.0.tar.gz
Algorithm Hash digest
SHA256 bd5dd158b53993f202e0798d797a159d18a714b5f85bcf832016b58e54435f51
MD5 0b1bef950895e3758b35e8e5dd6035d3
BLAKE2b-256 c41cd78dc67d5532fc42f059c4f6cf6c9a72c4641ac014cb746c233a8a87e3ba

See more details on using hashes here.

Provenance

The following attestation bundles were made for hivememory-0.1.0.tar.gz:

Publisher: publish.yml on ridxm/hivememory

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file hivememory-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: hivememory-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 16.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for hivememory-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 103a68214691dab33a5aeb7e79b25a506959c7e8a991f21682df44af2a1fb4a9
MD5 f51af4cb2ffe29fe20a65f2dcb6bf21a
BLAKE2b-256 0060f9b2692792c9e89b768f8f2f9dd72abeb942c30dd24ef5ec0ff2cf570712

See more details on using hashes here.

Provenance

The following attestation bundles were made for hivememory-0.1.0-py3-none-any.whl:

Publisher: publish.yml on ridxm/hivememory

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page