Persistent, cross-agent memory for AI coding agents — a write-guarded, hybrid-retrieval decision store.

These details have not been verified by PyPI

Project links

Project description

🐝 Hive Mind

Persistent, cross-agent memory for AI coding agents.

Every time you start a fresh AI coding session, the agent forgets everything. You re-explain "we chose Postgres over Mongo because of multi-document transactions," re-describe the architecture, re-justify decisions that were settled weeks ago. Hive Mind fixes that: it's a local, file-based memory store that any agent reads before it works and writes to after — so decisions, dead ends, and context survive across sessions and across different agents.

from hive import read_memory, write_memory

# Before working: inherit everything the last agent knew
ctx = read_memory(project="my-api", query="add rate limiting to the public API")

# After deciding: leave it for the next agent
write_memory("decision", "my-api", {
    "what": "Token-bucket rate limiting at the gateway, 100 req/s per key",
    "why":  "Sliding-window was 3x the Redis ops; token-bucket is good enough",
})

Why it's different

Most "memory" tools are a dump of embeddings you hope are relevant. Hive Mind is opinionated about quality and trust:

A write guard nothing bypasses. Every write passes 6 rules (required fields, vagueness, exact/fuzzy duplicates, contradictions, missing rationale) before it touches the store. Bad writes go to a review queue, not the bin — so even rejects are signal.
Hybrid retrieval that actually wins. TF-IDF keyword precision fused with dense semantic recall (RRF). Beats pure keyword search on every metric and holds across 54× corpus growth.
Memory that ages honestly. Confidence decays on a half-life; stale decisions fall out of the working set; re-affirming one resets its clock. Computed at read time — stored data is never silently mutated.
Provenance, not just answers. Record what you ruled out and link it to what you chose. Later: "what did we consider before this?"
It captures itself. A git post-commit hook extracts decisions straight from commit messages — gated by a quality floor, never bypassing the guard.
Local-first, zero setup. One SQLite file that survives git clone. No server, no vector DB, no cloud. Pure-stdlib core; the semantic layer is optional.

Install

Pick whichever fits your stack — all three install the same hive CLI and the importable hive Python package:

# Python (recommended)
pipx install hive-mind          # isolated, or:
pip install hive-mind

# npm (thin launcher around the Python package — needs Python 3.10+)
npm install -g hive-mind

# curl
curl -fsSL https://raw.githubusercontent.com/TejesMunde/hive-mind/main/install.sh | sh
# Windows PowerShell:
#   irm https://raw.githubusercontent.com/TejesMunde/hive-mind/main/install.ps1 | iex

The npm and curl installers bootstrap the Python package, so Python 3.10+ must be on the machine. (Zero-Python standalone binaries are a planned follow-up.)

The semantic (dense) retrieval layer is optional. Install it as an extra; without it the reader degrades silently to TF-IDF:

pip install "hive-mind[dense]"      # adds numpy + fastembed

Verify, then use it:

hive --version
hive --help

from hive import init_db
init_db()   # idempotent: creates tables + runs migrations

From source

git clone https://github.com/TejesMunde/hive-mind.git
cd hive-mind
pip install -e ".[dense]"

Quick start

from hive import init_db, read_memory, write_memory
from hive.core.writer import close_task

init_db()
project = "my-api"

# Record a decision (passes the write guard)
write_memory("decision", project, {
    "what":  "Chose PostgreSQL for the primary OLTP store",
    "why":   "ACID guarantees and JSONB fit the workload better than Mongo",
    "agent": "claude-code",
})

# Track open work
write_memory("open_task", project, {"description": "Wire up connection pooling"})

# Later (or in another agent): retrieve ranked context
ctx = read_memory(project, query="what database are we using and why")
for d in ctx["warm"]["decisions"]:
    print(d["score"], d["what"])
# ctx["hot"]  → open tasks + latest snapshot
# ctx["warm"] → decisions ranked against the query

Record what you ruled out (provenance)

from hive import write_memory, get_provenance

dec = write_memory("decision", project, {
    "what": "Migrated the queue from RabbitMQ to Kafka",
    "why":  "Need partition ordering and replay for billing events"})

write_memory("dead_end", project, {
    "what_tried":         "Evaluated RabbitMQ for the event backbone",
    "why_failed":         "No native replay; ordering guarantees were per-queue only",
    "chosen_decision_id": dec["id"]})

prov = get_provenance(dec["id"])   # {decision, dead_ends[], supersedes}

Let confidence age, re-affirm what's still true

from hive import reinforce_decision, sweep_archive
reinforce_decision(decision_id)     # +confidence, resets the decay clock, un-archives
sweep_archive(project)              # cold-archive decisions whose decayed conf < 0.25

Hand off to the next agent / route work

from hive import create_handoff, route_task

packet = create_handoff(project, from_agent="claude", to_agent="next")
# packet["state"] = open tasks + snapshot + top decisions
# packet["delta"] = what changed since the previous handoff

ranked = route_task(project, "add OAuth to the public API")
# -> [{agent, score, evidence:[...]}]  (advisory only — never auto-assigns)

Auto-capture decisions from git commits

python -m hive.cli.hook install          # idempotent post-commit hook (per repo)
python -m hive.cli.capture <sha>         # what the hook runs: extract → guard → write
python -m hive.cli.capture stats         # decisions at conf 1.0, by source, skip reasons
python -m hive.cli.capture calibrate 50  # LOG-ONLY pre-filter pass-rate + verdict
python -m hive.cli.hook uninstall        # removes only Hive's hook block

Only commits carrying decision language (chose … over, switched to, because, …) clear the floor; survivors go through the full guard at reduced confidence (0.6), tagged source='git-hook'. Sub-threshold commits are dropped and audited, never staged.

How retrieval works

query → normalize (case-fold, stopwords, stem, synonym-expand)
      → TF-IDF overlap score (smoothed IDF, headline + recency + confidence boosts)
      → hybrid rerank: pin the top TF-IDF hit, let dense embeddings reorder the head
      → pack into a token budget (hot 500 / warm 2500)

TF-IDF — keyword precision, set-overlap on smoothed IDF.
Dense — BAAI/bge-small-en-v1.5 (384-dim, 33 MB, ONNX via fastembed, no torch).
Hybrid (RRF) — fuses the two; pins the keyword #1 (confidence-gated) and lets the embeddings reorder the rest of the head. The keyword anchor is what keeps it robust at scale where dense-alone drifts to semantically-adjacent-but-wrong docs.

Benchmark

Measured against a labeled query/decision eval set (tests/eval_corpus.json):

method	Recall@1	Recall@3	MRR
TF-IDF	74.0%	83.3%	0.803
dense	60.4%	80.2%	0.721
hybrid (default)	79.2%	91.7%	0.856

Hybrid beats TF-IDF on every metric and every category, and stays flat across 54× corpus growth (a cross-encoder reranker was evaluated and rejected — worse and ~250× slower). Run it yourself:

PYTHONIOENCODING=utf-8 python tests/bench_recall.py   # tfidf vs dense vs hybrid
PYTHONIOENCODING=utf-8 python tests/bench_scale.py    # recall + latency vs corpus size

The write guard

Every write — human, agent, or git-hook — passes through hive/core/guard.py before commit. Order matters:

Required fields present and non-empty
Not vague (decisions/tasks need ≥ 5 words in the main field)
Not an exact duplicate
Not a contradiction of an existing decision (opposition markers, swapped sides)
Not a fuzzy duplicate (Jaccard token overlap ≥ 0.45)
Has a why

A flagged write isn't dropped — it goes to a staging queue for human review, or is auto-rejected only if the system learned that category is reliably wrong for this project. Review staged records:

hive staging list            # pending review
hive staging accept <id>     # promote to the store
hive staging tune            # learn auto-reject policies from history
hive audit tail              # append-only event log

Command-line interface

Once installed, everything is under one hive command:

hive recall   <project> "<query>"          # retrieve ranked context (JSON)
hive remember <project> "<what>" "<why>"   # record a decision (through the guard)
hive capture  <sha> | stats | calibrate    # extract decisions from git commits
hive hook     install | uninstall | status # post-commit capture hook
hive staging  list | accept | reject | …   # review guard-flagged writes
hive audit    tail | counts | fails        # append-only event log
hive init                                  # inject Hive usage block into agent configs
hive --version | --help

Each subcommand is also runnable directly as python -m hive.cli.<name> if you prefer not to install the console script.

Storage

A single SQLite file (hive.db, override with HIVE_DB_PATH). 10 tables:

Table	Role
`decisions`	committed long-term decisions (warm tier) + supersession, archive, source
`snapshots`	latest project structure (hot tier)
`open_tasks`	live work items (hot tier)
`dead_ends`	rejected approaches, linked to the decision that replaced them
`staging`	writes the guard flagged for review
`staging_history`	reviewer outcomes — feeds the auto-tune learner
`guard_policy`	per-project, per-category action (`stage` / `auto_reject`)
`audit_log`	append-only event stream (every write + every query)
`decision_embeddings`	cached float32 embeddings per decision
`handoffs`	persisted agent handoff packets (state + delta)

Project layout

hive/
  __init__.py        public API (read_memory, write_memory, get_provenance, …)
  db/setup.py        SQLite init + idempotent migrations
  core/
    guard.py         6 write-guard rules (never bypassed)
    writer.py        write_memory, close_task, reinforce/archive, staging promote
    reader.py        read_memory (hot + warm tiers), get_provenance
    normalize.py     tokeniser: stopwords, stemmer, synonym map
    dense.py         dense cosine + RRF hybrid fusion
    embedder.py      fastembed wrapper (bge-small-en-v1.5)
    decay.py         confidence decay + archive constants
    handoff.py       agent handoff packets (state + delta)
    routing.py       expertise routing (decay-aware, advisory)
    extract.py       pure commit → decision extractor (the quality floor)
    policy.py        per-project guard policy + auto-tune learner
    audit.py         append-only event log
  cli/
    staging.py  audit.py  init.py  capture.py  hook.py
tests/
  test_day1.py … test_day11.py   per-feature end-to-end tests
  bench_recall.py  bench_scale.py  bench_rerank.py  eval_corpus.json

Running the tests

# End-to-end feature tests (all must pass before any commit)
PYTHONIOENCODING=utf-8 python tests/test_day1.py    # … through test_day11.py

# Retrieval benchmarks (must not regress below the table above)
PYTHONIOENCODING=utf-8 python tests/bench_recall.py
PYTHONIOENCODING=utf-8 python tests/bench_scale.py

Roadmap

Phase 1 — Core memory, write guard, staging, audit, auto-tune. ✅
Phase 2 — Semantic embeddings + hybrid RRF retrieval. ✅
Phase 3 — Dead ends, decision provenance, idempotent agent global config. ✅
Phase 4 — Confidence decay, cold archive, contradiction detection v2. ✅
Phase 5 — Agent handoff packets, decay-aware expertise routing. ✅
Phase 6 — Git-commit decision extraction (quality floor + post-commit hook). ✅
Phase 7 — Distribution: PyPI + npm + curl install, unified hive CLI, MIT license. ✅
Later — zero-Python standalone binaries (PyInstaller + GitHub Releases), file watcher / daemon, vectorized TF-IDF for large corpora.

Design principles

Never bypass the write guard — one corrupt record poisons every future retrieval.
Reads are side-effect free — decay and ranking never mutate stored data, so the benchmark stays honest.
The TF-IDF fallback must always work — the dense path is strictly optional.
Staging over deletion — deleted bad data gives no signal; every staged record is feedback for the learner.
Local-first — no vector DB until records exceed the benchmarked crossover point.

License

MIT © TejesMunde.

Project details

These details have not been verified by PyPI

Project links

Release history Release notifications | RSS feed

0.1.2

Jul 4, 2026

0.1.1

Jul 4, 2026

This version

0.1.0

Jul 4, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

hive_ai-0.1.0.tar.gz (44.5 kB view details)

Uploaded Jul 4, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

hive_ai-0.1.0-py3-none-any.whl (54.1 kB view details)

Uploaded Jul 4, 2026 Python 3

File details

Details for the file hive_ai-0.1.0.tar.gz.

File metadata

Download URL: hive_ai-0.1.0.tar.gz
Upload date: Jul 4, 2026
Size: 44.5 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.9

File hashes

Hashes for hive_ai-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`895c3235280a0b7a577fc2c8c348ec21f073b291a573403dfc70c94a5f6199aa`
MD5	`39764002cbc0027ab8c44693f3cf3968`
BLAKE2b-256	`622134e947f3e6dc6bf358eafa67e4893e2cf82c36689a07f4acfb91566618b8`

See more details on using hashes here.

File details

Details for the file hive_ai-0.1.0-py3-none-any.whl.

File metadata

Download URL: hive_ai-0.1.0-py3-none-any.whl
Upload date: Jul 4, 2026
Size: 54.1 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.9

File hashes

Hashes for hive_ai-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`1e584f607f4ebffaed0cf64707ef3adf3eda76a57107ce285111d7b0ddbe5ff2`
MD5	`d1929db9928cbbe15b2c38732bb22698`
BLAKE2b-256	`996036218ebc0136245260afff9c03c0df320f8fa156cd83364be37a6bea2fad`

See more details on using hashes here.

hive-ai 0.1.0

Navigation

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Project description

🐝 Hive Mind

Why it's different

Install

From source

Quick start

Record what you ruled out (provenance)

Let confidence age, re-affirm what's still true

Hand off to the next agent / route work

Auto-capture decisions from git commits

How retrieval works

Benchmark

The write guard

Command-line interface

Storage

Project layout

Running the tests

Roadmap

Design principles

License

Project details

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes