
Production-grade memory infrastructure for multi-agent systems. Namespace isolation, RBAC, provenance, ranked retrieval.


Attestor

Cut your agent's token burn 21×. Two API calls.

Full-context replay re-reads the whole conversation every turn — input tokens that grow O(n²) and a bill that compounds with every session. Attestor retrieves only what's needed: flat ~200 tokens per call, 21× fewer input tokens by turn 100, 100% recall — measured across six models, open and closed.

await attestor.add(namespace, content)          # when new information arrives
facts = await attestor.recall(namespace, query) # ~200 flat tokens, always

Self-hosted, deterministic retrieval, zero LLM in the critical path. The memory layer for agent teams that need shared, tenant-isolated memory with bi-temporal replay and an auditable supersession chain.


pip install attestor

Using Claude Code? Run pipx install attestor, then attestor quickstart. One command, zero questions: it brings up the local backends (Postgres + Pinecone Local + Neo4j), uses a local Ollama embedder (no cloud key), and wires the MCP server + hooks. Reverse it with attestor teardown. Or drive it from inside Claude Code via the plugin (/plugin install attestor, then /attestor:install-attestor). See Install for Claude Code.

pipx install attestor && attestor quickstart
Version: 4.1.6 (stable; greenfield rebuild, no v3 migration path)
PyPI: attestor
Import: attestor
Live site: https://attestor.dev/
Repo: https://github.com/bolnet/attestor
License: MIT

Designed and built by Surendra Singh — building auditable infrastructure for multi-agent AI, with fifteen years of production-systems discipline brought to the memory layer. Companion projects: claude-finance (Claude-powered financial analytics) · private-equity (PE × AI workshop). Reach out if you're hiring senior IC for AI infrastructure.


What it is

Attestor is a memory store for agent teams that need a shared, tenant-isolated memory with bi-temporal replay, deterministic retrieval, and an auditable supersession chain. It runs as a Python library, a Starlette REST service, or an MCP server — same API in all three.

The token math: Full-context replay is O(n²) — every turn re-reads the whole history. Attestor replaces that with O(n) targeted retrieval. Per-call context stays flat at ~200 tokens whether the agent is on turn 1 or turn 100. One Claude Opus 4 session at 100 turns: $24.15 → $1.24. Verify it yourself with context-clock.

| Turn | Full-context replay | Attestor | Reduction |
| ---- | ------------------- | -------- | ---------:|
| t24 | growing | ~200 tok | 5.6× |
| t50 | growing | ~200 tok | 11× |
| t100 | 8,709 tok/call | ~200 tok | 21.5× |
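
One reading consistent with the table is that the reduction compares cumulative input tokens across the whole session, not the size of a single call. A minimal sketch of that arithmetic follows. It assumes each turn adds a fixed number of tokens to the history; the 87-token figure is an assumption chosen so that turn 100 replays roughly the 8,709 tokens shown above, and the function names are illustrative, not part of Attestor.

```python
def full_replay_input_tokens(turns: int, tokens_per_turn: int) -> int:
    """Cumulative input tokens when every call re-reads the whole history."""
    # Call t re-reads t turns of history, so the total is an arithmetic series: O(n^2).
    return sum(tokens_per_turn * t for t in range(1, turns + 1))


def flat_retrieval_input_tokens(turns: int, tokens_per_call: int = 200) -> int:
    """Cumulative input tokens when every call retrieves a fixed-size context: O(n)."""
    return tokens_per_call * turns


def reduction(turns: int, tokens_per_turn: int = 87, tokens_per_call: int = 200) -> float:
    """Ratio of full-replay to flat-retrieval cumulative input tokens."""
    replay = full_replay_input_tokens(turns, tokens_per_turn)
    flat = flat_retrieval_input_tokens(turns, tokens_per_call)
    return replay / flat


for t in (24, 50, 100):
    print(f"t{t}: {reduction(t):.1f}x fewer cumulative input tokens")
```

Under these assumptions the ratios land close to, but not exactly on, the published column; the measured numbers depend on real turn lengths.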

It is built around three claims, each grounded in code:

  1. Bi-temporal — replay any past state. Every memory has both event time (valid_from / valid_until) and transaction time (t_created / t_expired). Nothing is deleted; everything is queryable forever (attestor/temporal/manager.py:43-73, core.py:888-890).
  2. Semantic-first retrieval, no LLM in the hot path. A six-step deterministic pipeline. Same query → same ranking. Unit-testable (attestor/retrieval/orchestrator.py:1-14).
  3. Conversation ingest with auditable conflict resolution. Two-pass speaker-locked extraction, then a four-decision (ADD / UPDATE / INVALIDATE / NOOP) resolver per fact. Every supersession carries an evidence_episode_id (attestor/extraction/conflict_resolver.py:98).

Designed for

  • Multi-agent products where many LLMs write to the same memory store
  • Regulated chat systems that need point-in-time reconstruction (compliance, audit, FOIA-style queries)
  • Self-hosted deployments — your VPC, your Postgres, your Neo4j

Not designed for

  • A general-purpose vector database
  • A RAG framework with built-in chunking, reranking, and orchestration
  • An LLM agent runtime — Attestor is the memory backend; the agent loop is yours

Quick start

1. Install

pip install attestor                 # or: pipx install attestor

Or pull the container (introspection-grade image, single layer over python:3.12-slim, currently linux/amd64):

docker pull ghcr.io/bolnet/attestor:latest      # recommended — anonymous pull, mirrored to all registries below

Same image is mirrored to:

| Registry | Pull address |
| -------- | ------------ |
| GHCR | ghcr.io/bolnet/attestor:latest |
| Docker Hub | bolnet2025/attestor:latest |
| Quay | quay.io/bolnet/attestor:latest |
| AWS ECR Public | public.ecr.aws/m6h5j7o3/attestor:latest |
| GCP AR | us-central1-docker.pkg.dev/coral-marker-452616-n4/attestor/attestor:latest |

(An internal Azure ACR mirror exists at memwright.azurecr.io/attestor but is private — Azure customers should use az acr import from one of the public registries above.)

The image's default entrypoint is attestor mcp (MCP server over stdio). For full production use, point the container at an external Postgres + Neo4j via env vars (or compose them with attestor/infra/local/docker-compose.yml); override the entrypoint to run attestor doctor, attestor api, etc.

2. Stand up the local stack — one command, zero questions

attestor quickstart

attestor quickstart does the whole local install non-interactively and prints every step: it writes ~/.attestor/{config.toml,attestor.yaml,.env}, brings up the three-role local stack in Docker, uses a local Ollama bge-m3 embedder (no cloud key), wires the Claude Code MCP server (./.mcp.json) + lifecycle hooks, and runs attestor doctor.

Prerequisites: Docker running, and Ollama serving bge-m3 (ollama pull bge-m3). quickstart runs a preflight that scans the ports/tools and tells you if anything is missing — it never prompts.

| Container | Role | Port | Purpose |
| --------- | ---- | ---- | ------- |
| Postgres 16 | Document | 5432 | Source of truth: content, tags, entity, ts, provenance, RLS-isolated by user_id |
| Pinecone Local | Vector | 5080-5089 | Dense embeddings, per-namespace isolation, plain gRPC (no HTTPS) |
| Neo4j 5 + GDS | Graph | 7687 | Entity nodes + typed edges, PageRank / BFS / Leiden |

To reverse it later: attestor teardown (zero-question; keeps your data volumes by default — --purge also wipes them, --dry-run previews).

In Claude Code, drive the same install conversationally: run /plugin marketplace add bolnet/attestor, then /plugin install attestor (then enable it), and run /attestor:install-attestor, which runs attestor quickstart for you. Cloud/managed backends (Neon / RDS / Cloud SQL, Pinecone Cloud, Neo4j AuraDB) and alternative embedders (Pinecone Inference llama-text-embed-v2, Voyage voyage-4, OpenAI text-embedding-3) are configured in ~/.attestor/attestor.yaml (the single source of truth); see docs/INSTALL.md.

attestor doctor (run automatically at the end, or any time) checks all four subsystems: Document Store (Postgres), Vector Store (Pinecone), Graph Store (Neo4j), Retrieval Pipeline. The only hard dependency that cannot be down is the document store (Postgres); transient vector-probe failures are surfaced in the response trace rather than swallowed (the vector_error field in retrieval/orchestrator.py).

3. Use it

from attestor import AgentMemory, AgentContext, AgentRole

mem = AgentMemory()                  # picks up env / ~/.attestor.toml automatically

ctx = AgentContext(
    agent_id="researcher-1",
    role=AgentRole.RESEARCHER,
    namespace="acme-prod",
)

mem.add(
    content="Alice is the engineering manager",
    entity="alice",
    category="role",
    context=ctx,
)

results = mem.recall(query="who runs engineering?", context=ctx)
for r in results:
    print(r.score, r.memory.content)

SOLO mode (zero-config). In v4, AgentMemory().add('foo') auto-provisions a singleton local user, an Inbox project (metadata.is_inbox=true), and a daily session — so the snippet above works on a fresh database without configuring identity (core.py:179-209). For multi-tenant production use, pass an explicit AgentContext with a real namespace.

4. Run a smoke benchmark (optional)

Verify your install end-to-end against a tiny LongMemEval slice. Defaults come from configs/attestor.yaml: Pinecone Inference llama-text-embed-v2 (1024-D) embedder + Pinecone vector store, openai/gpt-5.5 answerer, dual judges (openai/gpt-5.5 + anthropic/claude-sonnet-4-6), parallel=2.

set -a && source .env && set +a   # OPENROUTER_API_KEY, PINECONE_API_KEY, NEO4J_PASSWORD
.venv/bin/python scripts/lme_smoke_local.py --n 2 --yes

Every model and parameter comes from YAML — see § Benchmarking below for the full bench harness.


Benchmarking

Every benchmark — smoke, single slice, full sweep, synthetic supersession — reads its knobs from two YAMLs:

| File | What lives there |
| ---- | ---------------- |
| configs/attestor.yaml | Stack: embedder, models, retrieval features, DBs, registries, clouds |
| configs/bench.yaml | Bench-only: variants, category iteration order, target scores, output paths |

The two files must have disjoint keys. The CI test tests/test_config_no_duplicate_keys.py enforces this; the bench loader (attestor.bench_config.get_bench) crashes on overlap. If you want a one-off override (different model for one bench run), use an env var or CLI flag — never duplicate the key in bench.yaml.
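
The disjoint-keys rule can be sketched as a set intersection over flattened key paths. This is only an illustration: it assumes "duplicate" means the same dotted leaf path appearing in both files, and it works on already-parsed dictionaries. The helper names and the sample configs are hypothetical; how tests/test_config_no_duplicate_keys.py actually defines overlap is not shown here.

```python
from typing import Any, Iterator


def flatten_keys(tree: dict[str, Any], prefix: str = "") -> Iterator[str]:
    """Yield a dotted path for every leaf key in a nested mapping."""
    for key, value in tree.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict) and value:
            # Recurse into non-empty sub-mappings.
            yield from flatten_keys(value, path)
        else:
            yield path


def overlapping_keys(stack_cfg: dict[str, Any], bench_cfg: dict[str, Any]) -> set[str]:
    """Return dotted leaf paths that appear in both configs."""
    return set(flatten_keys(stack_cfg)) & set(flatten_keys(bench_cfg))


# Hypothetical parsed contents of the two YAML files.
stack = {"models": {"answerer": "openai/gpt-5.5"}, "retrieval": {"hyde": {"enabled": False}}}
bench_ok = {"bench": {"lme": {"variant": "s"}}}
bench_bad = {"models": {"answerer": "other/model"}, "bench": {"lme": {"variant": "s"}}}

print(overlapping_keys(stack, bench_ok))
print(overlapping_keys(stack, bench_bad))
```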

What LongMemEval is

LongMemEval (Wu et al., 2024 — published at ICLR '25) is the canonical benchmark for memory-augmented chat assistants. It measures whether an AI system can correctly answer questions that require recalling facts from long, multi-session conversation histories — the exact scenario Attestor is built for.

500 questions, 6 reasoning categories, 3 haystack sizes. Same questions across all three sizes; only the noise around the answer-bearing session changes:

| Variant | Tokens / Q | Sessions | What it measures |
| ------- | ---------- | -------- | ---------------- |
| oracle | ~3-15k | 1-3 gold | Reasoning ceiling: what the answerer can do with perfect retrieval. If you score low here, your prompt or LLM is broken (retrieval can't help). |
| s (Standard / Small) | ~115k | ~50 | Public leaderboard, the canonical comparison. Fits in a single Claude/GPT context window, so Attestor's retrieval is benchmarked against the "just stuff everything into long context" baseline. |
| m (Plus / Medium) | ~1M+ | ~500 | Pure retrieval: too big for any context window. Memory layer is forced; no long-context shortcut available. |

LME-S is the headline number to beat. A memory layer that scores within 5% of a long-context baseline at 30× lower token cost is the marketing pitch.

The 6 reasoning categories (cleaned LME-S, 500 questions total — note: no abstention slice in the cleaned split, which the synthetic supersession suite covers):

| Category | N | What it tests |
| -------- | -:| ------------- |
| multi-session | 133 | Fact spans across multiple sessions; must track an entity over time |
| temporal-reasoning | 133 | Date arithmetic ("two weeks ago", "before X"); Attestor's bi-temporal layer is built for this slice |
| knowledge-update | 78 | Supersession: newer fact must beat older fact when both exist |
| single-session-user | 70 | One session, fact stated by the user |
| single-session-assistant | 56 | One session, fact stated by the assistant |
| single-session-preference | 30 | One session, user preference |

Why this benchmark for Attestor: the temporal-reasoning and knowledge-update slices directly exercise features that distinguish Attestor from a vanilla RAG: bi-temporal recall, supersession-on-contradiction, event-time vs transaction-time disambiguation. A high score on those slices is the regulated-AI / audit / compliance pitch.

For the published Attestor numbers, see docs/bench/ — bench artifacts persist as lme-{variant}-{category}-{date}.{report,summary}.json. The Reporting section below shows how to render them as a table.

Download the LongMemEval dataset (one-time, before any bench run)

All lme_*.sh scripts use the cleaned LongMemEval split published on HuggingFace by xiaowu0162/longmemeval-cleaned. It auto-downloads on first use, but you'll want to know what's happening.

Cache location (created on first call):

~/.cache/attestor/longmemeval/

(Or $XDG_CACHE_HOME/attestor/longmemeval/ if you set XDG_CACHE_HOME.)

Variants and on-disk sizes:

| Variant | Filename | Size | Tokens / Q | Use |
| ------- | -------- | ---- | ---------- | --- |
| oracle | longmemeval_oracle.json | ~5 MB | ~3-15k | Reasoning ceiling, cheapest smoke |
| s | longmemeval_s_cleaned.json | ~250 MB | ~115k | Public leaderboard (canonical) |
| m | longmemeval_m_cleaned.json | ~2 GB | ~1M+ | Forces retrieval (no long-context shortcut) |

Option A — auto-download (recommended)

Just run any bench command. The first call downloads and caches; every subsequent call reads from disk:

# Will download longmemeval_oracle.json (~5 MB) the first time
.venv/bin/python scripts/lme_smoke_local.py --n 2 --yes --variant oracle

# Will download longmemeval_s_cleaned.json (~250 MB) the first time
scripts/bench/lme_run.sh knowledge-update

You only pay the download cost once per variant. Internet flake during the first run? Delete the partial file in the cache dir and rerun.

Option B — pre-warm the cache (offline / CI)

Pre-fetch every variant you plan to use before the bench day:

.venv/bin/python -c "
from attestor.longmemeval import load_or_download
for v in ('oracle', 's', 'm'):
    samples = load_or_download(variant=v)
    print(f'{v}: {len(samples)} samples')
"

Expected output:

oracle: 500 samples
s: 500 samples
m: 500 samples

Option C — manual download (firewalled environments)

If your runner can't reach huggingface.co, fetch the files on a connected machine and drop them into the cache dir manually:

mkdir -p ~/.cache/attestor/longmemeval
cd ~/.cache/attestor/longmemeval

# pick the variants you need
curl -L -o longmemeval_oracle.json \
    https://huggingface.co/datasets/xiaowu0162/longmemeval-cleaned/resolve/main/longmemeval_oracle.json

curl -L -o longmemeval_s_cleaned.json \
    https://huggingface.co/datasets/xiaowu0162/longmemeval-cleaned/resolve/main/longmemeval_s_cleaned.json

curl -L -o longmemeval_m_cleaned.json \
    https://huggingface.co/datasets/xiaowu0162/longmemeval-cleaned/resolve/main/longmemeval_m_cleaned.json

The bench harness checks for these filenames exactly — don't rename them.

Verify the dataset is loadable

After download (auto or manual), confirm the loader picks it up cleanly:

.venv/bin/python -c "
from attestor.longmemeval import load_or_download
from collections import Counter
samples = load_or_download(variant='s')
cnt = Counter(s.question_type for s in samples)
print(f'Loaded {len(samples)} samples')
for cat, n in sorted(cnt.items(), key=lambda x: -x[1]):
    print(f'  {cat}: {n}')
"

Expected for the cleaned s variant (500 questions, 6 categories — note: no abstention slice in the cleaned split):

Loaded 500 samples
  multi-session: 133
  temporal-reasoning: 133
  knowledge-update: 78
  single-session-user: 70
  single-session-assistant: 56
  single-session-preference: 30

If counts don't match, the file is truncated — re-download.

Quick smoke (≤ 1 minute, ≤ $0.10)

Confirm the pipeline runs end-to-end before committing or running anything bigger:

.venv/bin/python scripts/lme_smoke_local.py --n 2 --yes --variant oracle

oracle is the cheapest variant (gold sessions only, no distractor haystack). Schema is reapplied automatically; pass --skip-schema if you want to keep a populated DB between runs.

Single category — scripts/bench/lme_run.sh

# all 6 categories, current variant from bench.yaml (default: s)
scripts/bench/lme_run.sh

# one slice — full
scripts/bench/lme_run.sh knowledge-update

# one slice — capped at N samples (smoke)
scripts/bench/lme_run.sh knowledge-update 10

# one slice on a different variant (oracle = cheapest, m = ~1M tokens)
scripts/bench/lme_run.sh knowledge-update "" oracle

Valid --category values: single-session-user, single-session-assistant, single-session-preference, multi-session, temporal-reasoning, knowledge-update. See What LongMemEval is above for sample counts and what each category tests.

Each run persists two files:

docs/bench/lme-{variant}-{category}-{YYYYMMDD}.report.json   # full LMERunReport
docs/bench/lme-{variant}-{category}-{YYYYMMDD}.summary.json  # BenchmarkSummary

Full sweep — scripts/bench/lme_all.sh

Iterates bench.yaml's lme.categories list in order. Adding/removing slices is a YAML edit, not a script edit:

# All 6 slices, current variant
scripts/bench/lme_all.sh

# All 6 slices, capped at 10 samples each (smoke)
scripts/bench/lme_all.sh 10

# All 6 slices on Oracle variant
scripts/bench/lme_all.sh "" oracle

If one slice fails, the script logs it and moves on to the next.

Reporting — scripts/bench/lme_report.py

Aggregates every docs/bench/lme-*.summary.json into one markdown table; picks the most-recent file per (variant, category):

.venv/bin/python scripts/bench/lme_report.py                       # latest-per-slice
.venv/bin/python scripts/bench/lme_report.py --variant s           # filter to LME-S
.venv/bin/python scripts/bench/lme_report.py \
    --markdown-out docs/bench/LME-S.md                             # also write file
.venv/bin/python scripts/bench/lme_report.py --trend               # progression over time

Default mode (latest-per-slice):

| Variant | Category | Score | N | Date | Answer | Judges |
| ------- | -------- | -----:| -:| ---- | ------ | ------ |
| s | knowledge-update | 87.5% | 78 | 20260429 | openai/gpt-5.4-mini | openai/gpt-5.5, anthropic/claude-sonnet-4-6 |

Trend mode (--trend) reads docs/bench/trend.csv — one row appended per bench run (auto-populated by lme_run.sh) — and shows progression with a Δ column:

| Variant | Category | Date | N | Score | Δ | SHA | Features | Run |
| ------- | -------- | ---- | -:| -----:| -:| --- | -------- | --- |
| s | knowledge-update | 20260429 | 78 | 80.0% |       | a126e7a |               | bench |
| s | knowledge-update | 20260430 | 78 | 88.0% | +8.0  | badcf1b | multi_query   | bench |
| s | knowledge-update | 20260501 | 78 | 91.5% | +3.5  | xxxxxxx | multi_query,hyde | bench |

The Features column records exactly which retrieval/answerer flags were enabled per run, so you can see at a glance which knob produced which lift.

Retrieval + answerer feature flags

Five orthogonal features land via configs/attestor.yaml boolean flips. All disabled by default — pick one per bench run, measure the lift, decide which to ship enabled.

| Flag | What it does | Lift | Cost overhead |
| ---- | ------------ | ---- | ------------- |
| retrieval.multi_query | rewrite question into N paraphrases, RRF-merge N+1 vector lanes | +6-10% (lit.); regressed −10pp on LME-S temporal smoke | 1 small LLM call + N extra vector searches per recall |
| retrieval.hyde | event-descriptive hypothetical-document embedding (temperature=0), embedded as a parallel vector lane | +10pp measured on LME-S temporal-reasoning (30q smoke, 70%→80%→96.7% with BM25 hybrid) | 1 small LLM call + 1 extra vector search per recall |
| retrieval.temporal_prefilter | regex-detect "two weeks ago" etc; narrow event-time window before vector | +1.5% (lit.); 0pp on LME-S interrogative-anchor questions | Free (regex-only, no LLM) |
| self_consistency | answerer draws K=5 samples at temperature, elects consensus | +3-6% (lit.) | 5× answerer cost |
| critique_revise | answer → critique → conditional revise | +3-5% (lit.) | ~3× answerer worst case |

multi_query and hyde are mutually exclusive in this release (multi_query wins if both flags are on with a logged warning). self_consistency and critique_revise are similarly mutually exclusive on the answerer side. Combinations across the two sides (e.g. hyde + self_consistency) are fine.
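
To make the self_consistency idea concrete, here is a minimal sketch of electing a consensus from K sampled answers. It assumes answers are compared after simple whitespace/case normalization and that ties break alphabetically so the result does not depend on sample order; both choices, and the function name, are assumptions for illustration, not Attestor's answerer logic.

```python
from collections import Counter


def elect_consensus(samples: list[str]) -> str:
    """Return the most frequent normalized answer among the samples."""
    if not samples:
        raise ValueError("need at least one sample")
    counts = Counter(s.strip().lower() for s in samples)
    # Most votes first; ties resolved alphabetically so sample order never matters.
    winner, _ = min(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return winner


# K=5 samples for one question (made-up data).
print(elect_consensus(["Paris", "paris", "Lyon", "Paris ", "Lyon"]))
```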

HyDE v2 prompt (attestor/retrieval/hyde.py) — generates an event-descriptive snippet rather than an answer-shape response, so the embedding lands close to source-shape conversation turns instead of question-shape queries. This is the lever that produced the +10pp measured lift on LME-S temporal-reasoning. temperature=0 is pinned so re-runs are deterministic.

Honest negative results documented above: multi_query and temporal_prefilter did NOT generalize from their literature numbers on the LME-S temporal-reasoning slice. multi_query paraphrases stay in question-shape and RRF dilutes marginal hits; temporal_prefilter heuristic anchors don't help interrogative-form questions ("how many days ago…"). HyDE was the right tool. Per-feature methodology + diagnostic artifacts are in docs/bench/pinecone-lme-temporal-diagnostic-{baseline,mq3,hyde,hyde-bm25}-20260429.json.

Cross-vector-DB diagnostic harness: experiments/pinecone_lme_temporal_diagnostic.py runs retrieval-only LME-S diagnostics against Pinecone Local with --baseline / --multi-query / --hyde / --bm25-hybrid / --temporal-prefilter / --category flags. No answerer, no judge: pure recall@K ceiling. --skip-ingest reuses populated namespaces for fast retrieval-flag iteration (~60s for 30q vs ~50min with fresh ingest).

To benchmark a single feature: flip its enabled: true in configs/attestor.yaml, run the bench slice, compare against a same-day baseline run with everything off. The trend table will show the delta in the Δ column.

Synthetic supersession suite — python -m evals.knowledge_updates

50 hand-curated cases, 10 contradiction categories × 5 each (numeric, categorical, temporal, preference, entity, locational, intent, relational, count, status_binary). Each case ingests two sessions (Session 1 states a fact, Session 5 contradicts it) and asks a question that should resolve to the newer fact. Metric: % of cases where retrieval surfaces the new fact as top-1.

# All 50 cases — ~5 min, ~$0.50 worth of embedding calls
.venv/bin/python -m evals.knowledge_updates

# Smoke — first 5 cases
.venv/bin/python -m evals.knowledge_updates --limit 5

# Custom fixtures
.venv/bin/python -m evals.knowledge_updates --fixtures my_cases.json

Outputs:

docs/bench/knowledge-updates-{YYYYMMDD}.report.json   # per-case verdicts (new_wins | stale_wins | miss | ambiguous)
docs/bench/knowledge-updates-{YYYYMMDD}.summary.json  # aggregate score + per-category breakdown

Target score (configurable in bench.yaml): 92% new_wins. Below that, the supersession-confidence-decay weight in attestor/retrieval/scorer.py needs tuning.
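
The metric itself is simple to state in code. The sketch below computes the new_wins rate from a list of per-case verdicts and compares it to the target. The verdict labels and the 0.92 target come from the text above; the function names and the flat list-of-strings input are assumptions, since the actual report JSON layout is not shown here.

```python
from collections import Counter

VERDICTS = ("new_wins", "stale_wins", "miss", "ambiguous")


def new_wins_rate(verdicts: list[str]) -> float:
    """Fraction of cases where the newer fact surfaced as top-1."""
    counts = Counter(verdicts)
    unknown = set(counts) - set(VERDICTS)
    if unknown:
        raise ValueError(f"unknown verdicts: {sorted(unknown)}")
    return counts["new_wins"] / len(verdicts) if verdicts else 0.0


def meets_target(verdicts: list[str], target: float = 0.92) -> bool:
    """True when the new_wins rate reaches the configured target."""
    return new_wins_rate(verdicts) >= target


# 50 made-up verdicts: 46 new_wins, 3 stale_wins, 1 miss.
sample = ["new_wins"] * 46 + ["stale_wins"] * 3 + ["miss"]
print(new_wins_rate(sample), meets_target(sample))
```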

Cost & runtime guide

Approximate, at reasoning_effort=high for answerer + judge, parallel=2, OpenRouter pricing:

| Run | N | Wall time | Cost |
| --- | - | --------- | ---- |
| Quick smoke | 2 oracle | ~1 min | < $0.10 |
| knowledge-update slice | 78 | ~30-60 min | ~$3-5 |
| temporal-reasoning slice | 133 | ~50-100 min | ~$5-8 |
| Full LME-S 500q | 500 | ~75-180 min | ~$20-30 |
| Synthetic supersession | 50 | ~5 min | ~$0.50 (embeddings only) |

To cut costs, edit configs/attestor.yaml's models.reasoning_effort.{answerer,judge} from high to medium or low.

Configuration cheat sheet — configs/bench.yaml

bench:
  lme:
    variant: s                    # s | m | oracle
    cache_dir: ~/.cache/attestor/lme
    output_dir: docs/bench
    sample_limit: null            # null = full dataset; int = truncate
    category: null                # null = all 6; or single slice name
    categories: [...]             # iteration order for lme_all.sh
    variants_to_run: [...]        # for full size matrix

  knowledge_updates:
    fixtures_path: evals/knowledge_updates/fixtures.json
    n_cases: 50
    target_score: 0.92
    categories: [numeric, categorical, ...]

  report:
    headline_slice: abstention
    trend_csv: docs/bench/trend.csv
    markdown_path: docs/bench/LME-S.md

Architecture

Bi-temporal — replay any past state

Every memory carries two time axes:

| Axis | Columns | Meaning |
| ---- | ------- | ------- |
| Event time | valid_from, valid_until | When the fact is true in the world |
| Transaction time | t_created, t_expired | When the row landed in the store |

Plus a superseded_by chain. Old facts are never deleted — they remain queryable forever (attestor/temporal/manager.py:30-66).

# What did we believe on March 1?
mem.recall(query="who runs engineering?", as_of="2026-03-01T00:00:00Z", context=ctx)

# Show me everything we knew about Alice between Feb and Apr
mem.recall(query="alice", time_window=("2026-02-01", "2026-04-01"), context=ctx)

as_of and time_window propagate end-to-end through the orchestrator and document store. Auto-supersession on write is wired into core.py:add() (core.py:762, 784-785): on every add, the temporal manager finds active rows with the same (entity, category, namespace) and different content, marks them superseded, sets valid_until=now, and links superseded_by=<new_id>. Detection is rule-based string equality today.
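
The rule-based supersession described above can be sketched in a few lines over an in-memory list. This is an illustration under stated assumptions, not the temporal manager's code: the Row dataclass, the function names, and the use of the new row's valid_from as "now" are hypothetical, and only the event-time axis is modeled (transaction time is omitted for brevity).

```python
from dataclasses import dataclass, replace
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class Row:
    id: int
    entity: str
    category: str
    namespace: str
    content: str
    valid_from: datetime
    valid_until: Optional[datetime] = None
    superseded_by: Optional[int] = None


def add_with_supersession(rows: list[Row], new: Row) -> list[Row]:
    """Append `new`, closing active rows with the same key but different content."""
    key = (new.entity, new.category, new.namespace)
    out = []
    for row in rows:
        active = row.valid_until is None and row.superseded_by is None
        same_key = (row.entity, row.category, row.namespace) == key
        if active and same_key and row.content != new.content:
            # Nothing is deleted: the old row is closed and linked to its successor.
            row = replace(row, valid_until=new.valid_from, superseded_by=new.id)
        out.append(row)
    out.append(new)
    return out


def valid_as_of(rows: list[Row], when: datetime) -> list[Row]:
    """Rows whose event-time window contains `when`."""
    return [r for r in rows
            if r.valid_from <= when and (r.valid_until is None or when < r.valid_until)]


utc = timezone.utc
rows = add_with_supersession([], Row(1, "engineering", "manager", "acme-prod",
                                     "Alice runs engineering", datetime(2026, 1, 1, tzinfo=utc)))
rows = add_with_supersession(rows, Row(2, "engineering", "manager", "acme-prod",
                                       "Bob runs engineering", datetime(2026, 3, 15, tzinfo=utc)))
print([r.content for r in valid_as_of(rows, datetime(2026, 3, 1, tzinfo=utc))])
```

Because the old row is kept and only its window is closed, a query dated before the change still sees the earlier belief.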

Tenant isolation — Postgres Row-Level Security

Every tenant table (users, projects, sessions, episodes, memories, user_quotas, deletion_audit) carries a tenant_isolation_* policy keyed off the attestor.current_user_id session variable. An empty / unset value fails closed — no rows visible (attestor/store/schema.sql:311-327).

Honest disclosure. Enforcement lives in Postgres, not Python. The AgentRole enum in attestor/context.py:49-56 is metadata that flows onto memories for provenance; it does not gate operations in Python. RLS is what actually controls access. This is correct architecture for a memory backend, but worth knowing if you read the Python alone.

The retrieval pipeline — semantic-first, six steps

attestor/retrieval/orchestrator.py runs the same six steps for every query:

  1. Vector top-K — Pinecone cosine, k=50 (pgvector remains as opt-in fallback for self-contained deploys)
  2. Graph narrow — Neo4j BFS depth ≤ 2 from each candidate's entity to the question entities; affinity bonus per hop (0-hop=+0.30, 1-hop=+0.20, 2-hop=+0.10; unreachable=−0.05). Discrete, not "soft".
  3. Triples inject — typed-edge facts (uses, authored-by, supersedes) injected as synthetic memories
  4. MMR rerank — λ=0.7
  5. Confidence decay + temporal boost — recency lifts; stale, low-confidence rows fall
  6. Budget fit — greedy monotonic-by-score pack into the caller's token budget
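
Two of the steps above are easy to sketch in isolation: the discrete hop bonus from step 2 and the MMR rerank from step 4. The bonus values and λ=0.7 are taken from the list; everything else (function names, the candidate tuple shape, treating any depth beyond 2 as unreachable, and how the bonus is later combined with the score) is an assumption for illustration, not the orchestrator's implementation.

```python
import math
from typing import Optional

HOP_BONUS = {0: 0.30, 1: 0.20, 2: 0.10}
UNREACHABLE_PENALTY = -0.05


def graph_affinity(hops: Optional[int]) -> float:
    """Discrete bonus by BFS hop count; None or more than 2 hops counts as unreachable."""
    if hops is None:
        return UNREACHABLE_PENALTY
    return HOP_BONUS.get(hops, UNREACHABLE_PENALTY)


def cosine(a: tuple, b: tuple) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mmr_rerank(candidates: list, lam: float = 0.7) -> list:
    """candidates: (id, relevance, vector) tuples. Returns ids in MMR order."""
    remaining = list(candidates)
    selected = []
    while remaining:
        def mmr_score(c):
            # Penalize similarity to anything already selected.
            redundancy = max((cosine(c[2], s[2]) for s in selected), default=0.0)
            return lam * c[1] - (1 - lam) * redundancy
        best = max(remaining, key=mmr_score)
        selected.append(best)
        remaining.remove(best)
    return [c[0] for c in selected]


# "b" is nearly as relevant as "a" but is a duplicate direction, so "c" is picked before it.
cands = [("a", 0.90, (1.0, 0.0)), ("b", 0.85, (1.0, 0.0)), ("c", 0.60, (0.0, 1.0))]
print(mmr_rerank(cands))
```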

Every call writes a JSONL trace to logs/attestor_trace.jsonl (disable via ATTESTOR_TRACE=0).

Async retrieval — lower latency without weakening audit

Independent recall steps run concurrently via asyncio.gather, but none of the eight audit invariants are relaxed. You don't trade trust for speed — you get both.

| Async step | Latency win | Audit invariant preserved |
| ---------- | ----------- | ------------------------- |
| HyDE LLM call ‖ original-question vector embed | −33 % on HyDE-enabled recalls (~600 ms → ~400 ms in the simulated unit-test) | A7: generator pins temperature=0.0, same prompt + same model = same hypothetical = same RRF order. Async amplifies non-determinism risk if T > 0; we explicitly pin T=0. |
| Per-lane vector searches in parallel (HyDE / multi-query) | proportional to N (≈ N × per-lane → max-per-lane) | RRF over the lanes is deterministic given identical inputs; gather order does not corrupt rank positions (test_multi_query_async_preserves_RRF_order). |
| Self-consistency K-fanout (answerer side) | 5× on K=5 sampling | Vote consensus is order-independent; answerer-side change, doesn't touch the document store. |
| Vector ‖ BM25 ‖ graph candidate-fetch | −20 % on baseline recalls | A2 recall_started_at ceiling: every cross-store read carries the same monotonic timestamp captured at recall start. Concurrent writes that land mid-recall are simply not visible. |
| Graph BFS ‖ Postgres doc-fetch | −50 ms typical | Same ceiling. |
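
A minimal sketch of the concurrent candidate fetch, assuming three independent lanes that each filter by a shared ceiling captured once at recall start. The in-memory STORE, the lane function, and the integer timestamps are hypothetical stand-ins for the real stores; the point is only that asyncio.gather returns results in the order the awaitables were passed, regardless of which lane finishes first.

```python
import asyncio

# Hypothetical per-lane rows: (memory_id, t_created).
STORE = {
    "vector": [("m1", 10), ("m2", 20), ("m9", 99)],
    "bm25": [("m2", 20), ("m3", 30)],
    "graph": [("m1", 10)],
}


async def lane(name: str, ceiling: int, delay: float) -> list:
    """Simulate one store read; rows created after the ceiling are not visible."""
    await asyncio.sleep(delay)
    return [mem_id for mem_id, t_created in STORE[name] if t_created <= ceiling]


async def fetch_candidates(ceiling: int) -> dict:
    names = ("vector", "bm25", "graph")
    delays = (0.03, 0.01, 0.02)  # lanes finish out of order on purpose
    results = await asyncio.gather(
        *(lane(n, ceiling, d) for n, d in zip(names, delays))
    )
    # gather preserves argument order, so zip pairs each lane with its own result.
    return dict(zip(names, results))


candidates = asyncio.run(fetch_candidates(ceiling=50))
print(candidates)
```

Here the row created at t=99 stays invisible to every lane because all of them share the same ceiling.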

Write-side stays sync. All add(), update(), supersession writes are explicitly non-goals for the async refactor — the audit chain depends on serial write ordering and the bi-temporal t_created order must be linearizable per row. Async is read-side only.

Trace stays reconstructable. Every event carries recall_id + monotonic seq + optional parent_event_id, so the audit dashboard renders concurrent recalls as a tree of events rather than a stream — (recall_id, seq) reconstructs causal order from the JSONL log.
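
Reconstructing order from the trace could look like the sketch below, which groups JSONL lines by recall_id and sorts each group by seq. The recall_id and seq field names come from the text above; the "event" field, the sample lines, and the function name are made up for illustration, and the real trace schema may carry more fields.

```python
import json
from collections import defaultdict


def group_trace(lines: list) -> dict:
    """Group JSONL trace events by recall_id, ordered by seq within each recall."""
    by_recall = defaultdict(list)
    for line in lines:
        if not line.strip():
            continue  # tolerate blank lines in the log
        event = json.loads(line)
        by_recall[event["recall_id"]].append(event)
    return {rid: sorted(events, key=lambda e: e["seq"]) for rid, events in by_recall.items()}


# Two concurrent recalls whose events were written interleaved.
SAMPLE = [
    '{"recall_id": "r1", "seq": 2, "event": "graph_bfs"}',
    '{"recall_id": "r2", "seq": 1, "event": "vector_topk"}',
    '{"recall_id": "r1", "seq": 1, "event": "vector_topk"}',
]

for recall_id, events in group_trace(SAMPLE).items():
    print(recall_id, [e["event"] for e in events])
```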

Same recall(as_of=X) replay guarantee. A past recall remains byte-for-byte reproducible from the bi-temporal columns + deletion_audit + the trace JSONL — async parallelism doesn't change what gets read, only when. The load-bearing test (tests/test_as_of_replay.py) is in the regression gate of every async PR.

Full design + audit-invariant matrix: docs/plans/async-retrieval/PLAN.md. Convention: every async PR ships with an audit-preservation argument and the matching invariant test (tests/async_retrieval/test_audit_invariants_under_async.py) GREEN before merge.

Three storage roles

| Role | Purpose | Default | Alternatives |
| ---- | ------- | ------- | ------------ |
| Document | Source of truth (content, tags, entity, ts, provenance, confidence) | Postgres 16 | AlloyDB, ArangoDB, DynamoDB, Cosmos DB |
| Vector | Dense embedding per memory | Pinecone (Local Docker / Cloud) | pgvector, AlloyDB ScaNN, ArangoDB, OpenSearch Serverless, Cosmos DiskANN |
| Graph | Entity nodes + typed edges | Neo4j 5 + GDS | Apache AGE on AlloyDB, ArangoDB, Neptune, NetworkX (Azure) |

Postgres is the source of truth. Pinecone vectors and Neo4j graph are derived state, both rebuildable from Postgres — but both are required for the canonical install: vector cosine is step 1 of the retrieval pipeline, graph expansion is step 2, and conversation ingest writes typed edges. The only role that cannot be down is the document store; the orchestrator records transient vector-probe failures in the response trace (vector_error) instead of swallowing them.

Optional BM25 / FTS lane

A trigger-maintained content_tsv tsvector + GIN index lifts queries that embeddings under-recall (acronyms, IDs, rare proper nouns). Enabled when v4 schema is detected; fuses with the vector lane via Reciprocal Rank Fusion (RRF, k=60). Graceful no-op on backends without the column (core.py:122-130).
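
Reciprocal Rank Fusion itself is a short formula: each item scores the sum of 1 / (k + rank) over the lanes it appears in. The sketch below applies it with k=60 to a vector lane and a BM25 lane. The function name, the sample ids, and the alphabetical tie-break are assumptions for illustration; Attestor's own fusion code is not shown here.

```python
from collections import defaultdict


def rrf_merge(lanes: list, k: int = 60) -> list:
    """Fuse ranked id lists with Reciprocal Rank Fusion; returns ids, best first."""
    scores = defaultdict(float)
    for ranking in lanes:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id so the order is reproducible.
    return sorted(scores, key=lambda d: (-scores[d], d))


vector_lane = ["m1", "m2", "m3"]
bm25_lane = ["m4", "m2", "m1"]  # m4 is, say, a rare acronym the embedding missed
print(rrf_merge([vector_lane, bm25_lane]))
```

Items found by both lanes rise to the top, while a BM25-only hit still enters the fused list instead of being dropped.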


Conversation ingest

The heavyweight write path that turns conversation turns into auditable memories. core.py:ingest_round(turn) orchestrates four passes:

turn  →  extract_user_facts(user_turn)        ┐
        extract_agent_facts(assistant_turn)   ┘  → resolve_conflicts → apply

Two-pass speaker-locked extraction

attestor/extraction/round_extractor.py:216, 258 — separate prompts for user vs assistant turns. The user-turn extractor only emits facts attributable to the user; the assistant-turn extractor only emits facts the assistant introduced. Stops cross-attribution. The "+53.6 over Mem0" delta in our LongMemEval scores comes from this split.

Four-decision conflict resolver

attestor/extraction/conflict_resolver.py:40, 98 — for each newly-extracted fact, an LLM call against existing similar memories returns one of:

| Decision | Effect |
| -------- | ------ |
| ADD | New info, no existing match: write fresh memory |
| UPDATE | Same entity + predicate, refined value: keep existing id |
| INVALIDATE | Old memory contradicted: mark superseded (timeline replays) |
| NOOP | Already represented: skip |

Each Decision carries evidence_episode_id. Every supersession is auditable. Failsafe: parse failure on a single fact yields ADD-by-default — better a duplicate-ish row than a silent drop.
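
The ADD-by-default failsafe can be sketched as a parser that never raises. The four action names and evidence_episode_id come from the text above; the JSON response shape, the target_memory_id field, and the class and function names are assumptions for illustration, not the resolver's actual contract.

```python
import json
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Action(str, Enum):
    ADD = "ADD"
    UPDATE = "UPDATE"
    INVALIDATE = "INVALIDATE"
    NOOP = "NOOP"


@dataclass(frozen=True)
class Decision:
    action: Action
    evidence_episode_id: str
    target_memory_id: Optional[str] = None


def parse_decision(raw: str, evidence_episode_id: str) -> Decision:
    """Parse one resolver response; any malformed response falls back to ADD."""
    try:
        payload = json.loads(raw)
        action = Action(payload["action"])
        target = payload.get("target_memory_id")
    except (ValueError, KeyError, TypeError, AttributeError):
        # Better a duplicate-ish row than a silently dropped fact.
        return Decision(Action.ADD, evidence_episode_id)
    return Decision(action, evidence_episode_id, target)


print(parse_decision('{"action": "INVALIDATE", "target_memory_id": "m7"}', "ep-1"))
print(parse_decision("not json at all", "ep-1"))
```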

Two write paths, two contracts. mem.add(...) runs the lightweight rule-based supersession (§Bi-temporal). mem.ingest_round(turn) runs the full four-decision pipeline. Pick ingest_round for conversational data; pick add for structured writes where you've already done the conflict reasoning.

Sleep-time consolidation

mem.consolidate() (core.py:526) re-extracts and synthesizes facts from recent episodes with a stronger model. Currently a Python-API-only call — no CLI command. Schedule it from your application (cron, systemd timer, ECS scheduled task) when you want fresher facts than the streaming extractor produces.

Reflection engine

attestor/consolidation/reflection.py runs periodic synthesis across N episodes for one user. Outputs:

  • stable_preferences — patterns appearing in 3+ episodes
  • stable_constraints — rules the user repeatedly invokes
  • changed_beliefs — preferences that shifted (old → new, with explicit invalidate)
  • contradictions_for_review — flagged for HUMAN REVIEW, not auto-resolved

The "do not auto-resolve" stance is the load-bearing piece for regulated chat systems. The prompt is explicit (reflection.py:35-66): "Do NOT auto-resolve contradictions. Flag them for human review."

Chain-of-Note reading

recall() returns a list. recall_as_pack() returns a typed retrieval envelope an agent can actually reason about — every field a Chain-of-Note flow needs to cite, abstain, or pick the right validity window when memories conflict:

pack = mem.recall_as_pack(query="who runs engineering?", context=ctx)

for entry in pack.memories:
    print(entry.id,                    # cite this in the answer
          entry.confidence,            # weight or abstain
          entry.valid_from,            # bi-temporal window for conflict resolution
          entry.valid_until,
          entry.source_episode_id)     # provenance back to the round it came from

agent.send(pack.render_prompt())       # Chain-of-Note prompt, memories interpolated as JSON

ContextPack is frozen=True, hashable, and JSON-serializable, so it drops cleanly into a tool call. The default prompt has explicit ABSTAIN and CONFLICT clauses; without them, models tend to confabulate an answer rather than abstain or surface the conflict.
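To make "frozen, hashable, JSON-serializable" concrete, here is a small stand-in built from standard dataclasses. It is not the ContextPack definition in models.py: the class names `Pack` and `PackEntry`, the `content` and `query` fields, and `to_json` are assumptions, while the remaining entry fields mirror the ones printed in the example above.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class PackEntry:
    id: str
    content: str                 # assumed field, not named in the text
    confidence: float
    valid_from: str
    valid_until: Optional[str]
    source_episode_id: str


@dataclass(frozen=True)
class Pack:
    query: str
    # A tuple (not a list) keeps the whole pack hashable.
    memories: Tuple[PackEntry, ...]

    def to_json(self) -> str:
        # asdict() recurses into nested dataclasses; tuples serialize as arrays.
        return json.dumps(asdict(self), sort_keys=True)
```

Because every field is immutable, two packs with the same contents hash equally, which makes them usable as cache keys or set members.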


Multi-agent primitives

Six roles

AgentRole: ORCHESTRATOR, PLANNER, EXECUTOR, RESEARCHER, REVIEWER, MONITOR (attestor/context.py:49-56). The role flows onto every memory's metadata for provenance. Access enforcement is two-layer:

  • AgentContext layer: the ROLE_PERMISSIONS matrix gates writes / forgets per role. Matrix: ORCHESTRATOR = R+W+F; PLANNER / EXECUTOR / RESEARCHER = R+W; REVIEWER / MONITOR = R only. read_only=True is an independent kill switch.
  • Postgres RLS layer — row-level filter on user_id (see §Tenant isolation).
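The permission matrix above translates directly into a lookup table. The sketch below is an illustration of that matrix, not the code in attestor/context.py: the enum values, the action strings, and the `is_allowed` helper are assumptions.

```python
from enum import Enum
from typing import Dict, FrozenSet


class AgentRole(Enum):
    ORCHESTRATOR = "orchestrator"
    PLANNER = "planner"
    EXECUTOR = "executor"
    RESEARCHER = "researcher"
    REVIEWER = "reviewer"
    MONITOR = "monitor"


READ, WRITE, FORGET = "read", "write", "forget"

ROLE_PERMISSIONS: Dict[AgentRole, FrozenSet[str]] = {
    AgentRole.ORCHESTRATOR: frozenset({READ, WRITE, FORGET}),
    AgentRole.PLANNER: frozenset({READ, WRITE}),
    AgentRole.EXECUTOR: frozenset({READ, WRITE}),
    AgentRole.RESEARCHER: frozenset({READ, WRITE}),
    AgentRole.REVIEWER: frozenset({READ}),
    AgentRole.MONITOR: frozenset({READ}),
}


def is_allowed(role: AgentRole, action: str, read_only: bool = False) -> bool:
    """Check the matrix; read_only overrides it for anything but reads."""
    if read_only and action != READ:
        return False
    return action in ROLE_PERMISSIONS[role]
```

Keeping `read_only` outside the matrix is what makes it an independent kill switch: it can disable writes for an orchestrator without touching the role table.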

AgentContext — handoff, scratchpad, trail

from attestor.context import AgentContext, AgentRole

orchestrator = AgentContext.from_env(agent_id="orchestrator", namespace="project:acme")
planner      = orchestrator.as_agent("planner",  role=AgentRole.PLANNER)
executor     = planner.as_agent("executor",      role=AgentRole.EXECUTOR)

# Each child carries parent_agent_id + accumulating agent_trail.
# All three share the same scratchpad: Dict[str, Any] for typed handoff data.

as_agent() creates a child context with parent_agent_id, full agent_trail, and a shared scratchpad. The trail accumulates — useful for proving "this answer came from agent X who got it from agent Y."
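The handoff mechanics can be modelled in a few lines. This is a simplified stand-in for AgentContext, written to show how a trail accumulates while a scratchpad is shared by reference; the class name `Ctx` and the choice to record ancestors only (not the current agent) in `agent_trail` are assumptions.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional, Tuple


@dataclass
class Ctx:
    agent_id: str
    namespace: str
    parent_agent_id: Optional[str] = None
    agent_trail: Tuple[str, ...] = ()          # ancestors, oldest first
    scratchpad: Dict[str, Any] = field(default_factory=dict)

    def as_agent(self, agent_id: str) -> "Ctx":
        """Create a child context that remembers where it came from."""
        return Ctx(
            agent_id=agent_id,
            namespace=self.namespace,
            parent_agent_id=self.agent_id,
            agent_trail=self.agent_trail + (self.agent_id,),
            scratchpad=self.scratchpad,        # same dict object, not a copy
        )


root = Ctx("orchestrator", "project:acme")
executor = root.as_agent("planner").as_agent("executor")
print(executor.agent_trail)
```

The trail is an immutable tuple so a child cannot rewrite its ancestry, while the scratchpad is deliberately the same mutable dict in every context.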

Per-agent token budgets

AgentContext.token_budget (default 20 000) is enforced — recall() packs results greedily until the budget is exhausted (scorer.py:fit_to_budget). token_budget_used accumulates across calls in a session.
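Greedy packing against a budget might look like the sketch below. It is not the implementation of fit_to_budget in scorer.py: the whitespace token counter is a crude stand-in for a real tokenizer, and stopping at the first result that does not fit (rather than skipping it and trying smaller ones) is an assumption.

```python
from typing import Callable, List, Sequence, Tuple


def rough_token_count(text: str) -> int:
    """Crude stand-in for a tokenizer: whitespace-separated words."""
    return len(text.split())


def fit_to_budget(
    ranked: Sequence[str],
    budget: int,
    count_tokens: Callable[[str], int] = rough_token_count,
) -> Tuple[List[str], int]:
    """Keep results in rank order until the next one would exceed the budget."""
    kept: List[str] = []
    used = 0
    for text in ranked:
        cost = count_tokens(text)
        if used + cost > budget:
            break  # assumed policy: stop at the first result that does not fit
        kept.append(text)
        used += cost
    return kept, used
```

Returning the tokens used alongside the kept results is what lets a caller accumulate a running total such as token_budget_used across calls.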

Optional write quotas

mem.set_quota(user_id, daily_writes=...) sets a per-user daily write limit, enforced on add against the v4 user_quotas table (core.py:592-621). Quotas are optional; unset means unlimited.


Security & Compliance

Row-Level Security

See §Tenant isolation for the policies themselves. RLS policies are the access-control surface; the Python layer trusts them. Set attestor.current_user_id on each connection so the policies can filter rows by tenant.
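One way to set a per-connection variable in Postgres is the built-in set_config function. The helper below is a sketch under assumptions: the name `bind_tenant` is hypothetical, the `%s` placeholder follows the psycopg parameter style, and it is assumed (not confirmed by the text) that the policies read the value back with current_setting.

```python
from typing import Any, Protocol


class Cursor(Protocol):
    """Minimal DB-API cursor surface used by this sketch."""
    def execute(self, query: str, params: Any = ...) -> Any: ...


def bind_tenant(cursor: Cursor, user_id: str) -> None:
    """Set the session variable that row-level policies are assumed to read."""
    # set_config(name, value, is_local): is_local=false keeps the value for
    # the whole session instead of only the current transaction.
    cursor.execute(
        "SELECT set_config('attestor.current_user_id', %s, false)",
        (user_id,),
    )
```

With a connection pool, passing true for is_local inside a transaction is often the safer choice, because the value then cannot leak to the next borrower of the connection.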

Provenance on every memory

Every memory carries agent_id, session_id, source_episode_id. The supersession chain (superseded_by) is preserved forever. Conversation episodes are stored verbatim, separate from the memories extracted from them — meaning you can always reconstruct which conversation turn produced which fact.
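Following a supersession chain is a linked-list walk over superseded_by. The sketch below assumes the links have already been loaded into a dict mapping each memory id to its successor (or None); that input shape and the function name are illustrative choices, not Attestor's API.

```python
from typing import Dict, List, Optional, Set


def supersession_chain(
    superseded_by: Dict[str, Optional[str]], start_id: str
) -> List[str]:
    """Return ids from start_id to the newest version, oldest first."""
    chain: List[str] = []
    seen: Set[str] = set()
    current: Optional[str] = start_id
    # The `seen` guard stops the walk if corrupted data ever forms a cycle.
    while current is not None and current not in seen:
        chain.append(current)
        seen.add(current)
        current = superseded_by.get(current)
    return chain
```

The last id in the returned list is the currently valid memory; everything before it is the audit history that led there.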

Deletion audit log

Hard deletes (e.g., GDPR purges) write a row to deletion_audit before the cascade — what was deleted, when, why, by whom. This is the carve-out for the otherwise-immutable schema.
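The "audit row first, then delete" ordering is easiest to see inside a single transaction. The sketch below uses SQLite from the standard library purely so it runs anywhere; Attestor itself uses Postgres, and the table columns and the function name `purge_with_audit` are assumptions for illustration.

```python
import sqlite3
from datetime import datetime, timezone


def purge_with_audit(conn: sqlite3.Connection, user_id: str,
                     reason: str, actor: str) -> int:
    """Record what is about to be deleted, then delete it, atomically."""
    with conn:  # one transaction: commit on success, roll back on error
        count = conn.execute(
            "SELECT COUNT(*) FROM memories WHERE user_id = ?", (user_id,)
        ).fetchone()[0]
        conn.execute(
            "INSERT INTO deletion_audit "
            "(user_id, deleted_at, reason, deleted_by, row_count) "
            "VALUES (?, ?, ?, ?, ?)",
            (user_id, datetime.now(timezone.utc).isoformat(), reason, actor, count),
        )
        conn.execute("DELETE FROM memories WHERE user_id = ?", (user_id,))
    return count


# Tiny in-memory schema to exercise the function.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE memories (id TEXT PRIMARY KEY, user_id TEXT, content TEXT);
CREATE TABLE deletion_audit (user_id TEXT, deleted_at TEXT, reason TEXT,
                             deleted_by TEXT, row_count INTEGER);
INSERT INTO memories VALUES ('m1', 'user-42', 'a'), ('m2', 'user-42', 'b'),
                            ('m3', 'user-7', 'c');
""")
```

Writing the audit row in the same transaction means a failed delete also rolls back the audit entry, so the log never claims a purge that did not happen.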

GDPR — export and purge

mem.export_user(external_id="user-42")     # full data export (memories + episodes + sessions + projects)
mem.purge_user(external_id="user-42",      # cascading hard delete with audit trail
               reason="GDPR right-to-erasure request 2026-04-27")
mem.deletion_audit_log(limit=100)          # forensic readback

Implemented in core.py:557-590 (v4 only). Together these calls cover what a data subject request needs under GDPR Art. 15 (access) and Art. 17 (erasure).

Optional: Ed25519 provenance signing

Enable via config (signing.enabled = true). On every add, attestor signs the canonical payload id || agent_id || t_created || content_hash with an Ed25519 key. mem.verify_memory(memory_id) returns bool (core.py:623-640). Optional, off by default — turn on for adversarial-write contexts where you need cryptographic non-repudiation.
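The signed payload is described as a concatenation of four fields. The sketch below shows one way to build such bytes deterministically; the text gives only the field order, so SHA-256 for content_hash, the length-prefix encoding, and the function name are all assumptions rather than Attestor's wire format.

```python
import hashlib


def canonical_payload(memory_id: str, agent_id: str,
                      t_created: str, content: str) -> bytes:
    """Build deterministic bytes for id || agent_id || t_created || content_hash."""
    content_hash = hashlib.sha256(content.encode("utf-8")).hexdigest()
    parts = []
    for value in (memory_id, agent_id, t_created, content_hash):
        encoded = value.encode("utf-8")
        # Length-prefix each field so ("ab", "c") and ("a", "bc") differ.
        parts.append(len(encoded).to_bytes(4, "big") + encoded)
    return b"".join(parts)
```

Bytes built this way could then be handed to an Ed25519 signer, for example `Ed25519PrivateKey.sign` from the `cryptography` package. Plain concatenation without a delimiter or length prefix would let two different field combinations produce the same signed bytes.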


Runtime topologies

Same API across all three. Only configuration changes.

Mode | Shape | When to use
A: Embedded library | AgentMemory(config) in-process; talks directly to Postgres + Neo4j | Single-process agents, scripts, notebooks
B: Sidecar | attestor api on localhost:8080; language-agnostic HTTP client shares the same Postgres + Neo4j | Polyglot agents on one box (Python + TS + Go)
C: Shared service | One Attestor service in front of an agent mesh (App Runner / Cloud Run / Container Apps) backed by managed Postgres + Neo4j | Production multi-agent platforms

attestor api    --port 8080         # Mode B / C — Starlette ASGI REST (HTTP)
attestor mcp    --path ~/.attestor  # MCP stdio server (zero-config; for Claude Desktop / Cursor / Windsurf)
attestor serve  ~/.attestor         # MCP stdio server (positional-path variant; equivalent transport)

Backends

Backend | Document | Vector | Graph | Status
Postgres + Neo4j (default) | Postgres | pgvector | Neo4j + GDS | Production-ready
ArangoDB | ArangoDB | ArangoDB | ArangoDB | Production-ready (one engine, all 3 roles)
AWS | DynamoDB | OpenSearch Serverless | Neptune | Backend code + Terraform shipped
Azure | Cosmos DB | Cosmos DiskANN | NetworkX (in-process) | Backend code shipped, Terraform forthcoming
GCP | AlloyDB | AlloyDB ScaNN | AGE on AlloyDB | Backend code shipped, Terraform forthcoming

Override the default via config:

# ~/.attestor.toml
backend = "postgres+neo4j"   # or "arangodb" | "aws" | "azure" | "gcp"

Reference Terraform lives under attestor/infra/.


Embeddings

Provider auto-detect (attestor/store/embeddings.py:get_embedding_provider), in this order:

  1. Local Ollama bge-m3 — 1024-D, 8K context — used when http://localhost:11434 is reachable
  2. Cloud-native — Bedrock Titan / Vertex / Azure OpenAI when their SDK + creds are present
  3. OpenAI text-embedding-3-large (3072-D native; pin OPENAI_EMBEDDING_DIMENSIONS=1024 for schema compat)
  4. OpenRouter — for federated runs

Local-first by design. Override:

export ATTESTOR_DISABLE_LOCAL_EMBED=1            # skip the Ollama probe entirely
export ATTESTOR_EMBEDDING_PROVIDER=openai
export ATTESTOR_EMBEDDING_MODEL=text-embedding-3-large
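The detection order and the two overrides can be summarized as a small decision function. This is a sketch, not get_embedding_provider: the returned labels, the injected probe callables, the precedence of the explicit override, and the use of OPENAI_API_KEY / OPENROUTER_API_KEY as availability signals are assumptions.

```python
from typing import Callable, Mapping


def choose_embedding_provider(
    env: Mapping[str, str],
    ollama_reachable: Callable[[], bool],
    cloud_native_available: Callable[[], bool],
) -> str:
    """Walk the documented order: local, cloud-native, OpenAI, OpenRouter."""
    explicit = env.get("ATTESTOR_EMBEDDING_PROVIDER")
    if explicit:
        return explicit  # assumed: an explicit choice beats auto-detect
    local_disabled = env.get("ATTESTOR_DISABLE_LOCAL_EMBED") == "1"
    if not local_disabled and ollama_reachable():
        return "ollama"
    if cloud_native_available():
        return "cloud-native"
    if env.get("OPENAI_API_KEY"):
        return "openai"
    if env.get("OPENROUTER_API_KEY"):
        return "openrouter"
    raise RuntimeError("no embedding provider available")
```

Passing the probes in as callables keeps the function testable without a running Ollama instance, and the probe is only invoked when the local path has not been disabled.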

CLI

attestor --help lists everything. The most useful commands:

Command | Purpose
attestor quickstart | Zero-question local install: backends + config + MCP/hooks + doctor
attestor teardown | Reverse quickstart (containers + config + MCP/hooks; --purge wipes data)
attestor init | Create a starter store config (lower-level; quickstart is the easy path)
attestor doctor | Health-check every store + the retrieval pipeline
attestor add / recall / search / list | CRUD-ish memory ops
attestor timeline | Entity timeline (uses bi-temporal manager)
attestor stats | Store statistics
attestor export / import | JSON dump / restore
attestor compact | Remove archived memories
attestor update / forget | Mutate / archive a memory
attestor inspect | Inspect raw database state
attestor api | Start the Starlette REST API
attestor serve <path> | Start MCP stdio server (positional-path variant)
attestor mcp [--path …] | Start MCP stdio server (zero-config; default for Claude Desktop / Cursor / Windsurf)
attestor ui | Read-only browser UI for the store
attestor hook {session-start, post-tool-use, stop} | Run a Claude Code lifecycle hook
attestor lme / locomo / mab | Built-in benchmark runners (see §Evaluation)

MCP server

attestor mcp (or attestor serve <path>) exposes an MCP stdio server with eight tools:

Tool | Purpose
memory_add | Write a memory with provenance
memory_get | Fetch one memory by id
memory_recall | Run the full retrieval pipeline
memory_search | Filtered list (entity / category / time / namespace)
memory_forget | Archive a memory by id
memory_timeline | Chronology for an entity
memory_stats | Store statistics
memory_health | Per-role health snapshot; call this first when integrating

Plus MCP resources (memory listings) and prompts (canned recall prompts for IDE assistants).
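On the wire, an MCP tool invocation is a JSON-RPC 2.0 request with method tools/call. The helper below builds such a message for memory_health; the helper name is hypothetical, and the argument schemas of the other tools are not shown because the text does not list them.

```python
import json
from typing import Any, Dict, Optional


def tool_call_request(
    request_id: int, tool: str, arguments: Optional[Dict[str, Any]] = None
) -> str:
    """Serialize one MCP tools/call request as a single-line JSON string."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments or {}},
    }
    # No indent: the stdio transport frames messages with newlines.
    return json.dumps(message)


print(tool_call_request(1, "memory_health"))
```

A real client sends this only after the MCP initialize handshake; in practice an MCP SDK or the host application (Claude Desktop, Cursor) handles that for you.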


Hooks (Claude Code)

Three lifecycle hooks ship in attestor/hooks/:

  • session_start — injects relevant memories into the session context based on cwd / repo
  • post_tool_use — auto-captures useful artifacts from Write / Edit / Bash
  • stop — writes a session summary on exit

Wire them up via the installer (next section) or by hand in ~/.claude/settings.json.
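For the by-hand route, the snippet below prints a hooks block that could be merged into ~/.claude/settings.json. It assumes Claude Code's hooks schema (event name, optional matcher, list of command hooks) and maps each event to the attestor hook subcommands from the CLI table; treat the exact commands and matcher as assumptions and compare against what the installer writes.

```python
import json
from typing import Any, Dict, Optional


def hook_entry(command: str, matcher: Optional[str] = None) -> Dict[str, Any]:
    """One hook group: an optional tool matcher plus a shell command."""
    entry: Dict[str, Any] = {"hooks": [{"type": "command", "command": command}]}
    if matcher is not None:
        entry["matcher"] = matcher
    return entry


def build_hooks_config() -> Dict[str, Any]:
    return {
        "hooks": {
            "SessionStart": [hook_entry("attestor hook session-start")],
            # Only fire after the tools the text says are captured.
            "PostToolUse": [
                hook_entry("attestor hook post-tool-use", matcher="Write|Edit|Bash")
            ],
            "Stop": [hook_entry("attestor hook stop")],
        }
    }


print(json.dumps(build_hooks_config(), indent=2))
```

The script only prints the JSON; merging it into an existing settings file by hand avoids clobbering hooks that other tools have registered.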


Install for Claude Code

The one command. pipx install attestor then attestor quickstart — zero questions, one default profile. It brings up the local backends, uses a local Ollama bge-m3 embedder (no cloud key), wires the MCP server (./.mcp.json) + lifecycle hooks, runs attestor doctor, and prints every step. Reverse it any time with attestor teardown.

pipx install attestor && attestor quickstart    # install (zero questions)
attestor teardown                                # uninstall (--purge also wipes data volumes)

Prerequisites: Docker running + Ollama serving bge-m3 (ollama pull bge-m3). quickstart's preflight scans for these and reports what's missing — it never prompts.

Driving it from inside Claude Code (plugin). Install the plugin once, then run the command it provides:

/plugin marketplace add bolnet/attestor     # one-time
/plugin install attestor                     # then ENABLE it in the /plugin → Installed menu
/attestor:install-attestor                   # runs `attestor quickstart` for you

Plugin commands are namespaced: the command is /attestor:install-attestor (and /attestor:uninstall-attestor), not a bare /install-attestor. A freshly installed plugin may start out disabled; enable it in the /plugin → Installed menu and run /reload-plugins, or the command won't resolve.

Memory is isolated per project automatically — each working directory (git root, else cwd) is its own hard-isolated tenant, so projects never share memory. No namespace to configure.
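The "git root, else cwd" rule can be sketched with pathlib. How Attestor actually derives the tenant identifier is not stated, so hashing the resolved path, the 16-character truncation, and both function names are assumptions.

```python
import hashlib
from pathlib import Path


def project_root(start: Path) -> Path:
    """Nearest ancestor containing .git, else the starting directory itself."""
    start = start.resolve()
    for candidate in (start, *start.parents):
        if (candidate / ".git").exists():
            return candidate
    return start


def tenant_id(start: Path) -> str:
    """Stable identifier derived from the project root path (assumed scheme)."""
    root = str(project_root(start))
    return hashlib.sha256(root.encode("utf-8")).hexdigest()[:16]


print(tenant_id(Path.cwd()))
```

Any subdirectory of the same repository resolves to the same root and therefore the same tenant, while a sibling repository resolves to a different one.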

The local backends come up as three Docker containers (the bundled attestor/infra/local/docker-compose.yml, which quickstart runs):

Container | Type | Storage role
attestor_postgres_document_db | Postgres 16 + pgvector | Document (source of truth)
attestor_pinecone_vector_db | Pinecone Local | Vector (embeddings)
attestor_neo4j_graph_db | Neo4j 5 + GDS | Graph (PageRank / BFS)

Every container, volume, and the compose network/project is named attestor_…, so docker ps -a | grep attestor (and docker volume ls | grep attestor) lists everything Attestor owns.

Cloud / managed backends (Neon · RDS · Cloud SQL, Pinecone Cloud, Neo4j AuraDB) and alternative embedders (Pinecone Inference llama-text-embed-v2, Voyage voyage-4, OpenAI text-embedding-3) are configured in ~/.attestor/attestor.yaml — see docs/INSTALL.md.


Install as a Skill (2026 agent SDKs)

Attestor ships with a canonical SKILL.md at skills/attestor-memory/SKILL.md. Both Anthropic (skills-2025-10-02) and OpenAI's Responses API converged on this format — a markdown file with YAML frontmatter — for distributing reusable agent expertise. The wheel ships the SKILL.md, so every 2026-grade harness can auto-discover it after a single pip install attestor.

The skill teaches the agent the six core primitives (recall, add, timeline, current_facts, forget, audit) plus the v4 enterprise surface (bi-temporal as_of replay, RBAC roles, namespace isolation, provenance signing, GDPR export / purge). Every code example references methods that actually exist on attestor.AgentMemory, and a CI test (tests/test_skill_md.py) keeps the SKILL.md from drifting from the live API.

To pin the contract in your own host:

pip install attestor
python -c "import attestor, importlib.resources as r; print(r.files('attestor'))"   # confirm wheel installed
# Point your agent harness at the bundled SKILL.md or read it directly:
python -c "from pathlib import Path; import attestor; \
  print((Path(attestor.__file__).parent.parent / 'skills' / 'attestor-memory' / 'SKILL.md').read_text())"

Evaluation

Boundary statement. The dual-LLM judge stack is a benchmarking mechanism, not the runtime contract. Recall in production is single-pipeline and deterministic. Multiple judges score answers in evaluation only — never in user-facing reads.

Runner | Source | Measures
attestor lme | LongMemEval (benchmark for long-term memory in chat assistants) | answer accuracy under long history, distillation, dual-judge cross-family
attestor locomo | LoCoMo | conversational long-memory consistency
attestor mab | MultiAgentBench | multi-agent coordination
AbstentionBench (CI gate) | internal | when not to answer (known unknowns)
scripts/lme_smoke_local.py | dual-LLM smoke | quick install verification (see Quick Start §6)

The smoke driver mirrors the canonical published-benchmark stack exactly. See --help for the full env-var / CLI-flag override matrix.


Project layout

attestor/
  core.py                  -- AgentMemory (main public API)
  client.py                -- MemoryClient (HTTP drop-in for remote Attestor)
  context.py               -- AgentContext, AgentRole, Visibility
  models.py                -- Memory, RetrievalResult, ContextPack
  cli.py                   -- attestor CLI entry point
  api.py                   -- Starlette ASGI REST API
  longmemeval.py           -- LongMemEval benchmark runner (dual-judge)
  locomo.py                -- LoCoMo runner
  doctor_v4.py             -- v4 schema + invariant validator
  init_wizard.py           -- interactive install flow
  store/
    base.py                -- DocumentStore / VectorStore / GraphStore protocols
    registry.py            -- backend selection
    connection.py          -- config layering / env resolution
    embeddings.py          -- provider auto-detect (Ollama / OpenAI / Bedrock / Vertex / Azure)
    postgres_backend.py    -- pgvector (document + vector roles)
    neo4j_backend.py       -- Neo4j + GDS (graph role)
    arango_backend.py      -- all 3 roles in one
    aws_backend.py         -- DynamoDB + OpenSearch Serverless + Neptune
    azure_backend.py       -- Cosmos DB DiskANN + NetworkX
    gcp_backend.py         -- AlloyDB pgvector + AGE + ScaNN
    schema.sql             -- v4 Postgres schema (RLS, bi-temporal columns, content_tsv)
  conversation/
    ingest.py              -- ingest_round() pipeline
  extraction/
    round_extractor.py     -- 2-pass speaker-locked extraction
    conflict_resolver.py   -- 4-decision contract (ADD/UPDATE/INVALIDATE/NOOP)
    rule_based.py          -- deterministic fact extraction (no LLM)
    prompts.py             -- shared prompt templates
  consolidation/
    consolidator.py        -- sleep-time re-extraction
    reflection.py          -- cross-thread synthesis (stable patterns + flagged contradictions)
  graph/
    extractor.py           -- entity / relation extraction
  retrieval/
    orchestrator.py        -- 6-step semantic-first pipeline
    tag_matcher.py
    scorer.py              -- MMR, confidence decay, entity boost, fit-to-budget
    trace.py               -- JSONL trace writer
  temporal/
    manager.py             -- timelines, supersession, contradiction detection, as_of replay
  identity/
    signing.py             -- Ed25519 provenance signing (optional)
    defaults.py            -- SOLO mode auto-provisioning
  mcp/
    server.py              -- MCP server (tools, resources, prompts)
  hooks/
    session_start.py
    post_tool_use.py
    stop.py
  ui/
    app.py                 -- Starlette read-only viewer
    static/, templates/    -- Evidence Board UI
  utils/
    config.py, tokens.py
  infra/
    local/                 -- Docker Compose (Postgres + Pinecone Local + Neo4j)
    aws_arango/            -- Reference Terraform
tests/                     -- Unit tests; live cloud tests env-gated
evals/                     -- LongMemEval / LoCoMo / MultiAgentBench / AbstentionBench harnesses
docs/                      -- Architecture notes, ADRs
commands/                  -- /install-attestor, etc.
scripts/                   -- lme_smoke_local.py, etc.

Development

poetry install
poetry run pytest tests/ -q                          # unit tests, no external services needed
ATTESTOR_LIVE_PG=1 poetry run pytest tests/live -q   # live integration (env-gated)

Style: black formatting, isort imports, ruff lint, mypy types. PEP 8, type-annotated signatures, dataclasses for DTOs. Many small files (200–400 lines typical, 800 max).

Conventions worth knowing:

  • Postgres is the source of truth. Neo4j is derived; rebuild it from Postgres if it drifts.
  • Non-fatal errors in vector / graph paths are caught and logged. The document path never silently breaks.
  • Configuration layering: env vars → ~/.attestor.toml → in-code overrides.
  • Two write paths: add() for structured (lightweight rule-based supersession), ingest_round() for conversational (full 2-pass + 4-decision contract).

Health check

Always call this first when integrating:

attestor doctor                        # CLI

from attestor import AgentMemory      # Python API
mem = AgentMemory()
print(mem.health())

// MCP
{ "tool": "memory_health" }

It probes Document Store (Postgres), Vector Store (pgvector), Graph Store (Neo4j), and the retrieval pipeline. All four are required for the default topology — graph expansion is step 2 of the canonical pipeline, not an optional accelerator. Transient vector-probe failures surface in the recall() trace (vector_error) so callers can distinguish a degraded result from a clean one.


Status & versioning

  • Version: 4.1.6 (stable) — published to PyPI and the MCP Registry as io.github.bolnet/attestor. pip install attestor returns the latest 4.1.x (no --pre flag needed).
  • v3 → v4: greenfield rebuild on a v4-native Postgres schema with hard tenant isolation, bi-temporal facts, and a no-LLM retrieval critical path. There is no automated migration. v3 was alpha-only with no production users; drop your v3 DB and reinstall.
  • See CHANGELOG.md for the full track-by-track changelog.

License

MIT. See LICENSE.
