ANCHOR, Agent-Native Canvas to Help Organize Resources with PDF ingest, FMU simulation, and source-grounded provenance

These details have not been verified by PyPI

Project description

ANCHOR

Agent-Native Canvas to Help Organize Resources
Source-Grounded Knowledge Canvas for Traceable Engineering Document Extraction

ANCHOR is a tool that lets you and your agent work with engineering documents.

Drop a PDF onto a canvas. The agent reads it and pulls the values you need into a spec table. Every value links back to the page and bounding box it came from, so you can click and see the source.

Drop FMU simulation models onto the same canvas and wire the extracted values into their parameters.

It runs on your laptop. A project is a folder: run anchor init in it, and its documents and canvases live in a hidden .anchor_data/ right there. Agents talk to it over MCP, so it works with Claude Code, Cursor, Claude Desktop, or any MCP client. There's an HTTP API and a CLI too.

First five minutes: docs/getting-started/tutorial.md.

Install

Two paths, depending on whether you want to use ANCHOR or hack on it.

Use it (from PyPI)

uv tool install anchor-kb

anchor and anchor-mcp are now on your PATH globally. The wheel includes the prebuilt frontend, so no Node toolchain is required to just run it.

If you want LLM-backed gold region extraction on your first PDF upload, create a .env file before starting ANCHOR; see Enable gold region extraction. Installation itself does not require an API key.

anchor serve              # -> http://127.0.0.1:8002

Requires Python >= 3.12. CI tests Linux and runs CLI smoke checks on macOS and Windows; verify browser and PDF workflows on your target platform.

If you prefer pipx or plain pip:

pipx install anchor-kb
# or, in a virtualenv:
pip install anchor-kb

Optional extras:

Extra	Install	Adds
`fmus`	`uv tool install 'anchor-kb[fmus]'`	FMU simulation runtime (`fmpy`). Without it, FMU tools fail closed unless you opt into the synthetic demo with `ANCHOR_FMU_DEMO=1`.

Hack on it (from source)

git clone https://github.com/Novia-RDI-Seafaring/anchor
cd anchor
uv sync --extra dev          # adds pytest, ruff, import-linter
pnpm --dir web install

Start the backend in one terminal:

uv run anchor serve

Start the frontend development server in a second terminal:

pnpm --dir web dev

Open http://localhost:5173 for the development UI. The backend remains on http://127.0.0.1:8002.

Source development requires Node.js 20+ and pnpm 10. If pnpm is not installed globally, use the Corepack form instead: corepack pnpm@10 --dir web install and corepack pnpm@10 --dir web dev.

For normal use, run anchor init in a project folder. It writes an anchor.toml marker and a hidden .anchor_data/ there, and binds the project to an environment (the provider and data zone). Commands run inside the folder then resolve it automatically. Configuration precedence is: explicit flags, ANCHOR_* environment variables, the project anchor.toml, the environment env.toml, then built-in defaults.
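The precedence order above can be pictured as a stack of lookups where the first source that defines a key wins. The following is a minimal sketch of that idea, not ANCHOR's actual resolver: the function name `resolve_settings`, the flat key names, and the rule that `ANCHOR_DPI` maps to `dpi` are all assumptions made for illustration.

```python
from collections import ChainMap
from typing import Mapping

# Defaults quoted in this README (bind address, port, DPI).
# The flat key names are an assumption for this sketch.
DEFAULTS = {"host": "127.0.0.1", "port": "8002", "dpi": "150"}


def resolve_settings(
    flags: Mapping[str, str],
    environ: Mapping[str, str],
    project_toml: Mapping[str, str],
    env_toml: Mapping[str, str],
) -> ChainMap:
    """Layer the five sources so that earlier ones shadow later ones."""
    # Keep only ANCHOR_* variables; ANCHOR_DPI -> "dpi" is an assumed naming rule.
    from_environ = {
        key.removeprefix("ANCHOR_").lower(): value
        for key, value in environ.items()
        if key.startswith("ANCHOR_")
    }
    # ChainMap searches its maps left to right, matching the documented order:
    # flags, ANCHOR_* variables, project anchor.toml, environment env.toml, defaults.
    return ChainMap(dict(flags), from_environ, dict(project_toml), dict(env_toml), DEFAULTS)


settings = resolve_settings(
    flags={"port": "9000"},
    environ={"ANCHOR_DPI": "200", "HOME": "/home/me"},
    project_toml={"dpi": "120", "region_model": "project-model"},
    env_toml={"region_model": "env-model", "provider": "local"},
)
print(settings["port"])          # from the explicit flag
print(settings["dpi"])           # from the ANCHOR_DPI variable
print(settings["region_model"])  # from the project marker
print(settings["provider"])      # from the environment file
print(settings["host"])          # from the built-in defaults
```

A layered lookup like this keeps each source untouched, which makes it easy to report where a value came from when debugging a surprising setting.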

Releases are tag-driven: pushing a v* tag triggers the release workflow, which publishes to PyPI via OIDC trusted publishing (no token sits in the repo). See PUBLISHING.md for the full release process.

Quick start

To produce gold regions, configure a .env file before running anchor demo or anchor ingest; see Enable gold region extraction. Without it, ANCHOR still produces the silver document extraction.

# 0. One-shot: seed a `demo` workspace with six placeholder spec slots
#    and start the server. If you already have the optional local demo PDF,
#    ANCHOR ingests it too; otherwise ingest your own PDF in step 3.
anchor demo

# Or step by step:

# 1. Start a project in a folder. The first time, `anchor init` asks you to
#    pick a provider. This is the ENVIRONMENT: the trust boundary that decides
#    where document content may go (local keeps everything on your machine).
#    Pass --provider to skip the prompt, e.g. `anchor init --provider local`.
mkdir my-anchor-project
cd my-anchor-project
anchor init
anchor canvas create my-first-canvas

# 2. Start the server (serves the canvas UI, HTTP API, and browser SSE updates)
anchor serve

# 3. (in another terminal) Ingest a PDF
anchor ingest /path/to/datasheet.pdf

# 4. Open http://localhost:8002/c/my-first-canvas in your browser

See docs/getting-started/tutorial.md for a guided tour that runs anchor demo and then has an agent fill the placeholders.

That's the whole loop. Every PDF you ingest becomes a structured set of regions on disk; every canvas you create is a folder you can zip and email.

Using ANCHOR with an AI agent

ANCHOR exposes its tools over MCP (Model Context Protocol). For Claude Code, the quickest path is the plugin marketplace. It needs no prior install of the anchor CLI, only uv:

/plugin marketplace add Novia-RDI-Seafaring/anchor
/plugin install anchor@anchor

The plugin registers the MCP server (via uvx --from anchor-kb anchor-mcp) and the anchor skill in one step. See the Claude Code plugin guide for details.

Alternatively, with the CLI already installed, register the local stdio server with:

anchor install claude-code

Pick one of the two paths, not both.

Open Claude Code inside a folder configured with anchor init. In any conversation, run /mcp and you should see anchor listed with its available tools. The exact list depends on optional extensions such as FMU support. Then talk normally:

"Ingest the PDF at ~/Downloads/lkh-pump.pdf and create a canvas called pump-analysis with a document node for it."

"What does the document say about max inlet pressure for the LKH-5 at 50 Hz? Place the answer as a fact card on the pump-analysis canvas, with an evidence edge back to the source page."

Claude calls the MCP tools directly. Your browser tab on localhost:8002/c/pump-analysis, if open, sees nodes appear live via SSE. Multi-client real-time sync between agents and humans is the default.

For Cursor:

anchor install cursor

anchor install claude-code and anchor install cursor wire the MCP server for the default environment. For Claude Desktop, or to serve a specific environment, use anchor install claude-desktop --env <name>; it writes a named entry, echoes the data zone before wiring, and is collision-safe. See the agent setup guide.

See the agent configuration guide for Codex, OpenCode, Cursor, Claude Code, and generic stdio examples.

Where data lives

A project is a folder. Its corpus and canvases live in a hidden .anchor_data/ beside an anchor.toml marker. Everything is plain files: tar it, mail it, diff it in git.

your-project/
├── anchor.toml             # binds this folder to an environment (provider + data zone)
└── .anchor_data/
    ├── bronze/<slug>/      # raw PDFs (your originals)
    ├── silver/<slug>/      # Docling extraction + per-page markdown + page PNGs
    ├── gold/<slug>/        # structured regions with page + bbox provenance
    └── canvases/<slug>/    # meta.json, state.json, events.jsonl (append-only log)

This layout is the contract. You can hand-edit the JSON, copy a canvas folder to another machine, or version-control the whole project. The file-level detail is in On-disk substrate.
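Because the layout is plain directories, a few lines of standard-library Python can inspect a project without going through ANCHOR at all. The sketch below only relies on the directory names shown in the tree above; the helper names are hypothetical, and it assumes that a document with a silver/<slug>/ directory but no gold/<slug>/ directory is silver-only.

```python
import tempfile
from pathlib import Path

# Layer directory names as shown in the project tree.
LAYERS = ("bronze", "silver", "gold", "canvases")


def summarize_project(project_dir: Path) -> dict[str, list[str]]:
    """Return the slugs found under each .anchor_data/ layer directory."""
    data_dir = project_dir / ".anchor_data"
    summary: dict[str, list[str]] = {}
    for layer in LAYERS:
        layer_dir = data_dir / layer
        if layer_dir.is_dir():
            summary[layer] = sorted(p.name for p in layer_dir.iterdir() if p.is_dir())
        else:
            summary[layer] = []
    return summary


def silver_only(summary: dict[str, list[str]]) -> list[str]:
    """Slugs that have silver output but no gold directory (assumed meaning)."""
    return sorted(set(summary["silver"]) - set(summary["gold"]))


# Build a throwaway project that mimics the documented layout.
with tempfile.TemporaryDirectory() as tmp:
    project = Path(tmp)
    for relative in (
        "bronze/pump", "bronze/valve",
        "silver/pump", "silver/valve",
        "gold/valve",
        "canvases/demo",
    ):
        (project / ".anchor_data" / relative).mkdir(parents=True)
    summary = summarize_project(project)
    pending = silver_only(summary)

print(summary)
print(pending)
```

In this throwaway project, `pending` lists the documents that would need re-ingesting to gain gold regions.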

A project created by an agent (no working folder) is managed under its environment at ~/.anchor/envs/<env>/projects/<name>/, with the same .anchor_data/ inside. A pre-existing ~/anchor-data from older versions keeps working until you run anchor migrate.

Configuration

Provider, model, and data-zone settings live in an environment's env.toml, created with anchor env create <name>. Run anchor init in a folder to start a project bound to an environment; it writes an anchor.toml marker (and a hidden .anchor_data/) there, and any per-project overrides go in that marker. The first time, if no environment exists yet, anchor init asks you to pick a provider (your data zone) unless you pass --provider. It never silently picks a trust boundary for you. See Environments and projects for the full model. Set the server bind address with the CLI flags --host and --port. The following ANCHOR_* environment variables override the resolved settings:

Variable	Default	Purpose
`ANCHOR_OPENAI_API_KEY`	(unset)	Optional: enables LLM polish + region extraction in the gold layer. Required for Azure and custom endpoints.
`ANCHOR_OPENAI_BASE_URL`	(unset)	Override the OpenAI-compatible endpoint. For Azure OpenAI v1 use `https://<resource>.openai.azure.com/openai/v1/`; for Ollama use `http://localhost:11434/v1`.
`ANCHOR_POLISH_MODEL`	`gpt-5.4`	Model name for page-MD polishing
`ANCHOR_REGION_MODEL`	`gpt-5.4`	Model name for region extraction
`ANCHOR_EMBED_MODEL`	`BAAI/bge-small-en-v1.5`	Local sentence-transformer model used by default for semantic search. Recorded in every `embeddings.json` so cross-model search refuses to mix vectors.
`ANCHOR_DPI`	`150`	Render DPI for page images
`ANCHOR_CORS_ORIGINS`	(unset)	Comma-separated additional origins permitted by the HTTP server
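The ANCHOR_EMBED_MODEL row says the model name is recorded in every embeddings.json so that vectors from different models are never mixed. A guard of that kind might look like the sketch below. The JSON keys `model` and `vectors`, the function name, and the exception class are assumptions for illustration; the real file schema is not described in this README.

```python
import json
import tempfile
from pathlib import Path


class EmbeddingModelMismatch(ValueError):
    """Raised when stored vectors were produced by a different model."""


def load_vectors(path: Path, query_model: str) -> list[list[float]]:
    """Load vectors only if they were built with the model used for the query."""
    payload = json.loads(path.read_text(encoding="utf-8"))
    stored_model = payload["model"]  # assumed key name
    if stored_model != query_model:
        raise EmbeddingModelMismatch(
            f"index built with {stored_model!r}, query uses {query_model!r}"
        )
    return payload["vectors"]  # assumed key name


with tempfile.TemporaryDirectory() as tmp:
    index_path = Path(tmp) / "embeddings.json"
    index_path.write_text(
        json.dumps({"model": "BAAI/bge-small-en-v1.5", "vectors": [[0.1, 0.2]]}),
        encoding="utf-8",
    )
    vectors = load_vectors(index_path, "BAAI/bge-small-en-v1.5")
    try:
        load_vectors(index_path, "some-other-model")
        refused = False
    except EmbeddingModelMismatch:
        refused = True

print(vectors, refused)
```

Refusing outright, rather than searching anyway, matters because similarity scores between vectors from different models are not meaningful.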

If no usable vision key is configured, ingest still produces silver (deterministic Docling extraction + per-page markdown). Gold extraction (LLM-driven structured regions) is skipped. The system stays useful without an API key: silver is the workable substrate; gold is the polish.

Enable gold region extraction

Gold regions are created during PDF ingestion only. Configure a vision-capable LLM endpoint before uploading a document or running anchor ingest. Documents already ingested as silver-only are not backfilled automatically; ingest them again after enabling a provider.

ANCHOR reads .env from the project folder: the one where you run anchor init and then start anchor serve, anchor demo, or anchor ingest. If you installed with uv tool install anchor-kb, create that .env file in your chosen project directory before the first upload.

For OpenAI, create .env containing:

ANCHOR_OPENAI_API_KEY=<your-openai-api-key>
ANCHOR_POLISH_MODEL=gpt-5.4
ANCHOR_REGION_MODEL=gpt-5.4

For Azure OpenAI, ANCHOR currently supports the Azure OpenAI v1 endpoint through the standard OpenAI-compatible client using API-key authentication. The key must be the Azure resource key. A personal OPENAI_API_KEY in your shell is not proof that the Azure project is configured.

ANCHOR_OPENAI_API_KEY=<your-azure-openai-key>
ANCHOR_OPENAI_BASE_URL=https://<resource-name>.openai.azure.com/openai/v1/
ANCHOR_POLISH_MODEL=<vision-capable-deployment-name>
ANCHOR_REGION_MODEL=<vision-capable-deployment-name>

The Azure deployment name, not the base model name, is used as the model, and the deployment must support image input and JSON-formatted chat completion output. Azure Entra ID authentication and the older Azure deployment/API-version endpoint shape cannot currently be configured through ANCHOR environment variables. See Microsoft's Azure OpenAI v1 API documentation for endpoint details.

You can let ANCHOR write the non-secret environment settings for you. Create an environment for the zone, drop the key into its gitignored .env, then bind a project folder to it:

anchor env create azure --provider azure \
  --base-url https://<resource-name>.openai.azure.com/ \
  --vision-model <vision-capable-deployment-name>
echo 'ANCHOR_OPENAI_API_KEY=<your-azure-openai-key>' >> ~/.anchor/envs/azure/.env
anchor check --env azure --probe       # confirm the deployment + key, sends no documents

cd your-project && anchor init --env azure

From the same directory as .env, start ANCHOR and upload the PDF in the UI:

anchor serve

Alternatively, ingest a file directly from the same directory:

anchor ingest "C:\path\to\datasheet.pdf" --force

Successful gold extraction writes structured regions under the project's .anchor_data/gold/<doc-slug>/ and returns a non-zero region_count when regions are identified. Verify with:

anchor list
anchor gold-map <doc-slug>

In anchor list, the document should show "has_gold": true. If it does not, check ANCHOR_OPENAI_API_KEY, the /openai/v1/ base URL, and that ANCHOR_REGION_MODEL is the Azure deployment name.
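The three checks in the troubleshooting note can be run against a .env file before any document is ingested. This is a small sketch under stated assumptions: `parse_dotenv` handles only simple KEY=VALUE lines (not the full dotenv syntax), `azure_gold_problems` is a hypothetical helper, and ANCHOR's own `anchor check --probe` remains the authoritative test.

```python
def parse_dotenv(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values


def azure_gold_problems(env: dict[str, str]) -> list[str]:
    """Return human-readable problems for an Azure OpenAI v1 setup."""
    problems: list[str] = []
    if not env.get("ANCHOR_OPENAI_API_KEY"):
        problems.append("ANCHOR_OPENAI_API_KEY is missing")
    base_url = env.get("ANCHOR_OPENAI_BASE_URL", "")
    if not base_url.rstrip("/").endswith("/openai/v1"):
        problems.append("ANCHOR_OPENAI_BASE_URL should end with /openai/v1/")
    if not env.get("ANCHOR_REGION_MODEL"):
        # Without it the default model name applies, not the Azure deployment name.
        problems.append("ANCHOR_REGION_MODEL is unset; use the Azure deployment name")
    return problems


# Placeholder values only; substitute your own resource and deployment names.
good = parse_dotenv(
    """
    ANCHOR_OPENAI_API_KEY=example-key
    ANCHOR_OPENAI_BASE_URL=https://example-resource.openai.azure.com/openai/v1/
    ANCHOR_REGION_MODEL=example-deployment
    """
)
bad = parse_dotenv("ANCHOR_OPENAI_BASE_URL=https://example-resource.openai.azure.com/\n")

print(azure_gold_problems(good))
for problem in azure_gold_problems(bad):
    print("-", problem)
```

A static check like this cannot tell whether the deployment really accepts image input; it only catches the configuration mistakes named above.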

For Ollama / local-LLM recipes, see docs/guides/agent-setup.md.

Commands

Run most commands from inside a project folder and they resolve it automatically (via the anchor.toml marker). Full reference: docs/reference/cli.md.

# Environments (the provider / data-zone profile = the trust boundary)
anchor env create NAME [--provider local|ollama|openai|azure|custom] [--base-url …] [--vision-model …]
anchor env list | show NAME | default NAME | set-description NAME "…"

# Projects (a folder = a corpus + its canvases)
anchor init [NAME] [--env NAME] [--provider …]   # start a project in this folder
anchor project create NAME [--env NAME]          # a managed project under the env
anchor project list | set-description | move NAME --to ENV
anchor use ENV [PROJECT]                          # session default so you can omit --env
anchor migrate                                    # fold a legacy ~/anchor-data in
anchor check [--env NAME] [--probe] [--fix]       # audit the data zone before ingesting

# Documents + search
anchor ingest PDF_PATH [--skip-polish] [--skip-regions] [--force]
anchor list | index SLUG | regions SLUG [--page N] | page-text SLUG PAGE
anchor embed [SLUG] [--overwrite] | search "<query>" [--k N]

# Canvas + server
anchor serve [--env NAME] [--project NAME] [--host HOST] [--port PORT]
anchor demo  [--no-serve]
anchor canvas list | create SLUG [--title TITLE] | placeholders SLUG | snapshot SLUG

# Agents (write a named MCP pointer for an environment)
anchor install claude-desktop --env NAME [--name ENTRY] [--create]
anchor install claude-code [--env NAME] | cursor [--env NAME] | print

# Extensions (OIP producers) + misc
anchor extensions list | info NAME | add MANIFEST | remove NAME | discover | schema
anchor version

anchor-mcp --env NAME runs the MCP server over stdio for one environment (used by an agent's MCP harness; you don't normally invoke it yourself). --data-dir is still accepted on the document/canvas commands to point at a raw storage dir, but the project folder is the usual way in.

Architecture (one paragraph)

ANCHOR is a hexagonal modular monolith. Pure domain code lives in core/ (no I/O and no framework imports, enforced by lint-imports), concrete protocol implementations live in infra/, and transport adapters (HTTP, MCP, CLI, SSE) live in adapters/. The Python wheel ships the React frontend bundle inside it (anchor/_web_dist/), so one process serves both the API and the UI. State changes are events: each is persisted to the canvas's events.jsonl and broadcast to subscribers (agents on MCP, browsers on SSE). See the architecture docs.
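To make the "state changes are events" idea concrete, here is a minimal sketch of an append-only JSONL log that notifies subscribers and can be replayed into state. It is an illustration of the pattern, not ANCHOR's code: the class name, the event fields (`type`, `id`, `data`), and the `node_added` / `node_removed` event types are all hypothetical.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable

Event = dict
Subscriber = Callable[[Event], None]


class CanvasEventLog:
    """Append-only JSONL log that notifies subscribers after each write."""

    def __init__(self, path: Path) -> None:
        self.path = path
        self._subscribers: list[Subscriber] = []

    def subscribe(self, callback: Subscriber) -> None:
        self._subscribers.append(callback)

    def append(self, event: Event) -> None:
        # Persist first, then broadcast, so subscribers never see unsaved events.
        with self.path.open("a", encoding="utf-8") as handle:
            handle.write(json.dumps(event, sort_keys=True) + "\n")
        for callback in self._subscribers:
            callback(event)

    def replay(self) -> list[Event]:
        if not self.path.exists():
            return []
        with self.path.open(encoding="utf-8") as handle:
            return [json.loads(line) for line in handle if line.strip()]


def fold_nodes(events: list[Event]) -> dict[str, dict]:
    """Rebuild node state from the (hypothetical) node_added/node_removed events."""
    nodes: dict[str, dict] = {}
    for event in events:
        if event["type"] == "node_added":
            nodes[event["id"]] = event["data"]
        elif event["type"] == "node_removed":
            nodes.pop(event["id"], None)
    return nodes


with tempfile.TemporaryDirectory() as tmp:
    log = CanvasEventLog(Path(tmp) / "events.jsonl")
    seen: list[str] = []
    log.subscribe(lambda event: seen.append(event["type"]))  # stands in for an SSE client
    log.append({"type": "node_added", "id": "n1", "data": {"label": "pump"}})
    log.append({"type": "node_added", "id": "n2", "data": {"label": "valve"}})
    log.append({"type": "node_removed", "id": "n1"})
    state = fold_nodes(log.replay())

print(seen)
print(state)
```

Because the log is the source of truth, any client that missed a broadcast can rebuild the same state by replaying the file.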

Extensions and the Open Ingestion Protocol

ANCHOR's canvas is one OIP consumer. PDF ingestion is one OIP producer, bundled with this build. The protocol, specified at github.com/Novia-RDI-Seafaring/OIP, is governance-neutral: any tool that produces ingested knowledge in OIP shape can plug in, and any OIP-aware consumer can read its output. A transcription tool, a code-region extractor, a web crawler, or your own ingestion logic does not need to import ANCHOR. It only needs to ship an OIP manifest at a known location.

The CLI surfaces this:

anchor extensions list                        # what producers can this ANCHOR see?
anchor extensions discover                    # where does it look for manifests?
anchor extensions add <path-to-manifest.json> # register a new producer (system-wide)
anchor extensions schema                      # print a starter manifest to edit
anchor extensions info anchor-pdfs            # full manifest for one producer

Discovery, in priority order:

1. Per-data-dir: <data-dir>/.oip/producers.d/*.json (highest priority; bound to a specific workspace tree)
2. System-wide: ~/.config/oip/producers.d/*.json (any installer can drop a manifest here; visible to every OIP consumer on the machine)
3. Bundled: compiled into this ANCHOR wheel (anchor-pdfs, anchor-fmus, and anchor-cad; SysML tools are also exposed by the bundled MCP server)
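One way to read the priority order is that a manifest found in a higher-priority location shadows a same-named manifest from a lower one. The sketch below implements that reading. It is an assumption-laden illustration: the `name` key in manifests, the shadow-by-name rule, and the `discover_producers` helper are hypothetical, and the OIP spec is the authority on the real manifest shape.

```python
import json
import tempfile
from pathlib import Path

# Stand-ins for the producers bundled with the wheel (names from this README).
BUNDLED = {
    name: {"name": name, "origin": "bundled"}
    for name in ("anchor-pdfs", "anchor-fmus", "anchor-cad")
}


def discover_producers(
    data_dir: Path, system_dir: Path, bundled: dict[str, dict]
) -> dict[str, dict]:
    """Merge manifests so that higher-priority locations win on name collisions."""
    found = dict(bundled)
    # Walk from lower to higher priority; later writes overwrite earlier ones.
    for directory in (system_dir, data_dir / ".oip" / "producers.d"):
        if not directory.is_dir():
            continue
        for manifest_path in sorted(directory.glob("*.json")):
            manifest = json.loads(manifest_path.read_text(encoding="utf-8"))
            found[manifest["name"]] = {**manifest, "origin": str(manifest_path)}
    return found


with tempfile.TemporaryDirectory() as tmp:
    data_dir = Path(tmp) / "project-data"
    system_dir = Path(tmp) / "system" / "producers.d"  # stands in for ~/.config/oip/producers.d
    local_dir = data_dir / ".oip" / "producers.d"
    system_dir.mkdir(parents=True)
    local_dir.mkdir(parents=True)
    (system_dir / "transcriber.json").write_text(
        json.dumps({"name": "transcriber", "version": "1"}), encoding="utf-8"
    )
    (local_dir / "transcriber.json").write_text(
        json.dumps({"name": "transcriber", "version": "2"}), encoding="utf-8"
    )
    producers = discover_producers(data_dir, system_dir, BUNDLED)

print(sorted(producers))
print(producers["transcriber"]["version"])
```

Here the per-data-dir copy of `transcriber` wins over the system-wide one, while the bundled producers stay visible because nothing shadows them.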

Implementation status: today, an OIP-registered producer is visible in extensions list, but ANCHOR does not yet spawn external producer MCP servers or proxy their tools. That is the next engineering lift. See the OIP repo for the spec and EXTENSIONS.md for ANCHOR's host-side roadmap.

Tests

uv sync --extra dev                       # one-time: install pytest/ruff/import-linter
uv run pytest                             # ~570 backend tests
uv run lint-imports                       # 6 dependency-rule contracts
pnpm --dir web test                       # ~180 web tests (Vitest)
pnpm --dir web exec tsc --noEmit          # web typecheck

The test seam is function-based pytest with in-memory implementations of every port. Real I/O tests use tmp_path. The frontend tests cover canvas primitives, the SSE event store, and the inline-edit hooks.
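As an illustration of that test seam, the sketch below pairs a port (a `typing.Protocol`) with an in-memory implementation and a function-based test. Everything named here (`DocumentStore`, `InMemoryDocumentStore`, `page_word_count`) is hypothetical and chosen only to show the pattern; ANCHOR's real ports are not listed in this README.

```python
from typing import Protocol


class DocumentStore(Protocol):
    """Hypothetical port: domain code depends on this shape, not on the filesystem."""

    def save_text(self, slug: str, page: int, text: str) -> None: ...

    def load_text(self, slug: str, page: int) -> str: ...


class InMemoryDocumentStore:
    """In-memory implementation of the port, used in place of real I/O in tests."""

    def __init__(self) -> None:
        self._pages: dict[tuple[str, int], str] = {}

    def save_text(self, slug: str, page: int, text: str) -> None:
        self._pages[(slug, page)] = text

    def load_text(self, slug: str, page: int) -> str:
        return self._pages[(slug, page)]


def page_word_count(store: DocumentStore, slug: str, page: int) -> int:
    """Domain logic with no I/O of its own; everything goes through the port."""
    return len(store.load_text(slug, page).split())


def test_page_word_count_uses_the_port() -> None:
    store = InMemoryDocumentStore()
    store.save_text("pump", 1, "Max inlet pressure 10 bar")
    assert page_word_count(store, "pump", 1) == 5
```

pytest would collect `test_page_word_count_uses_the_port` automatically; a filesystem-backed implementation of the same port would be exercised separately with `tmp_path`.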

Status & roadmap

v0.2 (current): canvas primitive + PDF ingestion in one package, real-time SSE sync, MCP integration, folder-based projects under named environments (the data-zone / trust boundary), skill + pointer installers for Claude Code / Cursor / Claude Desktop, backend and web test suites, hexagonal contracts enforced.

Near-term: complete remaining node renderer and asset workflows, then stabilise the extension registration surface.

Mid-term: split the canvas primitive (anchor-canvas) and PDF extension (anchor-canvas-pdfs) into separately-publishable packages, and stabilise the extension contract for third-party authors.

Longer term: other ingestion extensions (audio/video transcription, code, web), shared org docs / personal canvases topology, optional Postgres event store for very large workspaces.

Security model: read before exposing

ANCHOR's HTTP server is unauthenticated by design. It edits local engineering data (workspaces, documents, FMU files) and is meant to run on your own machine.

- Default bind is 127.0.0.1 (loopback). Nothing else on the LAN can reach it unless you pass --host 0.0.0.0.
- CORS is restricted to the dev Vite origin (localhost:5173); set ANCHOR_CORS_ORIGINS=https://your-host for explicit overrides.
- Workspace slugs and upload filenames are policy-checked and containment-asserted before they hit disk. The v2 codebase does not trust client-supplied paths.
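"Policy-checked and containment-asserted" usually means two independent guards: the name must match an allow-list pattern, and the resolved path must still sit inside the intended root. The sketch below shows both with the standard library. The slug pattern and function names are assumptions for illustration; ANCHOR's actual policy may differ.

```python
import re
from pathlib import Path

# Assumed policy: lowercase letters, digits, and hyphens, at most 64 characters.
SLUG_PATTERN = re.compile(r"[a-z0-9][a-z0-9-]{0,63}")


def assert_contained(root: Path, candidate: Path) -> Path:
    """Resolve both paths and refuse anything that lands outside root."""
    resolved_root = root.resolve()
    resolved = candidate.resolve()
    if not resolved.is_relative_to(resolved_root):
        raise ValueError(f"{candidate} escapes {root}")
    return resolved


def canvas_dir_for(data_dir: Path, slug: str) -> Path:
    """Apply the slug policy first, then the containment check."""
    if not SLUG_PATTERN.fullmatch(slug):
        raise ValueError(f"slug rejected by policy: {slug!r}")
    root = data_dir / "canvases"
    return assert_contained(root, root / slug)


data_dir = Path("/tmp/example-project/.anchor_data")
print(canvas_dir_for(data_dir, "pump-analysis").name)

for attempt in ("../../etc", "Pump Analysis"):
    try:
        canvas_dir_for(data_dir, attempt)
    except ValueError as error:
        print("rejected:", error)
```

Keeping the containment check even after the pattern check is deliberate: it still holds if the pattern is later loosened or a symlink redirects a path.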

If you want to share an ANCHOR instance on a network, add your own reverse proxy with auth in front of it (Tailscale, OAuth proxy, basic-auth nginx, ...). Don't expose the unauthenticated port directly.

Limitations (v0.2)

These extensions are intentionally rough; we ship them so you can see the shape and contribute, not as finished features:

- anchor_cad: parametric-CAD producer (jscad/openSCAD) ships as a proof of concept; full feature parity with STEP/STL viewing is on the roadmap. SVG export still has a known font-handling bug.
- anchor_sysml: SysML import (BSD-3-Clause fixtures from the OMG reference) and export to SVG/markdown are experimental; we'll swap the hand-rolled IR for the official Pydantic model when that lands.
- anchor_fmus: FMU simulation requires fmpy (install via uv tool install 'anchor-kb[fmus]'). Without it the extension fails closed; set ANCHOR_FMU_DEMO=1 to use the synthetic-output runtime (every result is stamped synthetic=true so the UI can warn you).

License

MIT, see LICENSE.

Citation

If you use ANCHOR, please cite the software repository:

@misc{ANCHOR,
  author       = {Lamin Jatta and Christoffer Bj{\"o}rkskog and Mikael Manng{\aa}rd and Johan West{\"o}},
  title        = {ANCHOR: Agent-Native Canvas to Help Organize Resources for Traceable Engineering Document Extraction},
  year         = {2026},
  howpublished = {\url{https://github.com/Novia-RDI-Seafaring/anchor}},
}

GitHub-compatible citation metadata is provided in CITATION.cff.

Acknowledgments

This work was done in the Business Finland funded project Virtual Sea Trial.

Contributing

Open changes as short-lived branches targeting main; see CONTRIBUTING.md. Run uv run --extra dev pytest and uv run --extra dev lint-imports before pushing backend changes. See EXTENSIONS.md for the proposed third-party extension contract and its current implementation status.

Project details

Release history

- 0.2.8 (this version): Jun 23, 2026
- 0.2.7: Jun 22, 2026
- 0.2.6: Jun 22, 2026
- 0.2.5: Jun 17, 2026
- 0.2.4: Jun 11, 2026
- 0.2.3: Jun 10, 2026
- 0.2.2: Jun 9, 2026
- 0.2.1: Jun 8, 2026
- 0.2.0: May 25, 2026

Download files

Source Distribution

anchor_kb-0.2.8.tar.gz (1.0 MB)

Uploaded Jun 23, 2026 Source

Built Distribution

anchor_kb-0.2.8-py3-none-any.whl (797.4 kB)

Uploaded Jun 23, 2026 Python 3

File details

Details for the file anchor_kb-0.2.8.tar.gz.

File metadata

Download URL: anchor_kb-0.2.8.tar.gz
Upload date: Jun 23, 2026
Size: 1.0 MB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for anchor_kb-0.2.8.tar.gz
Algorithm	Hash digest
SHA256	`67645b101ef5595544a77f29cba105b7fae3f2a3618456a677ca8765cd6914ac`
MD5	`664054743a7a4190db5ff6cfea6b4a89`
BLAKE2b-256	`41b35e47c1c9539fd29bbc9e0c84cfcb2c48f45e75628c35e8a3756af0ec49c9`
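A downloaded file can be checked against the published SHA256 digest with the standard library. The sketch below hashes a file in chunks and compares the result with the sdist digest from the table above; the helper names are hypothetical, and the demo hashes a throwaway file rather than the real archive, so the comparison is expected to fail there.

```python
import hashlib
import tempfile
from pathlib import Path

# SHA256 published above for anchor_kb-0.2.8.tar.gz.
SDIST_SHA256 = "67645b101ef5595544a77f29cba105b7fae3f2a3618456a677ca8765cd6914ac"


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large archives do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: Path, expected_hex: str) -> bool:
    """Compare case-insensitively, ignoring surrounding whitespace."""
    return sha256_of(path) == expected_hex.strip().lower()


with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.bin"
    sample.write_bytes(b"not the real sdist")
    sample_digest = sha256_of(sample)
    sample_ok = matches_published_hash(sample, SDIST_SHA256)

print(sample_digest)
print(sample_ok)
```

To check a real download, pass the path of the downloaded archive instead of the sample file.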

Provenance

The following attestation bundles were made for anchor_kb-0.2.8.tar.gz:

Publisher: release.yml on Novia-RDI-Seafaring/anchor

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: anchor_kb-0.2.8.tar.gz
- Subject digest: 67645b101ef5595544a77f29cba105b7fae3f2a3618456a677ca8765cd6914ac
- Sigstore transparency entry: 1925548121
- Sigstore integration time: Jun 23, 2026
Source repository:
- Permalink: Novia-RDI-Seafaring/anchor@b3bd7638f71e08e4e76eb0047fe059ac3e35c4ac
- Branch / Tag: refs/tags/v0.2.8
- Owner: https://github.com/Novia-RDI-Seafaring
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@b3bd7638f71e08e4e76eb0047fe059ac3e35c4ac
- Trigger Event: push

File details

Details for the file anchor_kb-0.2.8-py3-none-any.whl.

File metadata

Download URL: anchor_kb-0.2.8-py3-none-any.whl
Upload date: Jun 23, 2026
Size: 797.4 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for anchor_kb-0.2.8-py3-none-any.whl
Algorithm	Hash digest
SHA256	`69875648a4584c9a810ed89833b6190bc0aaf7c0040a66f69ea429c581d9f348`
MD5	`5f8b15f2f909bffa38c8f5ba29cc06d9`
BLAKE2b-256	`75093d4060f97247aceec2788a7487f1cd0863e20871e71d6fc4fde654c7e01d`

Provenance

The following attestation bundles were made for anchor_kb-0.2.8-py3-none-any.whl:

Publisher: release.yml on Novia-RDI-Seafaring/anchor

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: anchor_kb-0.2.8-py3-none-any.whl
- Subject digest: 69875648a4584c9a810ed89833b6190bc0aaf7c0040a66f69ea429c581d9f348
- Sigstore transparency entry: 1925548622
- Sigstore integration time: Jun 23, 2026
Source repository:
- Permalink: Novia-RDI-Seafaring/anchor@b3bd7638f71e08e4e76eb0047fe059ac3e35c4ac
- Branch / Tag: refs/tags/v0.2.8
- Owner: https://github.com/Novia-RDI-Seafaring
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@b3bd7638f71e08e4e76eb0047fe059ac3e35c4ac
- Trigger Event: push
