Pythonic ArchiveBox API Wrapper and Fast MCP Server for Agentic AI use!
Archivebox Api
CLI or API | MCP | Agent
Version: 1.0.1
Documentation — Installation, deployment, usage across the API, CLI, MCP, and A2A agent interfaces, and guidance for provisioning the ArchiveBox platform are maintained in the official documentation.
Table of Contents
- Overview
- Key Features
- Concept Registry
- Environment Variables
- CLI or API Usage
- MCP Server Setup
- Agentic AI Graph Agent
- Security & Governance
- Installation
- Contribute
Overview
Archivebox Api is a Pythonic wrapper around the ArchiveBox API, bundled with a production-grade Model Context Protocol (MCP) server built on FastMCP and an integrated agent for agentic AI use.
Key Features
- Consolidated Action-Routed MCP Tools: Minimizes token overhead and eliminates tool bloat in LLM contexts by grouping methods into optimized, togglable tool modules.
- Enterprise-Grade Security: Comprehensive support for Eunomia policies, OIDC token delegation, and granular execution context tracking.
- Integrated Graph Agent: Built-in Pydantic AI agent supporting the Agent Control Protocol (ACP) and standard Web interfaces (AG-UI).
- Native Telemetry & Tracing: Out-of-the-box OpenTelemetry exports and native Langfuse tracing.
Concept Registry
This codebase is aligned with the 5 Core Pillars Architecture of the agent-utilities ecosystem:
| Concept ID | Pillar Name | Domain | Implementation Details in archivebox-api |
|---|---|---|---|
| ECO-4.0 | Ecosystem & Peripherals | Tool Interface & MCP Factory | Provides FastMCP server wrapper, action routing tools, and dynamic schema exposures. |
| ECO-4.1 | Ecosystem & Peripherals | A2A Network & Consensus | Manages agent peer discovery, routing tables, and consensus. |
| OS-5.1 | Agent OS Infrastructure | Security & Auth | Implements token-based OIDC access control, JWT filters, and Eunomia validation. |
| OS-5.4 | Agent OS Infrastructure | Telemetry & Observability | Delivers warning suppressions, JSON progress logging, and error tracing. |
Environment Variables
Package environment variables
| Variable | Example | Description |
|---|---|---|
| HOST | 0.0.0.0 | |
| PORT | 8000 | |
| TRANSPORT | stdio | options: stdio, streamable-http, sse |
| ENABLE_OTEL | True | |
| OTEL_EXPORTER_OTLP_ENDPOINT | http://localhost:8080/api/public/otel | |
| OTEL_EXPORTER_OTLP_PUBLIC_KEY | pk-... | |
| OTEL_EXPORTER_OTLP_SECRET_KEY | sk-... | |
| OTEL_EXPORTER_OTLP_PROTOCOL | http/protobuf | |
| EUNOMIA_TYPE | none | options: none, embedded, remote |
| EUNOMIA_POLICY_FILE | mcp_policies.json | |
| EUNOMIA_REMOTE_URL | http://eunomia-server:8000 | |
| ARCHIVEBOX_BASE_URL | http://localhost:8000 | |
| ARCHIVEBOX_URL | http://localhost:8000 | ARCHIVEBOX_URL is a fallback/alternative alias for ARCHIVEBOX_BASE_URL |
| ARCHIVEBOX_USERNAME | — | |
| ARCHIVEBOX_SSL_VERIFY | False | |
| DEBUG | False | |
| PYTHONUNBUFFERED | 1 | |
| ARCHIVEBOX_API_KEY | your_archivebox_api_key_here | |
| ARCHIVEBOX_TOKEN | your_archivebox_token_here | |
| ARCHIVEBOX_PASSWORD | your_archivebox_password_here | |
| AUTHENTICATIONTOOL | True | |
| CORETOOL | True | |
| CLITOOL | True | |
Inherited agent-utilities variables (apply to every connector)
| Variable | Example | Description |
|---|---|---|
| MCP_TOOL_MODE | condensed | Tool surface: condensed, verbose, or both |
| MCP_ENABLED_TOOLS | — | Comma-separated tool allow-list |
| MCP_DISABLED_TOOLS | — | Comma-separated tool deny-list |
| MCP_ENABLED_TAGS | — | Comma-separated tag allow-list |
| MCP_DISABLED_TAGS | — | Comma-separated tag deny-list |
| MCP_CLIENT_AUTH | — | Outbound MCP auth (oidc-client-credentials for fleet calls) |
| OIDC_CLIENT_ID | — | OIDC client id (service-account auth) |
| OIDC_CLIENT_SECRET | — | OIDC client secret (service-account auth) |
| MCP_URL | http://localhost:8000/mcp | URL of the MCP server the agent connects to |
| PROVIDER | openai | LLM provider for the agent |
| MODEL_ID | gpt-4o | Model id for the agent |
| ENABLE_WEB_UI | True | Serve the AG-UI web interface |
23 package + 12 inherited variable(s). Auto-generated from .env.example + the shared agent-utilities set — do not edit.
Configure the runtime environment by creating a .env file based on .env.example.
Every variable the server reads, grouped by concern.
Connection & Credentials
| Variable | Description | Default |
|---|---|---|
| ARCHIVEBOX_BASE_URL | Canonical endpoint URL for the backend ArchiveBox API | http://localhost:8000 |
| ARCHIVEBOX_URL | Fallback alias/alternative for ARCHIVEBOX_BASE_URL | http://localhost:8000 |
| ARCHIVEBOX_USERNAME | Username for authentication | — |
| ARCHIVEBOX_PASSWORD | Password for authentication | — |
| ARCHIVEBOX_API_KEY | API key for token-less header authentication | — |
| ARCHIVEBOX_TOKEN | Pre-configured authentication token | — |
| ARCHIVEBOX_SSL_VERIFY | Enable/disable SSL certificate validation | False |
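The alias and default behaviour in the table above can be illustrated with a small sketch. `resolve_connection` is a hypothetical helper written for this example, not part of the package, and the set of strings treated as "true" for ARCHIVEBOX_SSL_VERIFY is an assumption; the server's actual parsing may differ.

```python
import os
from typing import Mapping, Optional


def resolve_connection(env: Optional[Mapping[str, str]] = None) -> dict:
    """Hypothetical helper: read ArchiveBox connection settings from env."""
    env = os.environ if env is None else env
    # Prefer the canonical variable, then the alias, then the documented default.
    base_url = (
        env.get("ARCHIVEBOX_BASE_URL")
        or env.get("ARCHIVEBOX_URL")
        or "http://localhost:8000"
    )
    # Assumed parsing: these strings count as true, anything else as false.
    raw_verify = env.get("ARCHIVEBOX_SSL_VERIFY", "False")
    ssl_verify = raw_verify.strip().lower() in {"1", "true", "yes", "on"}
    return {
        "base_url": base_url.rstrip("/"),
        "username": env.get("ARCHIVEBOX_USERNAME"),
        "ssl_verify": ssl_verify,
    }


# Only the alias is set, so it is used in place of ARCHIVEBOX_BASE_URL.
settings = resolve_connection({"ARCHIVEBOX_URL": "http://archive.local:8000/"})
print(settings["base_url"])
```

Passing a plain dictionary instead of reading `os.environ` directly keeps the sketch easy to exercise without touching the real environment.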
MCP server / transport
| Variable | Description | Default |
|---|---|---|
| TRANSPORT | stdio, streamable-http, or sse | stdio |
| HOST | Bind host (HTTP transports) | 0.0.0.0 |
| PORT | Bind port (HTTP transports) | 8000 |
| MCP_TOOL_MODE | Tool surface: condensed, verbose, or both | condensed |
| MCP_ENABLED_TOOLS / MCP_DISABLED_TOOLS | Comma-separated tool allow/deny list | — |
| MCP_ENABLED_TAGS / MCP_DISABLED_TAGS | Comma-separated tag allow/deny list | — |
| DEBUG | Verbose logging | False |
| PYTHONUNBUFFERED | Unbuffered stdout (recommended in containers) | 1 |
Tool toggles
Each action-routed tool can be disabled individually via its toggle env var (set to false):
AUTHENTICATIONTOOL, CORETOOL, CLITOOL (see the Available MCP Tools table below).
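As a rough illustration of the toggle behaviour described above, the sketch below maps each toggle variable to its condensed tool name (both taken from this README) and filters out any tool whose toggle is set to false. `active_tools` is a hypothetical helper, and treating an unset toggle as enabled is an assumption based on the documented default of True.

```python
from typing import Mapping

# Toggle variable -> condensed tool name, as listed in this README.
TOGGLES = {
    "AUTHENTICATIONTOOL": "archivebox_authentication",
    "CORETOOL": "archivebox_core",
    "CLITOOL": "archivebox_cli",
}


def active_tools(env: Mapping[str, str]) -> list[str]:
    """Hypothetical helper: tools whose toggle is not set to false."""
    active = []
    for toggle, tool in TOGGLES.items():
        # Assumption: an unset toggle means enabled; only "false" disables.
        if env.get(toggle, "True").strip().lower() != "false":
            active.append(tool)
    return active


# Disabling the CLI tool leaves the other two exposed.
print(active_tools({"CLITOOL": "false"}))
```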
Telemetry & governance
| Variable | Description | Default |
|---|---|---|
| ENABLE_OTEL | Enable OpenTelemetry export | True |
| OTEL_EXPORTER_OTLP_ENDPOINT | OTLP collector endpoint | — |
| OTEL_EXPORTER_OTLP_PUBLIC_KEY / OTEL_EXPORTER_OTLP_SECRET_KEY | OTLP auth keys | — |
| OTEL_EXPORTER_OTLP_PROTOCOL | OTLP protocol (e.g. http/protobuf) | — |
| EUNOMIA_TYPE | Authorization mode: none, embedded, remote | none |
| EUNOMIA_POLICY_FILE | Embedded policy file | mcp_policies.json |
| EUNOMIA_REMOTE_URL | Remote Eunomia server URL | — |
Agent CLI (full [agent] runtime only)
| Variable | Description | Default |
|---|---|---|
| MCP_URL | URL of the MCP server the agent connects to | http://localhost:8000/mcp |
| PROVIDER | LLM provider (e.g. openai) | openai |
| MODEL_ID | Model id (e.g. gpt-4o) | gpt-4o |
| ENABLE_WEB_UI | Serve the AG-UI web interface | True |
CLI or API Usage
You can use the API client programmatically in Python to manage ArchiveBox snapshots:
```python
from archivebox_api import Api

# Initialize client
client = Api(
    url="http://localhost:8000",
    token="your-auth-token",
    verify=True,
)

# Fetch snapshots
snapshots = client.get_snapshots()
for snapshot in snapshots.get("results", []):
    print(f"[{snapshot['timestamp']}] {snapshot['url']}")
```
Refer to docs/index.md for full developer SDK and class references.
MCP Server Setup
Install the slim `[mcp]` extra. Install `archivebox-api[mcp]`, the MCP-server extra that pulls in only the FastMCP / FastAPI tooling (`agent-utilities[mcp]`). It deliberately excludes the heavy agent runtime (the epistemic-graph engine, `pydantic-ai`, `dspy`, `llama-index`, `tree-sitter`), so `uvx`/container installs are dramatically smaller and faster. Use the full `[agent]` extra only when you need the integrated Pydantic AI agent (see Installation).
This server utilizes dynamic Action-Routed tools to optimize token overhead and maximize IDE compatibility.
Tool Catalog
See the auto-generated Available MCP Tools table below for the full, live list of tools.
Dynamic Tool Selection & Visibility
This MCP server supports dynamic toolset selection and visibility filtering at runtime. This lets you restrict the set of exposed tools so that tool definitions do not overflow the LLM's context window.
You can configure tool filtering via multiple input channels:
- CLI Arguments: Pass `--tools` or `--toolsets` (or their disabled counterparts `--disabled-tools` and `--disabled-toolsets`) during startup.
- Environment Variables: Define standard environment variables:
  - `MCP_ENABLED_TOOLS` / `MCP_DISABLED_TOOLS`
  - `MCP_ENABLED_TAGS` / `MCP_DISABLED_TAGS`
- HTTP SSE Request Headers: Pass custom headers during transport initialization:
  - `x-mcp-enabled-tools` / `x-mcp-disabled-tools`
  - `x-mcp-enabled-tags` / `x-mcp-disabled-tags`
- HTTP SSE Request Query Parameters: Append query parameters directly to your transport connection URL:
  - `?tools=tool1,tool2`
  - `?tags=tag1`
When query strings or parameters are supplied, an LLM-free Knowledge Graph resolution layer (using DynamicToolOrchestrator) matches query intents against known tool tags, names, or descriptions, with safe fallback and automated 24-hour background cache refreshing.
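For the header and query-parameter channels listed above, a client might assemble its connection details roughly as follows. This is a sketch under assumptions: `filtered_mcp_url` and `filter_headers` are hypothetical helpers, the header values are assumed to be comma-separated like the environment variables, and combining `tools` and `tags` in a single query string is assumed to be accepted by the server.

```python
from urllib.parse import urlencode


def filtered_mcp_url(base_url: str, tools=(), tags=()) -> str:
    """Hypothetical helper: append tools/tags filters to an MCP URL."""
    params = {}
    if tools:
        params["tools"] = ",".join(tools)
    if tags:
        params["tags"] = ",".join(tags)
    if not params:
        return base_url
    # safe="," keeps the comma-separated list literal instead of %2C.
    return f"{base_url}?{urlencode(params, safe=',')}"


def filter_headers(enabled_tools=(), disabled_tools=()) -> dict:
    """Hypothetical helper: build the x-mcp-* tool filter headers."""
    headers = {}
    if enabled_tools:
        headers["x-mcp-enabled-tools"] = ",".join(enabled_tools)
    if disabled_tools:
        headers["x-mcp-disabled-tools"] = ",".join(disabled_tools)
    return headers


url = filtered_mcp_url(
    "http://localhost:8000/mcp",
    tools=["archivebox_core", "archivebox_cli"],
    tags=["tag1"],  # placeholder tag name from the list above
)
headers = filter_headers(disabled_tools=["archivebox_authentication"])
print(url)
print(headers)
```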
Local IDE Configuration (Cursor / Claude Desktop)
Add the following block to your mcp.json to configure stdio transport via uv:
```json
{
  "mcpServers": {
    "archivebox-api": {
      "command": "uv",
      "args": [
        "run",
        "--package",
        "archivebox-api",
        "archivebox-mcp"
      ],
      "env": {
        "ARCHIVEBOX_BASE_URL": "http://localhost:8000",
        "ARCHIVEBOX_USERNAME": "admin",
        "ARCHIVEBOX_PASSWORD": "your-password"
      }
    }
  }
}
```
Agentic AI Graph Agent
This repository features a fully integrated Pydantic AI Graph Agent. It communicates over the Agent Control Protocol (ACP) and interacts seamlessly with the Agent Web UI (AG-UI).
Running the Agent CLI
To start the interactive command-line agent:
```bash
# Export credentials
export ARCHIVEBOX_BASE_URL="http://localhost:8000"
export ARCHIVEBOX_USERNAME="admin"
export ARCHIVEBOX_PASSWORD="your-password"

# Run agent server
archivebox-agent --provider openai --model-id gpt-4o
```
Detailed graph node architecture explanations, custom skill configurations, and agentic trace guides are available in docs/index.md.
Security & Governance
This package is built on the agent-utilities core and supports its standard security parameters:
Access Control & Policy Enforcement
- Eunomia Policies: Fine-grained, policy-driven tool authorization. Supports `none`, local `embedded` (`mcp_policies.json`), or centralized `remote` modes.
- OIDC Token Delegation: Compliant with RFC 8693 token exchange, so the authenticated user's credentials flow from Web UI / ACP → Agent → MCP.
- Scoped Credentials: Each execution context is restricted to the identity of the specific caller.
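The three Eunomia modes map onto the EUNOMIA_* environment variables documented earlier. The sketch below shows one way a deployment script could sanity-check that configuration before starting the server. `eunomia_settings` is a hypothetical helper, and requiring EUNOMIA_REMOTE_URL in remote mode is an assumption drawn from the variable having no documented default.

```python
from typing import Mapping

VALID_MODES = {"none", "embedded", "remote"}


def eunomia_settings(env: Mapping[str, str]) -> dict:
    """Hypothetical helper: validate the Eunomia-related variables."""
    mode = env.get("EUNOMIA_TYPE", "none").strip().lower()
    if mode not in VALID_MODES:
        raise ValueError(
            f"EUNOMIA_TYPE must be one of {sorted(VALID_MODES)}, got {mode!r}"
        )
    settings = {"mode": mode}
    if mode == "embedded":
        # Documented default policy file for the embedded mode.
        settings["policy_file"] = env.get("EUNOMIA_POLICY_FILE", "mcp_policies.json")
    elif mode == "remote":
        # Assumption: a remote URL must be supplied explicitly.
        url = env.get("EUNOMIA_REMOTE_URL")
        if not url:
            raise ValueError("EUNOMIA_REMOTE_URL is required when EUNOMIA_TYPE=remote")
        settings["remote_url"] = url
    return settings


print(eunomia_settings({"EUNOMIA_TYPE": "embedded"}))
```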
Runtime Security Grid
| Feature | Functionality | Enablement |
|---|---|---|
| Tool Guard | Sensitivity inspection with human-in-the-loop validation | Enabled by default |
| Prompt Injection Defense | Input scanning, repetition monitoring, and recursive loop blocks | Enabled by default |
| Context Safety Guard | Stuck-loop detectors and contextual overflow preemptive alerts | Enabled by default |
Installation
Pick the extra that matches what you want to run:
| Extra | Installs | Use when |
|---|---|---|
| archivebox-api[mcp] | Slim MCP server only (agent-utilities[mcp]: FastMCP/FastAPI) | You only run the MCP server (smallest install / image) |
| archivebox-api[agent] | Full agent runtime (agent-utilities[agent,logfire]: Pydantic AI + the epistemic-graph engine) | You run the integrated agent |
| archivebox-api[all] | Everything (mcp + agent + logfire) | Development / both surfaces |
```bash
# MCP server only (recommended for tool hosting, slim deps)
uv pip install "archivebox-api[mcp]"

# Full agent runtime (Pydantic AI + epistemic-graph engine)
uv pip install "archivebox-api[agent]"

# Everything (development)
uv pip install "archivebox-api[all]"  # or: python -m pip install "archivebox-api[all]"
```
Container images (:mcp vs :agent)
One multi-stage docker/Dockerfile builds two right-sized images, selected by --target:
| Image tag | Build target | Contents | Entrypoint |
|---|---|---|---|
| knucklessg1/archivebox-api:mcp | --target mcp | archivebox-api[mcp]: slim, no engine/pydantic-ai/dspy/llama-index/tree-sitter | archivebox-mcp |
| knucklessg1/archivebox-api:latest | --target agent (default) | archivebox-api[agent]: full agent runtime + epistemic-graph engine | archivebox-agent |
```bash
docker build --target mcp -t knucklessg1/archivebox-api:mcp docker/  # slim MCP server
docker build --target agent -t knucklessg1/archivebox-api:latest docker/  # full agent
```
docker/mcp.compose.yml runs the slim :mcp server; docker/agent.compose.yml runs the
agent (:latest) with a co-located :mcp sidecar.
Knowledge-graph database (epistemic-graph)
The full agent ([agent] / :latest) embeds the epistemic-graph engine (pulled in
transitively via agent-utilities[agent]). For production — or to share one knowledge graph
across multiple agents — run epistemic-graph as its own database container and point the
agent at it instead of embedding it. Deployment recipes (single-node + Raft HA), connection
config, and the full database architecture (with diagrams) are documented in the
epistemic-graph deployment guide.
The slim [mcp] server does not require the database.
Documentation
The complete documentation is published as the official documentation site and is the recommended reference for installation, deployment, and day-to-day operation.
| Page | Contents |
|---|---|
| Installation | pip, source, extras, prebuilt Docker image |
| Deployment | run the MCP and agent servers, Compose, Caddy + Technitium, env config |
| Usage | the MCP tools, the Api client, the CLI |
| Backing Platform | deploy ArchiveBox with Docker |
| Overview | ecosystem role, configuration, architecture |
| Concepts | concept registry (CONCEPT:ABOX-*) |
AGENTS.md is the canonical contributor/agent guidance.
Contribute
Contributions are welcome! Please ensure code quality by executing local checks before submitting pull requests:
- Format code using `ruff format .`
- Lint code using `ruff check .`
- Validate type-safety with `mypy .`
- Execute test suites using `pytest`
Available MCP Tools
The table below is auto-generated from the live server — do not edit by hand.
Condensed action-routed tools (default — MCP_TOOL_MODE=condensed)
| MCP Tool | Toggle Env Var | Description |
|---|---|---|
| archivebox_authentication | AUTHENTICATIONTOOL | Manage archivebox authentication operations. |
| archivebox_cli | CLITOOL | Manage archivebox cli operations. |
| archivebox_core | CORETOOL | Manage archivebox core operations. |
Verbose 1:1 API-mapped tools (MCP_TOOL_MODE=verbose or both)
14 per-operation tools, one per public API method:
| MCP Tool | Toggle Env Var | Description |
|---|---|---|
| archivebox_check_api_token | APITOOL | Validate an API token to make sure it's valid and non-expired |
| archivebox_cli_add | APITOOL | Execute archivebox add command |
| archivebox_cli_list | APITOOL | Execute archivebox list command |
| archivebox_cli_remove | APITOOL | Execute archivebox remove command |
| archivebox_cli_schedule | APITOOL | Execute archivebox schedule command |
| archivebox_cli_update | APITOOL | Execute archivebox update command |
| archivebox_get_any | APITOOL | Get a specific Snapshot, ArchiveResult, or Tag by abid |
| archivebox_get_api_token | APITOOL | Generate an API token for a given username & password |
| archivebox_get_archiveresult | APITOOL | Get a specific ArchiveResult by id or abid |
| archivebox_get_archiveresults | APITOOL | List all ArchiveResult entries matching these filters |
| archivebox_get_snapshot | APITOOL | Get a specific Snapshot by abid or id |
| archivebox_get_snapshots | APITOOL | Retrieve list of snapshots |
| archivebox_get_tag | APITOOL | Get a specific Tag by id or abid |
| archivebox_get_tags | APITOOL | Retrieve list of tags |
3 action-routed tool(s) (default) · 14 verbose 1:1 tool(s). Each is enabled unless its <DOMAIN>TOOL toggle is set false; MCP_TOOL_MODE selects the surface (condensed default · verbose 1:1 · both). Auto-generated — do not edit.
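How MCP_TOOL_MODE selects between the two surfaces can be pictured with the sketch below. The tool names are copied from the tables above; `tool_surface` is a hypothetical helper that only models the documented condensed / verbose / both choice and does not reflect how the server registers tools internally.

```python
# Tool names copied from the "Available MCP Tools" tables in this README.
CONDENSED = ["archivebox_authentication", "archivebox_cli", "archivebox_core"]
VERBOSE = [
    "archivebox_check_api_token", "archivebox_cli_add", "archivebox_cli_list",
    "archivebox_cli_remove", "archivebox_cli_schedule", "archivebox_cli_update",
    "archivebox_get_any", "archivebox_get_api_token",
    "archivebox_get_archiveresult", "archivebox_get_archiveresults",
    "archivebox_get_snapshot", "archivebox_get_snapshots",
    "archivebox_get_tag", "archivebox_get_tags",
]


def tool_surface(mode: str = "condensed") -> list[str]:
    """Hypothetical helper: tool names exposed for a given MCP_TOOL_MODE."""
    if mode == "condensed":
        return list(CONDENSED)
    if mode == "verbose":
        return list(VERBOSE)
    if mode == "both":
        return CONDENSED + VERBOSE
    raise ValueError(f"unknown MCP_TOOL_MODE: {mode!r}")


# Print the size of each surface.
for mode in ("condensed", "verbose", "both"):
    print(mode, len(tool_surface(mode)))
```

The condensed default exposes 3 tool definitions instead of 14, which is where the token savings claimed in Key Features come from.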
Additional Deployment Options
archivebox-api can also run as a local container (Docker / Podman / uv) or be
consumed from a remote deployment. The Deployment guide has complete, copy-paste
mcp_config.json examples for all four transports (stdio, streamable-http,
local container / uv, and remote URL):
- Local container / uv: launch the server from `mcp_config.json` via `uvx`, `docker run`, or `podman run`, or point at a local streamable-http container by `url`.
- Remote URL: connect to a server deployed behind Caddy at `http://archivebox-mcp.arpa/mcp` using the `"url"` key.
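A remote-URL entry could be generated with a few lines of Python, as sketched below. The `mcpServers` wrapper mirrors the stdio example earlier in this README and the `"url"` key comes from the bullet above; the exact schema your MCP client expects is an assumption here, so treat the Deployment guide as the reference. `remote_server_entry` is a hypothetical helper.

```python
import json


def remote_server_entry(name: str, url: str) -> dict:
    """Hypothetical helper: build an mcp_config.json entry for a remote server."""
    # Assumed layout: same "mcpServers" wrapper as the stdio example,
    # with "url" replacing the "command"/"args" pair.
    return {"mcpServers": {name: {"url": url}}}


config = remote_server_entry("archivebox-api", "http://archivebox-mcp.arpa/mcp")
print(json.dumps(config, indent=2))
```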
Deploy with agent-os-genesis
This package can be provisioned for you — skill-guided — by the agent-os-genesis
universal skill (its single-package deploy mode): it picks your install method, seeds
secrets to OpenBao/Vault (or .env), trusts your enterprise CA, registers the MCP
server, and verifies it — the same machinery that stands up the whole Agent OS, narrowed
to just this package. Ask your agent to "deploy archivebox-api with agent-os-genesis".
| Install mode | Command |
|---|---|
| Bare-metal, prod (PyPI) | uvx archivebox-mcp · or uv tool install archivebox-api |
| Bare-metal, dev (editable) | uv pip install -e ".[all]" · or pip install -e ".[all]" |
| Container, prod | deploy knucklessg1/archivebox-api:latest via docker-compose / swarm / podman / podman-compose / kubernetes |
| Container, dev (editable) | deploy docker/compose.dev.yml (source-mounted at /src; edits live on restart) |
Secrets are read-existing + seeded via vault_sync — you are only prompted for what's missing.