
Pythonic ArchiveBox API Wrapper and Fast MCP Server for Agentic AI use!

Archivebox Api

CLI or API | MCP | Agent

Version: 0.14.0


Overview

Archivebox Api is a Pythonic wrapper for the ArchiveBox API, packaged with an Agent and a Model Context Protocol (MCP) server for agentic AI use.


Key Features

  • Consolidated Action-Routed MCP Tools: Minimizes token overhead and eliminates tool bloat in LLM contexts by grouping methods into optimized, togglable tool modules.
  • Enterprise-Grade Security: Comprehensive support for Eunomia policies, OIDC token delegation, and granular execution context tracking.
  • Integrated Graph Agent: Built-in Pydantic AI agent supporting the Agent Control Protocol (ACP) and standard Web interfaces (AG-UI).
  • Native Telemetry & Tracing: Out-of-the-box OpenTelemetry exports and native Langfuse tracing.

CLI or API

This package wraps the ArchiveBox API. You can interact with it programmatically or via its integrated execution entrypoints.

Detailed instructions on how to use the underlying API wrappers, extended schema bindings, and developer SDK references are maintained in docs/index.md.


MCP

This server exposes Action-Routed tools: each tool module groups several methods behind a single tool, which reduces token overhead and keeps the tool list short for IDE clients.
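To illustrate the action-routing idea, here is a minimal sketch in which one tool entry point dispatches to a method chosen by an action name. The handler functions, the CORE_ACTIONS table, and the core_tool signature are hypothetical stand-ins for illustration; they are not the package's actual implementation, whose schemas live in docs/mcp.md.

```python
from typing import Any, Callable, Dict


# Hypothetical stand-ins for wrapper methods. The real methods would call
# the ArchiveBox API; these only echo their input so the sketch is runnable.
def get_snapshots(**params: Any) -> Dict[str, Any]:
    return {"method": "get_snapshots", "params": params}


def get_snapshot(**params: Any) -> Dict[str, Any]:
    return {"method": "get_snapshot", "params": params}


# One table per tool module: action name -> handler.
CORE_ACTIONS: Dict[str, Callable[..., Dict[str, Any]]] = {
    "get_snapshots": get_snapshots,
    "get_snapshot": get_snapshot,
}


def core_tool(action: str, **params: Any) -> Dict[str, Any]:
    """Single tool entry point that routes to a method by action name."""
    try:
        handler = CORE_ACTIONS[action]
    except KeyError:
        valid = ", ".join(sorted(CORE_ACTIONS))
        raise ValueError(
            f"Unknown action {action!r}; expected one of: {valid}"
        ) from None
    return handler(**params)
```

With this shape the client sees one tool per module instead of one tool per method, and an unknown action fails with a message that lists the valid choices.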

Available MCP Tools

Tool Module | Toggle Env Var | Enabled by Default | Description & Nested Methods
Authentication | AUTHENTICATIONTOOL | True | Manage archivebox authentication operations. Action-routed methods: get_api_token, check_api_token.
Core | CORETOOL | True | Manage archivebox core operations. Action-routed methods: get_snapshots, get_snapshot, get_archiveresults, get_tag, get_any.
Cli | CLITOOL | True | Manage archivebox cli operations. Action-routed methods: cli_add, cli_update, cli_schedule, cli_list, cli_remove.

Detailed tool schemas, parameter shapes, and validation constraints are preserved in docs/mcp.md.
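As a rough sketch of how the toggle variables could be interpreted, the snippet below reads the three environment variables from the table and treats each module as enabled unless its variable is set to a false-like value. The set of accepted false spellings and the enabled_modules helper are assumptions made for illustration; the server's actual parsing rules may differ.

```python
import os
from typing import List, Mapping

# Module name -> toggle environment variable, as listed in the table above.
TOOL_TOGGLES = {
    "Authentication": "AUTHENTICATIONTOOL",
    "Core": "CORETOOL",
    "Cli": "CLITOOL",
}

# Assumption: these spellings disable a module; anything else enables it.
_FALSE_VALUES = {"0", "false", "no", "off"}


def enabled_modules(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the tool modules that are enabled (all are on by default)."""
    enabled = []
    for module, var in TOOL_TOGGLES.items():
        raw = env.get(var)
        if raw is None or raw.strip().lower() not in _FALSE_VALUES:
            enabled.append(module)
    return enabled
```

For example, exporting CLITOOL=False before starting the server would, under these assumptions, leave only the Authentication and Core modules exposed.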

MCP Configuration Examples

stdio Transport (Recommended for local IDEs e.g., Cursor, Claude Desktop)

Configure your IDE's mcp.json to launch the MCP server via uvx:

{
  "mcpServers": {
    "archivebox-api": {
      "command": "uvx",
      "args": [
        "--from",
        "archivebox-api",
        "archivebox-mcp"
      ],
      "env": {
        "ARCHIVEBOX_BASE_URL": "your_archivebox_base_url_here",
        "ARCHIVEBOX_USERNAME": "your_archivebox_username_here",
        "ARCHIVEBOX_SSL_VERIFY": "your_archivebox_ssl_verify_here",
        "DEBUG": "your_debug_here",
        "PYTHONUNBUFFERED": "your_pythonunbuffered_here",
        "ARCHIVEBOX_API_KEY": "your_archivebox_api_key_here",
        "ARCHIVEBOX_TOKEN": "your_archivebox_token_here",
        "ARCHIVEBOX_PASSWORD": "your_archivebox_password_here"
      }
    }
  }
}
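If you prefer not to hand-edit credentials into mcp.json, the same entry can be generated from your shell environment. The sketch below is one way to do that; build_stdio_entry is a hypothetical helper, and leaving unset variables out of the "env" block assumes the server falls back to its own defaults for them.

```python
import json
import os
from typing import Any, Dict, Mapping

# Variable names taken from the configuration example above.
ENV_KEYS = (
    "ARCHIVEBOX_BASE_URL",
    "ARCHIVEBOX_USERNAME",
    "ARCHIVEBOX_SSL_VERIFY",
    "DEBUG",
    "PYTHONUNBUFFERED",
    "ARCHIVEBOX_API_KEY",
    "ARCHIVEBOX_TOKEN",
    "ARCHIVEBOX_PASSWORD",
)


def build_stdio_entry(env: Mapping[str, str]) -> Dict[str, Any]:
    """Build the mcpServers entry, copying only variables that are set."""
    passthrough = {key: env[key] for key in ENV_KEYS if env.get(key)}
    return {
        "mcpServers": {
            "archivebox-api": {
                "command": "uvx",
                "args": ["--from", "archivebox-api", "archivebox-mcp"],
                "env": passthrough,
            }
        }
    }


if __name__ == "__main__":
    # Print the JSON so it can be redirected into the IDE's mcp.json.
    print(json.dumps(build_stdio_entry(os.environ), indent=2))
```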

Streamable-HTTP Transport (Recommended for production deployments)

Configure your client's mcp.json to launch the Streamable-HTTP server via uvx with explicit host and port definition:

{
  "mcpServers": {
    "archivebox-api": {
      "command": "uvx",
      "args": [
        "--from",
        "archivebox-api",
        "archivebox-mcp"
      ],
      "env": {
        "TRANSPORT": "streamable-http",
        "HOST": "0.0.0.0",
        "PORT": "8000",
        "ARCHIVEBOX_BASE_URL": "your_archivebox_base_url_here",
        "ARCHIVEBOX_USERNAME": "your_archivebox_username_here",
        "ARCHIVEBOX_SSL_VERIFY": "your_archivebox_ssl_verify_here",
        "DEBUG": "your_debug_here",
        "PYTHONUNBUFFERED": "your_pythonunbuffered_here",
        "ARCHIVEBOX_API_KEY": "your_archivebox_api_key_here",
        "ARCHIVEBOX_TOKEN": "your_archivebox_token_here",
        "ARCHIVEBOX_PASSWORD": "your_archivebox_password_here"
      }
    }
  }
}

Alternatively, connect to a pre-deployed remote or local Streamable-HTTP instance:

{
  "mcpServers": {
    "archivebox-api": {
      "url": "http://localhost:8000/archivebox-api/mcp"
    }
  }
}

Deploying the Streamable-HTTP server via Docker:

docker run -d \
  --name archivebox-api-mcp \
  -p 8000:8000 \
  -e TRANSPORT=streamable-http \
  -e PORT=8000 \
  -e ARCHIVEBOX_BASE_URL="your_value" \
  -e ARCHIVEBOX_USERNAME="your_value" \
  -e ARCHIVEBOX_SSL_VERIFY="your_value" \
  -e DEBUG="your_value" \
  -e PYTHONUNBUFFERED="your_value" \
  -e ARCHIVEBOX_API_KEY="your_value" \
  -e ARCHIVEBOX_TOKEN="your_value" \
  -e ARCHIVEBOX_PASSWORD="your_value" \
  knucklessg1/archivebox-api:latest

Agent

This repository features a fully integrated Pydantic AI Graph Agent. It communicates over the Agent Control Protocol (ACP) and interacts seamlessly with the Agent Web UI (AG-UI) and Terminal interface.

Running the Agent CLI

To start the interactive command-line agent:

# Set credentials
export ARCHIVEBOX_BASE_URL="your_value"
export ARCHIVEBOX_USERNAME="your_value"
export ARCHIVEBOX_SSL_VERIFY="your_value"
export DEBUG="your_value"
export PYTHONUNBUFFERED="your_value"
export ARCHIVEBOX_API_KEY="your_value"
export ARCHIVEBOX_TOKEN="your_value"
export ARCHIVEBOX_PASSWORD="your_value"

# Run the agent server
archivebox-agent --provider openai --model-id gpt-4o

Docker Compose Orchestration

The following docker/agent.compose.yml configures the Agent, Web UI, and Terminal Interface together:

version: '3.8'

services:
  archivebox-api-mcp:
    image: knucklessg1/archivebox-api:latest
    container_name: archivebox-api-mcp
    hostname: archivebox-api-mcp
    restart: always
    env_file:
      - ../.env
    environment:
      - PYTHONUNBUFFERED=1
      - HOST=0.0.0.0
      - PORT=8000
      - TRANSPORT=streamable-http
    ports:
      - "8000:8000"
    healthcheck:
      test: ["CMD", "python3", "-c", "import urllib.request; urllib.request.urlopen('http://localhost:8000/health')"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 10s
    logging:
      driver: json-file
      options:
        max-size: "10m"
        max-file: "3"

  archivebox-api-agent:
    image: knucklessg1/archivebox-api:latest
    container_name: archivebox-api-agent
    hostname: archivebox-api-agent
    restart: always
    depends_on:
      - archivebox-api-mcp
    env_file:
      - ../.env
    command: [ "archivebox-agent" ]
    environment:
      - PYTHONUNBUFFERED=1
      - HOST=0.0.0.0
      - PORT=9013
      - MCP_URL=http://archivebox-api-mcp:8000/mcp
      - PROVIDER=${PROVIDER:-openai}
      - MODEL_ID=${MODEL_ID:-gpt-4o}
      - ENABLE_WEB_UI=True
      - ENABLE_OTEL=True
    ports:
      - "9013:9013"
    healthcheck:
      test: ["CMD", "python3", "-c", "import urllib.request; urllib.request.urlopen('http://localhost:9013/health')"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 10s
    logging:
      driver: json-file
      options:
        max-size: "10m"
        max-file: "3"

Detailed graph node architecture explanations, custom skill configurations, and agentic trace guides are available in docs/agent.md.
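The compose file's health checks probe /health on each service. When scripting against the stack, a similar probe can be used to wait until a service is up before sending requests. The following is a minimal sketch of that idea using only the standard library; wait_until_healthy and http_ok are hypothetical helper names, and the retry count and interval simply mirror the values in the compose file.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_ok(url: str, timeout: float = 10.0) -> bool:
    """Return True if a GET to url succeeds with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status.
        return False


def wait_until_healthy(
    url: str,
    retries: int = 3,
    interval: float = 30.0,
    probe: Callable[[str], bool] = http_ok,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Probe url up to `retries` times, sleeping `interval` seconds between tries."""
    for attempt in range(retries):
        if probe(url):
            return True
        if attempt < retries - 1:
            sleep(interval)
    return False
```

A call such as wait_until_healthy("http://localhost:8000/health") would block until the MCP service answers or the retries are exhausted. The probe and sleep parameters are injectable so the loop can be tested without a running container.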


Security & Governance

The package is built directly on the agent-utilities core and supports its standard security features:

Access Control & Policy Enforcement

  • Eunomia Policies: Fine-grained, policy-driven tool authorization. Supports none, local embedded (mcp_policies.json), or centralized remote modes.
  • OIDC Token Delegation: Uses RFC 8693 token exchange to pass the authenticated user's credentials from Web UI / ACP → Agent → MCP.
  • Scoped Credentials: Each execution context is restricted to the identity of the specific caller.
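For readers unfamiliar with RFC 8693, the sketch below builds the form-encoded body of a token exchange request. The grant type and token type URNs are defined by the RFC; the token_exchange_body helper, the choice of an access token as the subject token, and the audience value in the usage note are illustrative assumptions, not a description of how agent-utilities performs the exchange.

```python
from typing import Optional
from urllib.parse import urlencode

# URNs defined by RFC 8693.
TOKEN_EXCHANGE_GRANT = "urn:ietf:params:oauth:grant-type:token-exchange"
ACCESS_TOKEN_TYPE = "urn:ietf:params:oauth:token-type:access_token"


def token_exchange_body(subject_token: str, audience: Optional[str] = None) -> str:
    """Return the application/x-www-form-urlencoded body for a token exchange."""
    params = {
        "grant_type": TOKEN_EXCHANGE_GRANT,
        "subject_token": subject_token,
        "subject_token_type": ACCESS_TOKEN_TYPE,
    }
    if audience:
        # Optional: names the downstream service the new token is meant for.
        params["audience"] = audience
    return urlencode(params)
```

The resulting string would be POSTed to the identity provider's token endpoint, for example token_exchange_body(user_token, audience="archivebox-api-mcp") when the agent requests a token scoped to the MCP server on the user's behalf.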

Runtime Security Grid

Feature | Functionality | Enablement
Tool Guard | Sensitivity inspection with human-in-the-loop validation | Enabled by default
Prompt Injection Defense | Input scanning, repetition monitoring, and recursive loop blocks | Enabled by default
Context Safety Guard | Stuck-loop detectors and contextual overflow preemptive alerts | Enabled by default

Installation

Install the Python package locally:

# Using uv (highly recommended); the quotes stop shells such as zsh from expanding the brackets
uv pip install "archivebox-api[all]"

# Using standard pip
python -m pip install "archivebox-api[all]"


Contribute

Contributions are welcome! Please ensure code quality by executing local checks before submitting pull requests:

  • Format code using ruff format .
  • Lint code using ruff check .
  • Validate type-safety with mypy .
  • Execute test suites using pytest
