arcllm

Unified LLM abstraction layer for autonomous agents

These details have been verified by PyPI

Project links

GitHub Statistics

Maintainers

These details have not been verified by PyPI

Project description

╭──────────────────────────────────────────────────────╮
│                                                      │
│   ▄▀█ █▀█ █▀▀ █   █   █▀▄▀█                        │
│   █▀█ █▀▄ █▄▄ █▄▄ █▄▄ █ ▀ █                        │
│                                                      │
│   Unified LLM Abstraction Layer                      │
│   for Autonomous Agents at Scale                     │
│                                                      │
├──────────────────────────────────────────────────────┤
│  11 Providers · 8 Modules · 2 Dependencies · <1ms   │
╰──────────────────────────────────────────────────────╯

A minimal, security-first LLM abstraction layer built for autonomous agents at scale.

ArcLLM normalizes communication across LLM providers into a single, clean interface designed for agentic tool-calling loops. One function to load a model, one method to invoke it, normalized responses every time — regardless of provider.

from arcllm import load_model, Message

model = load_model("anthropic")

response = await model.invoke([
    Message(role="user", content="What is 2 + 2?")
])

print(response.content)       # "4"
print(response.usage)         # input_tokens=12 output_tokens=4 total_tokens=16
print(response.stop_reason)   # "end_turn"

Switch providers by changing one string. Your agent code stays the same.

Why ArcLLM

Built for federal and enterprise production environments where thousands of autonomous agents run concurrently and security, auditability, and control are non-negotiable.

Security first — API keys from environment variables or vault backends. PII redaction, HMAC request signing, and audit trails built in. No secrets in config files, ever.
Agent-native — Purpose-built for agentic tool-calling loops, not chat interfaces. Stateless model objects. Your agent manages its own conversation history.
Minimal core — Two runtime dependencies (pydantic, httpx). No provider SDKs. Direct HTTP to every provider. Import time under 100ms, abstraction overhead under 1ms per call.
Budget enforcement — Per-scope spend tracking with calendar period resets. Pre-flight cost estimation. Per-call, daily, and monthly limits with configurable enforcement (block, warn, log).
Classification-aware routing — Route LLM calls to specific providers based on data classification. CUI stays on cleared infrastructure. Unclassified goes to cost-optimized providers.
Opt-in complexity — Need just Anthropic with no extras? That's all that loads. Need retry, fallback, telemetry, audit, routing, and budget controls? Enable them with a flag. Nothing runs that you didn't ask for.
Config-driven — Model metadata, provider settings, and module toggles live in TOML files. Add a provider by dropping in one .toml file. Zero code changes.

Supported Providers

Provider	Type	Adapter
Anthropic	Cloud	`anthropic`
OpenAI	Cloud	`openai`
DeepSeek	Cloud	`deepseek`
Mistral	Cloud	`mistral`
Groq	Cloud	`groq`
Together AI	Cloud	`together`
Fireworks AI	Cloud	`fireworks`
Hugging Face Inference	Cloud	`huggingface`
Hugging Face TGI	Self-hosted	`huggingface_tgi`
Ollama	Local	`ollama`
vLLM	Self-hosted	`vllm`

Every adapter translates provider-specific quirks (role names, content formats, tool call schemas) so your agent code never has to.

Opt-In Modules

All disabled by default. Enable via config or at load time.

Module	What It Does
Retry	Exponential backoff on transient errors (429, 500, 503). Respects `Retry-After` headers.
Fallback	Provider chain — if Anthropic fails, try OpenAI. Configurable order.
Rate Limit	Token-bucket throttling per provider. Prevents quota exhaustion across concurrent agents.
Telemetry	Timing, token counts, cost-per-call, and budget enforcement with automatic pricing from model metadata.
Audit	Structured call logging with metadata for compliance trails. PII-safe by default.
Security	PII redaction on requests/responses, HMAC request signing, vault-based key resolution.
Routing	Classification-aware provider/model selection. Route CUI to cleared providers, unclassified to cost-optimized.
OpenTelemetry	Distributed tracing export via OTLP (gRPC or HTTP). GenAI semantic conventions.

Enable at load time:

model = load_model("anthropic", retry=True, telemetry=True, audit=True)

Or override with custom settings:

model = load_model("anthropic", retry={"max_retries": 5, "backoff_base_seconds": 2.0})

Or enable globally in config.toml:

[modules.retry]
enabled = true
max_retries = 3
backoff_base_seconds = 1.0

Budget Enforcement

Control LLM spend at every level — per-call, daily, and monthly — with configurable enforcement:

[modules.telemetry]
enabled = true
monthly_limit_usd = 500.00
daily_limit_usd = 50.00
per_call_max_usd = 5.00
alert_threshold_pct = 80
enforcement = "block"        # block | warn | log
budget_scope = "my-agent"

Budget scopes are validated with NFKC Unicode normalization. Costs are clamped to prevent negative injection. Calendar period resets are automatic (UTC monthly/daily). Thread-safe for concurrent agents.

Classification-Aware Routing

Route LLM calls based on data classification:

[modules.routing]
enabled = true
enforcement = "block"
default_classification = "unclassified"

[modules.routing.rules.cui]
provider = "anthropic"
model = "claude-sonnet-4-6"

[modules.routing.rules.unclassified]
provider = "openai"
model = "gpt-4o-mini"

CUI data stays on cleared providers. Unclassified data goes to cost-optimized providers. Enforcement modes: block (hard stop), warn (log + continue), log (silent).

Installation

pip install -e "."

With dev tools:

pip install -e ".[dev]"

With OpenTelemetry export:

pip install -e ".[otel]"

With ECDSA request signing:

pip install -e ".[signing]"

Requirements: Python 3.12+

Setup

1. Set your API key

ArcLLM reads API keys from environment variables by default. Never from config files.

export ANTHROPIC_API_KEY=your-key-here

See .env.example for all supported providers.

Vault integration (optional)

For enterprise environments, ArcLLM resolves API keys from vault backends with TTL caching and automatic env var fallback. Configure in config.toml:

[vault]
backend = "my_vault_module:HashicorpVaultBackend"
cache_ttl_seconds = 300

Then set vault paths per provider in their TOML files:

[provider]
api_key_env = "ANTHROPIC_API_KEY"
vault_path = "secret/data/llm/anthropic"

Resolution order: vault (cached) -> environment variable -> error. The vault backend is a pluggable protocol — implement get_secret(path) and is_available() for any secrets manager (HashiCorp Vault, AWS Secrets Manager, Azure Key Vault, etc.).

2. Load and invoke

from arcllm import load_model, Message

model = load_model("anthropic")

response = await model.invoke([
    Message(role="user", content="Summarize this document.")
])

print(response.content)

Use the async context manager to ensure clean connection shutdown:

async with load_model("anthropic") as model:
    response = await model.invoke(messages)

3. Switch providers

model = load_model("openai")          # OpenAI
model = load_model("groq")            # Groq
model = load_model("ollama")          # Local Ollama
model = load_model("together")        # Together AI

Same Message types, same invoke() call, same LLMResponse back.

Tool-Calling (Agentic Loop)

This is what ArcLLM was built for. Define tools, send them with your messages, and handle the loop:

from arcllm import load_model, Message, Tool, TextBlock, ToolUseBlock, ToolResultBlock

model = load_model("anthropic")

# Define a tool
search_tool = Tool(
    name="web_search",
    description="Search the web for current information.",
    parameters={
        "type": "object",
        "properties": {
            "query": {"type": "string", "description": "The search query"}
        },
        "required": ["query"],
    },
)

messages = [Message(role="user", content="Search for the latest Python release.")]

# Agentic loop
while True:
    response = await model.invoke(messages, tools=[search_tool])

    if response.stop_reason == "end_turn":
        print(response.content)
        break

    if response.stop_reason == "tool_use":
        # Pack the assistant's response back into messages
        assistant_content = []
        if response.content:
            assistant_content.append(TextBlock(text=response.content))
        for tc in response.tool_calls:
            assistant_content.append(
                ToolUseBlock(id=tc.id, name=tc.name, arguments=tc.arguments)
            )
        messages.append(Message(role="assistant", content=assistant_content))

        # Execute tools and send results back
        for tc in response.tool_calls:
            result = execute_tool(tc.name, tc.arguments)  # your implementation
            messages.append(Message(
                role="tool",
                content=[ToolResultBlock(tool_use_id=tc.id, content=result)],
            ))

Every provider returns the same LLMResponse with the same ToolCall objects and the same stop_reason values. Your agentic loop works across all 11 providers without modification.

Core Types

ArcLLM's type system is the contract between your agent and any LLM provider.

Type	Purpose
`Message`	Input message with `role` and `content`
`Tool`	Tool definition sent to the LLM
`LLMResponse`	Normalized response: `content`, `tool_calls`, `usage`, `stop_reason`
`ToolCall`	Parsed tool call: `id`, `name`, `arguments` (always a dict)
`Usage`	Token accounting: input, output, total, cache, reasoning
`ContentBlock`	Union of `TextBlock`, `ImageBlock`, `ToolUseBlock`, `ToolResultBlock`

All types are Pydantic v2 models with full validation and serialization.

Architecture

Agent Code
    |
load_model() ---- Public API
    |
Modules ---------- opt-in: retry, fallback, telemetry, audit, security, routing, otel
    |
Adapter ---------- provider-specific translation (one .py per provider)
    |
Types ------------ pydantic models (the universal contract)
    |
Config ----------- TOML files (global defaults + per-provider metadata)

Design principles:

Library, not a framework — import what you need, nothing more
No state in the LLM layer — model objects hold config, agents hold conversation
Provider quirks stay in adapters — your code sees clean, normalized types
Fail fast, fail loud — errors carry raw data, nothing is silently swallowed
Config-driven — add a provider by dropping in a TOML file, not writing code

Simplicity by the Numbers

ArcLLM is radically smaller than alternatives. This is a design choice, not a limitation.

Metric	ArcLLM	pi-ai	LiteLLM
Source LOC	~2,900	~22,600	~475,000
Source files	33	38	1,558
Runtime deps	3	13	12+
Providers	11	9	100+

LOC per provider: ArcLLM averages ~60 lines per provider adapter. Most are 11-line thin aliases over the OpenAI-compatible base. pi-ai averages ~630 lines per provider. LiteLLM averages ~1,300 lines per provider.

Why this matters:

Auditable — A security reviewer can read the entire LLM layer in an afternoon. Try that with 475K lines.
Debuggable — When something breaks, the call stack is shallow. No framework magic, no middleware chains you can't trace.
Maintainable — Fewer lines means fewer bugs. Every line in ArcLLM exists because it has to, not because a feature flag needed a feature flag.
Fast — 3 runtime dependencies means fast installs, small container images, and minimal attack surface. pi-ai requires @anthropic-ai/sdk, openai, @google/genai, @aws-sdk/client-bedrock-runtime, and 9 more packages. LiteLLM requires openai, tiktoken, tokenizers, aiohttp, jinja2, jsonschema, and more.

The tradeoff is provider count: LiteLLM supports 100+ providers because it wraps provider SDKs. pi-ai wraps 4 provider SDKs (Anthropic, OpenAI, Google, AWS) to cover 9 API backends. ArcLLM supports 11 providers via direct HTTP — no SDKs, no transitive dependency trees. Adding a new OpenAI-compatible provider is an 11-line file and a TOML config.

LOC measured with find src -name "*.py" | xargs wc -l (ArcLLM) and find src -name "*.ts" | xargs wc -l (pi-ai), excluding tests. Competitor numbers from public GitHub repos as of February 2026.

Configuration

Global defaults (src/arcllm/config.toml):

[defaults]
provider = "anthropic"
temperature = 0.7
max_tokens = 4096

[vault]
backend = ""
cache_ttl_seconds = 300

[modules.retry]
enabled = false
max_retries = 3
backoff_base_seconds = 1.0

[modules.fallback]
enabled = false
chain = ["anthropic", "openai"]

[modules.routing]
enabled = false
enforcement = "warn"
default_classification = "unclassified"

[modules.telemetry]
enabled = false
log_level = "INFO"
# monthly_limit_usd = 500.00
# daily_limit_usd = 50.00
# per_call_max_usd = 5.00
# alert_threshold_pct = 80
# enforcement = "block"

[modules.security]
enabled = false
pii_enabled = true
signing_enabled = true
signing_algorithm = "hmac-sha256"
signing_key_env = "ARCLLM_SIGNING_KEY"

Provider config (src/arcllm/providers/anthropic.toml):

[provider]
base_url = "https://api.anthropic.com"
api_key_env = "ANTHROPIC_API_KEY"
default_model = "claude-sonnet-4-20250514"
vault_path = ""

[models.claude-sonnet-4-20250514]
context_window = 200000
max_output_tokens = 8192
supports_tools = true
supports_vision = true
cost_input_per_1m = 3.00
cost_output_per_1m = 15.00

Model metadata (context windows, capabilities, pricing) lives in config, not code. Update a model's pricing or add a new model variant without touching a single line of Python.

Running Tests

pytest -v                       # Unit + adapter tests (mocked)
pytest --cov=arcllm             # With coverage
pytest tests/security/          # Security-specific tests
pytest tests/test_agentic_loop.py  # Live API test (requires ANTHROPIC_API_KEY)

Security

ArcLLM is built security-first for federal production environments. See docs/security.md for the full security reference including NIST 800-53 and OWASP mapping.

Key capabilities:

API key isolation — Keys from env vars or vault only. Never in config, logs, or responses.
PII redaction — Automatic detection and redaction of SSN, credit cards, emails, phone numbers, and IPs on both inbound and outbound messages.
Request signing — HMAC-SHA256 signatures on every request payload for tamper detection and non-repudiation.
Vault integration — Pluggable secrets backend with TTL caching. Supports any vault (HashiCorp, AWS SM, Azure KV).
Audit trails — Structured compliance logging. PII-safe metadata by default, raw content opt-in at DEBUG.
Budget enforcement — Per-scope spend limits with negative cost injection prevention, Unicode homoglyph attack resistance, and thread-safe accumulation.
Classification routing — Data classification-aware provider selection prevents CUI from reaching unauthorized providers.
TLS enforced — All provider communication over HTTPS via httpx defaults.
OpenTelemetry — Distributed tracing with mTLS support for secure telemetry export.

Project Status

ArcLLM is in active development. Core foundation, all provider adapters, the module system, budget enforcement, and classification routing are complete and tested.

Phase	Status
Core Foundation (types, config, adapters, registry)	Complete
Module System (retry, fallback, rate limit, telemetry, audit, security, otel)	Complete
Enterprise (vault integration, request signing, PII redaction)	Complete
Budget enforcement (per-scope, daily, monthly, per-call)	Complete
Classification-aware routing	Complete

License

This project is licensed under the Apache License, Version 2.0.

Acknowledgments

ArcLLM was inspired by pi-ai (from pi-mono by Mario Zechner) and LiteLLM. We studied their architectures and built something purpose-fit for Python, federal environments, and autonomous agent fleets.

Project details

These details have been verified by PyPI

Project links

GitHub Statistics

Maintainers

joshuamschultz

These details have not been verified by PyPI

Release history Release notifications | RSS feed

This version

0.2.0

Feb 25, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

arcllm-0.2.0.tar.gz (184.6 kB view details)

Uploaded Feb 25, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

arcllm-0.2.0-py3-none-any.whl (59.5 kB view details)

Uploaded Feb 25, 2026 Python 3

File details

Details for the file arcllm-0.2.0.tar.gz.

File metadata

Download URL: arcllm-0.2.0.tar.gz
Upload date: Feb 25, 2026
Size: 184.6 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for arcllm-0.2.0.tar.gz
Algorithm	Hash digest
SHA256	`c56b3e8a0435c5e6738f4259cd6b1581d1203847abf4722e0bfedec54d350c34`
MD5	`97c36f7429ca351b08e71155bb492f4f`
BLAKE2b-256	`d5fd650051f302026f0c36ad7832efae3814449d5f69b0e808769bd48f06c55a`

See more details on using hashes here.

Provenance

The following attestation bundles were made for arcllm-0.2.0.tar.gz:

Publisher: publish-arcllm.yml on joshuamschultz/Arc

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: arcllm-0.2.0.tar.gz
- Subject digest: c56b3e8a0435c5e6738f4259cd6b1581d1203847abf4722e0bfedec54d350c34
- Sigstore transparency entry: 991967566
- Sigstore integration time: Feb 25, 2026
Source repository:
- Permalink: joshuamschultz/Arc@1b000b26b8ca9dc429c121a10da36ed3f119d6c2
- Branch / Tag: refs/heads/main
- Owner: https://github.com/joshuamschultz
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish-arcllm.yml@1b000b26b8ca9dc429c121a10da36ed3f119d6c2
- Trigger Event: workflow_dispatch

File details

Details for the file arcllm-0.2.0-py3-none-any.whl.

File metadata

Download URL: arcllm-0.2.0-py3-none-any.whl
Upload date: Feb 25, 2026
Size: 59.5 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for arcllm-0.2.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`52c63c2a936f77ae61929da7727e994751363bd05b5b6013f906d161497bb5f7`
MD5	`c52a9cc51a742d73ba7f5e001d214379`
BLAKE2b-256	`0c450c56ffc57e4869e9f827870fa761b168fddc92012dbf23c00db5f795847d`

See more details on using hashes here.

Provenance

The following attestation bundles were made for arcllm-0.2.0-py3-none-any.whl:

Publisher: publish-arcllm.yml on joshuamschultz/Arc

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: arcllm-0.2.0-py3-none-any.whl
- Subject digest: 52c63c2a936f77ae61929da7727e994751363bd05b5b6013f906d161497bb5f7
- Sigstore transparency entry: 991967567
- Sigstore integration time: Feb 25, 2026
Source repository:
- Permalink: joshuamschultz/Arc@1b000b26b8ca9dc429c121a10da36ed3f119d6c2
- Branch / Tag: refs/heads/main
- Owner: https://github.com/joshuamschultz
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish-arcllm.yml@1b000b26b8ca9dc429c121a10da36ed3f119d6c2
- Trigger Event: workflow_dispatch

arcllm 0.2.0

Navigation

Verified details

Project links

GitHub Statistics

Maintainers

Unverified details

Meta

Classifiers

Project description

Why ArcLLM

Supported Providers

Opt-In Modules

Budget Enforcement

Classification-Aware Routing

Installation

Setup

1. Set your API key

Vault integration (optional)

2. Load and invoke

3. Switch providers

Tool-Calling (Agentic Loop)

Core Types

Architecture

Simplicity by the Numbers

Configuration

Running Tests

Security

Project Status

License

Acknowledgments

Project details

Verified details

Project links

GitHub Statistics

Maintainers

Unverified details

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

Provenance

File details

File metadata

File hashes

Provenance