OpenTerms protocol integration for LangChain: permission-aware AI agents

langchain-openterms

Permission-aware AI agents for LangChain. Checks a domain's openterms.json before your agent acts, so it knows what it's allowed to do.

Install

pip install langchain-openterms

What it does

When your LangChain agent interacts with a website, this package first checks /.well-known/openterms.json on that domain. If the site denies the requested action (for example, scraping), the agent receives a clear denial message instead of running the tool and risking a block or legal exposure.

Three ways to use it

1. Wrap a tool (recommended)

OpenTermsGuard wraps any existing tool with a permission check. If the domain denies the action, the tool returns a denial message instead of executing.

from langchain_community.tools import DuckDuckGoSearchRun
from langchain_openterms import OpenTermsGuard

search = DuckDuckGoSearchRun()

# Wraps the search tool: checks "read_content" before each query
guarded_search = OpenTermsGuard(
    tool=search,
    action="read_content",
)

# Use guarded_search in your agent instead of search.
# If a domain denies read_content, the agent gets a message
# explaining why instead of raw results.
result = guarded_search.invoke("https://example.com/pricing")

For strict mode (block if no openterms.json exists):

guarded_search = OpenTermsGuard(
    tool=search,
    action="scrape_data",
    strict=True,  # Deny if openterms.json is absent
)
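
How the permission value and the strict flag combine can be written as a small decision function. This is a minimal sketch, not the package's implementation: the helper name `decide` is hypothetical, and it assumes the parsed permissions are a flat dict mapping action names to booleans, which the real openterms.json layout may not match.

```python
from typing import Optional


def decide(permissions: Optional[dict], action: str, strict: bool = False) -> bool:
    """Return True if the wrapped tool should run.

    Hypothetical helper for illustration only, not part of langchain-openterms.
    `permissions` is None when no openterms.json was found for the domain.
    """
    if permissions is None:
        # No openterms.json: only strict mode turns this into a denial.
        return not strict
    # Assumed shape: {"read_content": True, "scrape_data": False}.
    # An action the file does not mention is unspecified, so the tool runs.
    return permissions.get(action, True) is not False


print(decide(None, "scrape_data"))
print(decide(None, "scrape_data", strict=True))
print(decide({"scrape_data": False}, "scrape_data"))
```

In this reading, strict only changes the outcome when the file is absent; an explicit denial blocks the tool in both modes.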

2. Give the agent a checker tool

OpenTermsChecker is a standalone tool the agent can call to check permissions before deciding what to do.

from langchain_openterms import OpenTermsChecker
from langchain.agents import AgentExecutor, create_openai_functions_agent

checker = OpenTermsChecker()

# Add checker to your agent's tool list
# your_other_tools is a placeholder for the list of tools you already use
tools = [checker, *your_other_tools]

# The agent can now call:
#   openterms_check("github.com scrape_data")
# and get back a JSON result telling it whether scraping is allowed.
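
The checker takes a single string such as "github.com scrape_data". A rough sketch of how that kind of input could be split into a domain and an action follows. The function name `parse_check_input` and the fallback to "read_content" when no action is given are assumptions made for illustration, not documented behaviour of OpenTermsChecker.

```python
def parse_check_input(text: str, default_action: str = "read_content") -> tuple[str, str]:
    """Split '<domain> [action]' into (domain, action).

    Hypothetical helper; the default action is an assumption.
    """
    parts = text.split()
    if not parts:
        raise ValueError("expected '<domain> [action]'")
    domain = parts[0].lower()  # domain names are case-insensitive
    action = parts[1] if len(parts) > 1 else default_action
    return domain, action


print(parse_check_input("github.com scrape_data"))
```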

3. Passive logging with a callback

OpenTermsCallbackHandler observes tool invocations and logs permission checks without blocking anything. Useful for auditing.

from langchain_openterms import OpenTermsCallbackHandler

handler = OpenTermsCallbackHandler(
    default_action="read_content",
    on_check=lambda r: print(f"{r['domain']}: {r['allowed']}"),
)

# agent is your existing agent or AgentExecutor
result = agent.invoke(
    {"input": "Research pricing pages"},
    config={"callbacks": [handler]},
)

# After execution, inspect all checks:
for check in handler.checks:
    print(check["domain"], check["allowed"], check.get("receipt"))
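
For auditing, the recorded checks can be reduced to a per-domain summary. The sketch below only relies on the "domain" and "allowed" keys used in the loop above; the helper name `denied_by_domain` and the sample data are made up for illustration.

```python
from collections import defaultdict


def denied_by_domain(checks):
    """Count denied checks per domain.

    `checks` is a list of dicts with at least "domain" and "allowed" keys,
    mirroring the entries read from handler.checks.
    """
    denied = defaultdict(int)
    for check in checks:
        if not check["allowed"]:
            denied[check["domain"]] += 1
    return dict(denied)


# Made-up sample data standing in for handler.checks
sample = [
    {"domain": "example.com", "allowed": False},
    {"domain": "example.com", "allowed": False},
    {"domain": "example.org", "allowed": True},
]
print(denied_by_domain(sample))
```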

With an existing agent

from langchain_openai import ChatOpenAI
from langchain.agents import AgentExecutor, create_openai_functions_agent
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_community.tools import DuckDuckGoSearchRun
from langchain_openterms import OpenTermsGuard, OpenTermsChecker

llm = ChatOpenAI(model="gpt-4o")

# Wrap your web tools
search = OpenTermsGuard(tool=DuckDuckGoSearchRun(), action="read_content")
checker = OpenTermsChecker()

prompt = ChatPromptTemplate.from_messages([
    ("system", (
        "You are a research assistant. Before interacting with any website, "
        "use the openterms_check tool to verify what you're allowed to do. "
        "Respect all permission denials."
    )),
    ("human", "{input}"),
    MessagesPlaceholder(variable_name="agent_scratchpad"),
])

agent = create_openai_functions_agent(llm, [search, checker], prompt)
executor = AgentExecutor(agent=agent, tools=[search, checker])

result = executor.invoke({"input": "Find pricing info for Stripe's API"})

How it works

  1. The agent invokes a tool with a URL or domain reference.
  2. The integration extracts the domain from the input.
  3. It fetches https://{domain}/.well-known/openterms.json (cached for 1 hour by default).
  4. It checks the requested permission (e.g., read_content, scrape_data).
  5. If denied, it returns a message explaining the denial.
  6. If allowed or unspecified, the tool executes normally. In strict mode, a missing openterms.json is also treated as a denial.
  7. An ORS receipt is generated locally (no server call) for audit logging.
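
Steps 2 and 3 can be sketched with the standard library alone. This is an illustration under assumptions, not the package's code: `extract_domain` and `openterms_url` are hypothetical names, and the rule that free-text input (anything containing whitespace) yields no domain is a simplification.

```python
from urllib.parse import urlparse


def extract_domain(tool_input: str):
    """Return the host from a URL or bare domain, or None if there is none."""
    candidate = tool_input.strip()
    if not candidate or " " in candidate:
        return None  # free text such as a search query: nothing to check
    if "://" not in candidate:
        candidate = "https://" + candidate  # allow bare domains like "example.com"
    host = urlparse(candidate).hostname
    return host if host and "." in host else None


def openterms_url(domain: str) -> str:
    """Build the well-known location checked in step 3."""
    return f"https://{domain}/.well-known/openterms.json"


domain = extract_domain("https://example.com/pricing")
print(domain, openterms_url(domain))
```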

ORS Receipts

Every permission check can generate a receipt: a lightweight record of what was checked, the result, and a hash of the openterms.json content at the time. These are local objects your application can log however you choose.

from langchain_openterms import OpenTermsClient

client = OpenTermsClient()
result = client.check("example.com", "scrape_data")
receipt = client.receipt("example.com", "scrape_data", result)
# receipt = {
#     "domain": "example.com",
#     "action": "scrape_data",
#     "allowed": False,
#     "checked_at": "2026-04-11T...",
#     "openterms_hash": "a1b2c3..."
# }
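
To make the receipt fields concrete, here is one way such a record could be assembled locally. It is a sketch under assumptions: the function `build_receipt` is hypothetical, and hashing a canonical JSON serialization with SHA-256 is a guess; the package may hash the raw response bytes or use a different algorithm.

```python
import hashlib
import json
from datetime import datetime, timezone


def build_receipt(domain, action, allowed, openterms_doc):
    """Assemble a receipt-like dict. Hypothetical, not the package's method."""
    # Assumption: hash a canonical serialization so key order does not matter.
    canonical = json.dumps(openterms_doc, sort_keys=True, separators=(",", ":"))
    return {
        "domain": domain,
        "action": action,
        "allowed": allowed,
        "checked_at": datetime.now(timezone.utc).isoformat(),
        "openterms_hash": hashlib.sha256(canonical.encode("utf-8")).hexdigest(),
    }


# The document content here is made up for the example.
receipt = build_receipt("example.com", "scrape_data", False, {"scrape_data": False})
print(receipt["domain"], receipt["allowed"], len(receipt["openterms_hash"]))
```

Storing the hash alongside the result lets a later audit confirm which version of the site's terms the decision was based on.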

Configuration

from langchain_openterms import OpenTermsChecker, OpenTermsClient, OpenTermsGuard

client = OpenTermsClient(
    cache_ttl=1800,  # Cache openterms.json for 30 minutes (default: 3600)
    timeout=10,      # HTTP timeout in seconds (default: 5)
)

# Pass to any integration component
guard = OpenTermsGuard(tool=my_tool, action="read_content", client=client)
checker = OpenTermsChecker(client=client)
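
The effect of cache_ttl can be pictured with a small time-based cache. This is a generic sketch of the technique, not the client's internals: the class `TTLCache`, its `get_or_fetch` method, and the injectable clock are all hypothetical.

```python
import time


class TTLCache:
    """Keep fetched values for ttl_seconds, then fetch again on next access."""

    def __init__(self, ttl_seconds=3600, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so expiry can be tested without sleeping
        self._entries = {}  # key -> (stored_at, value)

    def get_or_fetch(self, key, fetch):
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self.ttl:
            return entry[1]  # still fresh: no network call
        value = fetch(key)
        self._entries[key] = (now, value)
        return value


cache = TTLCache(ttl_seconds=1800)
# The lambda stands in for the HTTP fetch of openterms.json.
doc = cache.get_or_fetch("example.com", lambda d: {"domain": d})
print(doc)
```

A shorter TTL picks up changes to a site's openterms.json sooner at the cost of more requests.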

