Python SDK for the Polyvia document intelligence API and MCP server
Polyvia Python SDK
Official Python SDK for the Polyvia AI platform.
from polyvia import Polyvia
client = Polyvia(api_key="poly_<key>")
# Ingest → wait → query
result = client.ingest.file("report.pdf", name="Q4 Report")
client.ingest.wait(result.task_id)
print(client.query("What are the key findings?").answer)
Table of Contents
- Installation
- Authentication
- REST API
- MCP Server
- Agent Tools (programmatic)
- Async Client
- Error Handling
- Development
Installation
pip install polyvia
LangChain agent support:
pip install "polyvia[langchain]"
Requires Python 3.9+.
Authentication
Generate an API key at app.polyvia.ai → Settings → API.
All keys start with poly_.
# Pass explicitly
client = Polyvia(api_key="poly_<key>")
# Or set the environment variable and omit the argument
# export POLYVIA_API_KEY=poly_<key>
client = Polyvia()
Workspace scoping. Each key is permanently bound to the workspace (personal or one organization) you were in when you minted it. The key sees only that workspace's documents, groups, and chats — switching the active workspace in the UI later doesn't change a key's scope. Mint separate keys for each workspace you need to script against.
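Since each key is bound to one workspace, a script that touches several workspaces needs a way to keep the keys apart. The sketch below stores one key per environment variable and checks the documented `poly_` prefix before returning it. The `POLYVIA_API_KEY_<WORKSPACE>` naming scheme and the `key_for_workspace` helper are hypothetical; only `POLYVIA_API_KEY` and the prefix come from the SDK.

```python
import os


def key_for_workspace(workspace, env=None):
    """Return the API key stored for a workspace label.

    Looks up POLYVIA_API_KEY_<WORKSPACE> (a hypothetical naming scheme, not
    something the SDK reads itself) and checks the documented "poly_" prefix.
    """
    env = os.environ if env is None else env
    var = "POLYVIA_API_KEY_" + workspace.upper().replace("-", "_")
    key = env.get(var)
    if not key:
        raise KeyError(f"{var} is not set")
    if not key.startswith("poly_"):
        raise ValueError(f"{var} does not look like a Polyvia key")
    return key


# Usage (assumes both variables were exported beforehand):
# personal = Polyvia(api_key=key_for_workspace("personal"))
# acme = Polyvia(api_key=key_for_workspace("acme-corp"))
```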
REST API
Ingest
# Single file — pass the group **name** (created if it doesn't exist yet).
# Returns immediately with a task_id to poll.
result = client.ingest.file("report.pdf", name="Q4 Report", group="Finance")
# IngestResult(document_id='<id>', task_id='<id>', status='pending')
# Multiple files in one request (the group is resolved once for the batch)
batch = client.ingest.batch(
    ["q3.pdf", "q4.pdf"],
    names=["Q3 Report", "Q4 Report"],
    group="Finance",
)
# Check status
status = client.ingest.status(result.task_id)
# IngestionStatus(status='parsing', ...)
# Block until done — raises IngestionError on failure, IngestionTimeout on timeout
done = client.ingest.wait(result.task_id, poll_interval=5, timeout=300)
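To make the `poll_interval` and `timeout` semantics concrete, here is a minimal sketch of a polling loop of this kind. It is not the SDK's implementation: `wait_for`, `TaskFailed`, and `TaskTimedOut` are stand-in names, and treating `'completed'` and `'failed'` as the terminal task statuses is an assumption based on the status values that appear elsewhere in this README.

```python
import time


class TaskFailed(Exception):
    """Stand-in for the SDK's IngestionError."""


class TaskTimedOut(Exception):
    """Stand-in for the SDK's IngestionTimeout."""


def wait_for(get_status, poll_interval=5.0, timeout=300.0,
             sleep=time.sleep, clock=time.monotonic):
    """Poll get_status() until it reports 'completed' or 'failed'.

    get_status is any zero-argument callable returning a status string.
    sleep and clock are injectable so the loop can be tested without waiting.
    """
    deadline = clock() + timeout
    while True:
        status = get_status()
        if status == "completed":
            return status
        if status == "failed":
            raise TaskFailed("task finished with status='failed'")
        if clock() >= deadline:
            raise TaskTimedOut(f"still {status!r} after {timeout}s")
        sleep(poll_interval)
```

With the real client, `get_status` could be something like `lambda: client.ingest.status(task_id).status`.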
Query
# All completed documents
answer = client.query("What risks are mentioned across all reports?")
# Single document (fastest)
answer = client.query("Summarise section 3.", document_id="doc_<id>")
# Scoped to a group — by name (the group must already exist)
answer = client.query("Key findings?", group="Finance")
# Scoped to multiple groups — by id
answer = client.query("Compare results.", group_ids=["g_<id>", "g_<id>"])
print(answer.answer)
Groups
Groups have a human name and an opaque backend id. You rarely need the
id — pass the name straight to ingest / query (above) and the SDK resolves
it. When you do want the group object, get_or_create is the easy way in.
# Idempotent: returns the existing "Finance" group, or creates it. Matched by
# exact name, so it never makes duplicates. Returns a Group.
group = client.groups.get_or_create("Finance")
group.id # the backend id, if you ever need it
group.name # "Finance"
# Look one up without creating it (returns None if there isn't one)
existing = client.groups.find("Finance")
# List
for g in client.groups.list():
    print(g.name, g.id, g.color)
# Delete all documents in a group, then the group itself
client.groups.delete(group.id, delete_documents=True)
# Or separately
client.groups.delete_documents(group.id) # wipe documents, keep group
client.groups.delete(group.id) # remove empty group
# create() always makes a NEW group, even if the name exists — prefer
# get_or_create() unless you specifically want a fresh one each time.
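The difference between create() and get_or_create() can be illustrated with an in-memory stand-in. This is a sketch of the semantics described above, not the SDK's code: `GroupStore` and its id format are hypothetical, and name matching is assumed to be a plain string comparison.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import List, Optional


@dataclass
class Group:
    id: str
    name: str


@dataclass
class GroupStore:
    """In-memory stand-in for the groups endpoint (hypothetical)."""
    groups: List[Group] = field(default_factory=list)
    _ids: count = field(default_factory=lambda: count(1))

    def create(self, name: str) -> Group:
        # Always appends, so repeated calls produce same-named duplicates.
        group = Group(id=f"g_{next(self._ids)}", name=name)
        self.groups.append(group)
        return group

    def find(self, name: str) -> Optional[Group]:
        # Exact match on the name; None when nothing matches.
        return next((g for g in self.groups if g.name == name), None)

    def get_or_create(self, name: str) -> Group:
        # Reuse the existing group if there is one, otherwise make it.
        return self.find(name) or self.create(name)


store = GroupStore()
first = store.get_or_create("Finance")
again = store.get_or_create("Finance")   # same object, no duplicate
fresh = store.create("Finance")          # a second group with the same name
```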
Documents
# List — filter by status and/or group
docs = client.documents.list(status="completed", group_id="g_<id>")
docs = client.documents.list(group_ids=["g_<id>", "g_<id>"])
# Get one
doc = client.documents.get("doc_<id>")
# Move to a different group / remove from group
client.documents.update("doc_<id>", group_id="g_other")
client.documents.update("doc_<id>", group_id=None)
# Delete
client.documents.delete("doc_<id>")
Usage & Rate Limits
usage = client.usage()
print(usage.usage.requests.period) # requests this calendar month
print(usage.usage.requests.total) # all-time
print(usage.usage.documents_stored) # live document count
limits = client.rate_limits()
print(limits.limits["requests_per_minute"])
print(limits.current["remaining_this_minute"])
print(limits.resets_at.month) # ISO timestamp of next monthly reset
MCP Server
Polyvia runs a hosted Model Context Protocol server at
https://app.polyvia.ai/mcp. Connect your AI client once and it can ingest, search,
and query documents without any manual tool-dispatch code.
Using Claude Code? Add the server with one command:
claude mcp add --transport http polyvia https://app.polyvia.ai/mcp \
--header "Authorization: Bearer poly_<key>"
client.mcp returns an MCPConfig object with a helper for every major client:
| Method | Use with |
|---|---|
| claude_code_command() | The claude mcp add … command line above |
| to_anthropic_mcp_server() | ant.beta.messages.create(mcp_servers=[...]) |
| to_openai_responses_tool() | oai.responses.create(tools=[...]) |
| to_openai_mcp_server() | OpenAI Agents SDK MCPServerStreamableHttp |
| to_claude_desktop_config() | ~/.claude/claude_desktop_config.json |
Anthropic beta MCP client
from anthropic import Anthropic
from polyvia import Polyvia
polyvia = Polyvia(api_key="poly_<key>")
ant = Anthropic()
response = ant.beta.messages.create(
    model="claude-opus-4-5",
    max_tokens=1000,
    messages=[{"role": "user", "content": "What are my Q4 findings?"}],
    mcp_servers=[polyvia.mcp.to_anthropic_mcp_server()],
    betas=["mcp-client-2025-04-04"],
)
print(response.content[0].text)
to_anthropic_mcp_server() produces:
{
    "type": "url",
    "url": "https://app.polyvia.ai/mcp",
    "name": "polyvia",  # customise with name="my-docs"
    "headers": {"Authorization": "Bearer poly_<key>"},
}
OpenAI Responses API
from openai import OpenAI
from polyvia import Polyvia
polyvia = Polyvia(api_key="poly_<key>")
oai = OpenAI()
response = oai.responses.create(
    model="gpt-4o",
    tools=[polyvia.mcp.to_openai_responses_tool()],
    input="What are my Q4 findings?",
)
print(response.output_text)
to_openai_responses_tool() produces:
{
    "type": "mcp",
    "server_label": "polyvia",  # customise with server_label="my-docs"
    "server_url": "https://app.polyvia.ai/mcp",
    "headers": {"Authorization": "Bearer poly_<key>"},
    "require_approval": "never",  # or "always" to confirm each call
}
OpenAI Agents SDK
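import asyncio
from agents import Agent, Runner
from agents.mcp import MCPServerStreamableHttp
from polyvia import Polyvia
polyvia = Polyvia(api_key="poly_<key>")
cfg = polyvia.mcp.to_openai_mcp_server()
async def main():
    # The server must be connected before the agent can list its tools;
    # "async with" connects on entry and cleans up on exit.
    async with MCPServerStreamableHttp(
        params={"url": cfg["url"], "headers": cfg["headers"]},
    ) as server:
        agent = Agent(name="Research", mcp_servers=[server])
        result = await Runner.run(agent, "What do my Q4 reports say about revenue?")
        print(result.final_output)
asyncio.run(main())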
Claude Desktop
# Print a snippet to copy-paste into ~/.claude/claude_desktop_config.json
client.mcp.print_claude_desktop_snippet()
Or wire it up programmatically:
import json, pathlib
cfg_path = pathlib.Path.home() / ".claude" / "claude_desktop_config.json"
config = json.loads(cfg_path.read_text()) if cfg_path.exists() else {}
config.setdefault("mcpServers", {})["polyvia"] = client.mcp.to_claude_desktop_config()
cfg_path.write_text(json.dumps(config, indent=2))
print("Restart Claude Desktop to activate.")
to_claude_desktop_config() produces:
{
    "type": "http",
    "url": "https://app.polyvia.ai/mcp",
    "headers": { "Authorization": "Bearer poly_<key>" }
}
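The three payloads shown in this section carry the same URL and bearer header and differ only in their key names. As a rough sketch of that relationship (not the SDK's source; every function name here is hypothetical, and the dictionaries simply mirror the shapes printed above):

```python
MCP_URL = "https://app.polyvia.ai/mcp"


def _auth_headers(api_key):
    # Same bearer header in all three payloads.
    return {"Authorization": f"Bearer {api_key}"}


def anthropic_mcp_server(api_key, name="polyvia"):
    """Shape shown for to_anthropic_mcp_server()."""
    return {"type": "url", "url": MCP_URL, "name": name,
            "headers": _auth_headers(api_key)}


def openai_responses_tool(api_key, server_label="polyvia",
                          require_approval="never"):
    """Shape shown for to_openai_responses_tool()."""
    if require_approval not in ("never", "always"):
        raise ValueError("require_approval must be 'never' or 'always'")
    return {"type": "mcp", "server_label": server_label,
            "server_url": MCP_URL, "headers": _auth_headers(api_key),
            "require_approval": require_approval}


def claude_desktop_config(api_key):
    """Shape shown for to_claude_desktop_config()."""
    return {"type": "http", "url": MCP_URL,
            "headers": _auth_headers(api_key)}
```

Building the payloads by hand like this is only useful when the SDK is not installed; otherwise the client.mcp helpers already do it.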
Agent Tools (programmatic)
If you'd rather manage the tool-dispatch loop yourself — or your framework
doesn't support remote MCP — use client.tools to get JSON-schema tool
definitions and an executor that calls the REST API directly.
All 10 Polyvia tools are included: ingest, status, list/get/update/delete documents, list/create/delete groups, and query.
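The "definitions plus executor" pattern can be sketched in a few lines. This is an illustration under stated assumptions, not the SDK's implementation: `build_tools`, the single `query` schema, and the stub handler are all hypothetical, and the tool layout follows the OpenAI function-tool format.

```python
from typing import Any, Callable, Dict, List, Tuple


def build_tools(
    handlers: Dict[str, Callable[..., Any]],
    schemas: Dict[str, dict],
) -> Tuple[List[dict], Callable[[str, dict], Any]]:
    """Return (tool definitions, executor) in the OpenAI function-tool layout."""
    tools = [
        {
            "type": "function",
            "function": {
                "name": name,
                "description": schema.get("description", ""),
                "parameters": schema["parameters"],
            },
        }
        for name, schema in schemas.items()
    ]

    def call(name: str, arguments: dict) -> Any:
        # Dispatch a model-issued tool call to the matching Python handler.
        if name not in handlers:
            raise KeyError(f"unknown tool: {name}")
        return handlers[name](**arguments)

    return tools, call


# Hypothetical one-tool registry; a stub stands in for the REST call.
schemas = {
    "query": {
        "description": "Ask a question over ingested documents.",
        "parameters": {
            "type": "object",
            "properties": {"question": {"type": "string"}},
            "required": ["question"],
        },
    }
}
handlers = {"query": lambda question: f"(stub answer to {question!r})"}

tools, call = build_tools(handlers, schemas)
print(call("query", {"question": "Key findings?"}))
```

The model only ever sees `tools`; your code keeps `call` and decides when to run it, which is what makes this approach usable in frameworks without remote MCP support.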
OpenAI ChatCompletion
import json
from openai import OpenAI
from polyvia import Polyvia
client = Polyvia(api_key="poly_<key>")
oai = OpenAI()
tools, call = client.tools.openai()
response = oai.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "What are my Q4 findings?"}],
    tools=tools,
)
for tc in response.choices[0].message.tool_calls or []:
    result = call(tc.function.name, json.loads(tc.function.arguments))
    print(result)
Anthropic Messages API
import anthropic
from polyvia import Polyvia
client = Polyvia(api_key="poly_<key>")
ant = anthropic.Anthropic()
tools, call = client.tools.anthropic()
response = ant.messages.create(
    model="claude-opus-4-5",
    max_tokens=2048,
    messages=[{"role": "user", "content": "Summarise my Finance documents."}],
    tools=tools,
)
for block in response.content:
    if block.type == "tool_use":
        result = call(block.name, block.input)
        print(result)
LangChain
Requires pip install "polyvia[langchain]".
from langchain_openai import ChatOpenAI
from langchain.agents import AgentExecutor, create_tool_calling_agent
from langchain_core.prompts import ChatPromptTemplate
from polyvia import Polyvia
client = Polyvia(api_key="poly_<key>")
tools = client.tools.langchain()
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant with access to a document workspace."),
    ("user", "{input}"),
    ("placeholder", "{agent_scratchpad}"),
])
agent = create_tool_calling_agent(ChatOpenAI(model="gpt-4o"), tools, prompt)
executor = AgentExecutor(agent=agent, tools=tools, verbose=True)
executor.invoke({"input": "What risks are mentioned in my reports?"})
Async Client
Every method on AsyncPolyvia is a coroutine — same API surface as the sync client.
import asyncio
from polyvia import AsyncPolyvia
async def main():
    async with AsyncPolyvia(api_key="poly_<key>") as client:
        result = await client.ingest.file("report.pdf")
        await client.ingest.wait(result.task_id)
        answer = await client.query("Key findings?")
        print(answer.answer)
asyncio.run(main())
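Because every call is a coroutine, several files can be ingested concurrently. The sketch below caps the number of in-flight uploads with a semaphore. `ingest_many`, `ingest_and_wait`, and `max_concurrent` are hypothetical names; the only SDK calls assumed are `client.ingest.file` and `client.ingest.wait`, as used in the example above.

```python
import asyncio


async def ingest_and_wait(client, path, limit):
    """Upload one file and wait for it, holding one semaphore slot."""
    async with limit:
        result = await client.ingest.file(path)
        return await client.ingest.wait(result.task_id)


async def ingest_many(client, paths, max_concurrent=3):
    """Ingest several files concurrently; results come back in input order."""
    limit = asyncio.Semaphore(max_concurrent)
    return await asyncio.gather(
        *(ingest_and_wait(client, p, limit) for p in paths)
    )


# Usage (inside "async with AsyncPolyvia(...) as client:"):
# done = await ingest_many(client, ["q3.pdf", "q4.pdf"])
```

When all files belong together, a single `ingest.batch` request may be simpler; the concurrent version is mainly useful when each file needs its own handling.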
Error Handling
from polyvia import (
    AuthenticationError,  # 401: bad or missing API key
    ForbiddenError,       # 403: document belongs to another user
    NotFoundError,        # 404: document, group, or task not found
    RateLimitError,       # 429: too many requests
    IngestionError,       # task finished with status='failed'
    IngestionTimeout,     # ingest.wait() exceeded its timeout
)
try:
    done = client.ingest.wait(task_id, timeout=60)
except IngestionError as e:
    print(f"Parsing failed: {e.error}")
except IngestionTimeout:
    print("Timed out; document may still be processing")
except RateLimitError:
    print("Rate limit hit; back off and retry")
except NotFoundError:
    print("Document or task not found")
except AuthenticationError:
    print("Invalid API key")
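"Back off and retry" on a 429 is commonly done with exponential backoff plus jitter. The following is a minimal sketch of that technique, not something the SDK provides: `with_backoff` and its parameters are hypothetical, and the local `RateLimitError` class is a stand-in so the block runs on its own (in real code, import the exception from polyvia instead).

```python
import random
import time


class RateLimitError(Exception):
    """Stand-in so the sketch runs alone; use polyvia.RateLimitError in practice."""


def with_backoff(fn, retries=5, base_delay=1.0, max_delay=30.0,
                 sleep=time.sleep):
    """Call fn(), retrying on RateLimitError with exponential backoff and jitter."""
    for attempt in range(retries + 1):
        try:
            return fn()
        except RateLimitError:
            if attempt == retries:
                raise  # out of retries: surface the error to the caller
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Add up to 50% random jitter so parallel clients do not retry in step.
            sleep(delay + random.uniform(0, delay / 2))


# Usage:
# answer = with_backoff(lambda: client.query("Key findings?"))
```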
Development
git clone https://github.com/polyvia-ai/polyvia-sdk-python
cd polyvia-sdk-python
pip install -e ".[dev]"
pytest
License
MIT