veryfi-python
veryfi is a Python SDK for communicating with the Veryfi OCR API.
Extract structured data from receipts, invoices, bank statements, checks, W-2s, W-8s, W-9s, business cards, and more with a single function call.
Full API reference: veryfi.github.io/veryfi-python
Veryfi API docs: docs.veryfi.com
Table of Contents
- Installation
- Getting Started
- Supported APIs
- Error Handling
- Command-line interface
- Contributing
- Need Help?
- Changelog
- License
Installation
Install from PyPI using pip:
pip install -U veryfi
Requires Python 3.9 or later.
Getting Started
Obtaining credentials
If you don't have a Veryfi account, register at app.veryfi.com/signup/api/.
Initialize the client
from veryfi import Client
client = Client(
    client_id="your_client_id",
    client_secret="your_client_secret",
    username="your_username",
    api_key="your_api_key",
)
Optional constructor parameters:
| Parameter | Default | Description |
|---|---|---|
| base_url | https://api.veryfi.com/api/ | Override the API base URL |
| api_version | v8 | API version string |
| timeout | 30 | Request timeout in seconds |
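To keep credentials out of source code, the constructor arguments can be read from the environment. The following is a minimal sketch, not part of the SDK: the helper name client_kwargs_from_env is hypothetical, and the variable names are the ones listed in the CLI authentication table of this README.

```python
import os
from typing import Mapping, Optional

# Client keyword arguments mapped to the environment variable names that the
# CLI section of this README documents.
ENV_NAMES = {
    "client_id": "VERYFI_CLIENT_ID",
    "client_secret": "VERYFI_CLIENT_SECRET",
    "username": "VERYFI_USERNAME",
    "api_key": "VERYFI_API_KEY",
}
# The CLI table marks the client secret as optional, so it is not required here.
REQUIRED = ("client_id", "username", "api_key")


def client_kwargs_from_env(env: Optional[Mapping[str, str]] = None) -> dict:
    """Hypothetical helper: build Client(...) keyword arguments from env vars."""
    env = os.environ if env is None else env
    # Keep only variables that are set and non-empty.
    kwargs = {name: env[var] for name, var in ENV_NAMES.items() if env.get(var)}
    missing = [ENV_NAMES[name] for name in REQUIRED if name not in kwargs]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    return kwargs
```

With the variables exported, Client(**client_kwargs_from_env()) would build the client. Whether the constructor itself accepts a missing client_secret is not confirmed here, so set VERYFI_CLIENT_SECRET if in doubt.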
Supported APIs
Documents (Receipts & Invoices)
Process a receipt or invoice from a local file:
response = client.process_document(
    file_path="/tmp/receipt.jpg",
    categories=["Meals & Entertainment", "Travel"],
)
Process from a URL:
response = client.process_document_url(
    file_url="https://cdn.example.com/invoice.pdf",
    categories=["Office Supplies"],
    boost_mode=True,
    external_id="my-ref-001",
    max_pages_to_process=5,
)
The response contains the extracted fields. A typical result looks like:
{
    "id": 933760836,
    "created_date": "2024-08-15 15:56:56",
    "date": "2022-05-24 13:10:00",
    "vendor": {"name": "Walgreens", "address": "191 E 3rd Ave, San Mateo, CA 94401, US"},
    "total": 29.53,
    "subtotal": 27.60,
    "tax": 1.93,
    "currency_code": "USD",
    "category": "Personal Care",
    "payment": {"type": "visa", "card_number": "1850", "display_name": "Visa ***1850"},
    "line_items": [
        {"description": "RED BULL ENRGY DRNK CNS 8.4OZ 6PK", "total": 8.79, "quantity": 1.0},
        {"description": "COCA COLA MINICAN 7.5Z 6PK", "total": 4.99, "quantity": 1.0},
        # ...
    ],
    "status": "processed",
}
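Because the result is a plain dictionary, simple sanity checks can run on it before the data goes anywhere else. The sketch below uses only field names from the sample above; the helper names are hypothetical, and it assumes the numeric fields are present and not null, which a real response may not guarantee.

```python
from decimal import Decimal


def totals_are_consistent(doc: dict) -> bool:
    """Return True when subtotal + tax equals total, compared as decimals."""
    # Go through str() so 27.6 becomes Decimal("27.6") rather than a binary float.
    subtotal = Decimal(str(doc["subtotal"]))
    tax = Decimal(str(doc["tax"]))
    total = Decimal(str(doc["total"]))
    return subtotal + tax == total


def line_item_sum(doc: dict) -> Decimal:
    """Sum the "total" field of every entry in "line_items"."""
    items = doc.get("line_items", [])
    return sum((Decimal(str(item["total"])) for item in items), Decimal("0"))


# A trimmed copy of the sample response shown above.
sample = {
    "total": 29.53,
    "subtotal": 27.60,
    "tax": 1.93,
    "line_items": [
        {"description": "RED BULL ENRGY DRNK CNS 8.4OZ 6PK", "total": 8.79, "quantity": 1.0},
        {"description": "COCA COLA MINICAN 7.5Z 6PK", "total": 4.99, "quantity": 1.0},
    ],
}
print(totals_are_consistent(sample))
print(line_item_sum(sample))
```

The sample lists only two of the receipt's line items, so their sum is not expected to match the subtotal here.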
Other document operations:
# List / search documents
documents = client.get_documents(q="Walgreens", created_date__gt="2024-01-01+00:00:00")
# Get a single document by ID
document = client.get_document(document_id=933760836)
# Update fields on a document
client.update_document(
    document_id=933760836,
    vendor={"name": "Starbucks", "address": "123 Easy St, San Francisco, CA 94158"},
    category="Meals & Entertainment",
    total=11.23,
)
# Delete a document
client.delete_document(document_id=933760836)
Line items
items = client.get_line_items(document_id=933760836)
client.add_line_item(document_id=933760836, payload={"description": "Extra item", "total": 5.00})
client.update_line_item(document_id=933760836, line_item_id=101, payload={"total": 6.00})
client.delete_line_item(document_id=933760836, line_item_id=101)
Tags
client.add_tag(document_id=933760836, tag_name="reimbursable")
client.add_tags(document_id=933760836, tags=["q1", "travel"])
client.get_tags(document_id=933760836)
client.delete_tags(document_id=933760836)
Split & process a multi-page PDF
response = client.split_and_process_pdf(file_path="/tmp/multi.pdf")
response = client.split_and_process_pdf_url(file_url="https://cdn.example.com/multi.pdf")
Bank Statements
Process a bank statement and extract transactions, balances, and account details:
# From a local file
response = client.process_bank_statement_document(
    file_path="/tmp/statement.pdf",
    categories=["Transfer", "Credit Card Payments", "Restaurants / Dining / Meals"],
)
# From a URL
response = client.process_bank_statement_document_url(
    file_url="https://cdn.example.com/statement.pdf",
    categories=["ATM Deposit", "Interest / Dividends", "Mortgage Payments"],
)
The categories parameter is an optional list of strings used to classify transactions. When provided, the API maps each transaction to the closest matching category.
# List statements
statements = client.get_bank_statements(
    created_date__gt="2024-01-01+00:00:00",
    created_date__lte="2024-12-31+23:59:59",
)
# Get a single statement
statement = client.get_bank_statement(document_id=4559568)
# Delete
client.delete_bank_statement(document_id=4559568)
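The date filters in these examples use the form YYYY-MM-DD+HH:MM:SS. A small formatter avoids typing such strings by hand. This is a sketch: the helper name is hypothetical, and it only reproduces the format seen in the examples, without making any claim about how the API treats time zones.

```python
from datetime import datetime


def veryfi_date_filter(moment: datetime) -> str:
    """Format a datetime like the filter strings above, e.g. 2024-01-01+00:00:00."""
    return moment.strftime("%Y-%m-%d+%H:%M:%S")


print(veryfi_date_filter(datetime(2024, 12, 31, 23, 59, 59)))
```

The result can be passed as a filter value, for example created_date__gt=veryfi_date_filter(datetime(2024, 1, 1)).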
Checks
# Process from file
response = client.process_check(file_path="/tmp/check.jpg")
# Process from URL
response = client.process_check_url(file_url="https://cdn.example.com/check.jpg")
# Check with remittance
response = client.process_check_with_remittance(file_path="/tmp/check_remittance.pdf")
response = client.process_check_with_remittance_url(file_url="https://cdn.example.com/check.pdf")
# List, get, update, delete
checks = client.get_checks(created_date__gt="2024-01-01+00:00:00")
check = client.get_check(document_id=12345)
client.update_check(document_id=12345, status="cleared")
client.delete_check(document_id=12345)
Business Cards
response = client.process_bussines_card_document(file_path="/tmp/card.jpg")
response = client.process_bussines_card_document_url(file_url="https://cdn.example.com/card.jpg")
cards = client.get_business_cards()
card = client.get_business_card(document_id=67890)
client.delete_business_card(document_id=67890)
W-2 Forms
response = client.process_w2_document(file_path="/tmp/w2.pdf")
response = client.process_w2_document_url(file_url="https://cdn.example.com/w2.pdf")
w2s = client.get_w2s(created_date_gt="2024-01-01+00:00:00")
w2 = client.get_w2(document_id=11111)
client.delete_w2(document_id=11111)
# Split & process a multi-W-2 PDF
response = client.split_and_process_w2(file_path="/tmp/multi_w2.pdf")
response = client.split_and_process_w2_url(file_url="https://cdn.example.com/multi_w2.pdf")
W-8 Forms
response = client.process_w8_document(file_path="/tmp/w8.pdf")
response = client.process_w8_document_url(file_url="https://cdn.example.com/w8.pdf")
w8s = client.get_w8s()
w8 = client.get_w8(document_id=22222)
client.delete_w8(document_id=22222)
W-9 Forms
response = client.process_w9_document(file_path="/tmp/w9.pdf")
response = client.process_w9_document_url(file_url="https://cdn.example.com/w9.pdf")
w9s = client.get_w9s()
w9 = client.get_w9(document_id=33333)
client.delete_w9(document_id=33333)
Any Document
Use a custom blueprint to extract fields from any document type:
response = client.process_any_document(
    blueprint_name="my_custom_blueprint",
    file_path="/tmp/custom_doc.pdf",
)
response = client.process_any_document_url(
    blueprint_name="my_custom_blueprint",
    file_url="https://cdn.example.com/custom_doc.pdf",
)
docs = client.get_any_documents(created_date__gt="2024-01-01+00:00:00")
doc = client.get_any_document(document_id=44444)
client.delete_any_document(document_id=44444)
Classify
Classify a document to determine its type before processing:
response = client.classify_document(
    file_path="/tmp/unknown.pdf",
    document_types=["receipt", "invoice", "bank_statement"],
)
response = client.classify_document_url(
    file_url="https://cdn.example.com/unknown.pdf",
    document_types=["w2", "w9"],
)
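Once a document's type is known, a lookup table can route the file to the matching processing method. The sketch below is illustrative only: process_by_type and PROCESSORS are hypothetical names, the type strings are the ones used in the classify examples above, and the method names are those shown elsewhere in this README. How the type is read out of the classify response is not covered here, so the caller passes it in as a string.

```python
# Document type strings mapped to Client method names from this README.
PROCESSORS = {
    "receipt": "process_document",
    "invoice": "process_document",
    "bank_statement": "process_bank_statement_document",
    "w2": "process_w2_document",
    "w9": "process_w9_document",
}


def process_by_type(client, document_type: str, file_path: str):
    """Hypothetical helper: call the processing method for a known type."""
    try:
        method_name = PROCESSORS[document_type]
    except KeyError:
        raise ValueError(f"no processor registered for {document_type!r}") from None
    # Look the method up on the client and call it with the local file path.
    return getattr(client, method_name)(file_path=file_path)
```

A call such as process_by_type(client, "w2", "/tmp/unknown.pdf") would then invoke client.process_w2_document(file_path="/tmp/unknown.pdf").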
Error Handling
All API errors raise a VeryfiClientError (or a more specific subclass). Import the exceptions you need:
from veryfi.errors import (
    VeryfiClientError,
    UnauthorizedAccessToken,
    BadRequest,
    ResourceNotFound,
    AccessLimitReached,
)
try:
    response = client.process_document(file_path="/tmp/receipt.jpg")
except UnauthorizedAccessToken:
    print("Check your client_id, username, and api_key.")
except ResourceNotFound:
    print("The requested document does not exist.")
except AccessLimitReached:
    print("API rate limit reached. Please wait before retrying.")
except BadRequest as e:
    print(f"Bad request: {e}")
except VeryfiClientError as e:
    print(f"Unexpected error (HTTP {e.status}): {e}")
| Exception | HTTP status | Cause |
|---|---|---|
| UnauthorizedAccessToken | 401 | Invalid or missing credentials |
| BadRequest | 400 | Malformed request or missing required fields |
| ResourceNotFound | 404 | Document ID does not exist |
| UnexpectedHTTPMethod | 405 | Wrong HTTP method used |
| AccessLimitReached | 409 | Rate limit exceeded |
| InternalError | 500 | Server-side error |
| ServiceUnavailable | 503 | Veryfi service is temporarily down |
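Rate-limit and availability errors are usually transient, so retrying with a growing delay is a common way to handle them. The helper below is a generic sketch and not part of the SDK: call_with_retry is a hypothetical name, and the exception classes to retry on are supplied by the caller.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retry(
    func: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...],
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func(), retrying with exponential backoff on the given exceptions."""
    for attempt in range(attempts):
        try:
            return func()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            # Wait base_delay, then 2x, 4x, ... before the next try.
            sleep(base_delay * 2 ** attempt)
    raise ValueError("attempts must be at least 1")
```

One possible use is call_with_retry(lambda: client.process_document(file_path="/tmp/receipt.jpg"), (AccessLimitReached, ServiceUnavailable)). This assumes ServiceUnavailable can be imported from veryfi.errors alongside the exceptions shown above.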
Command-line interface
Installing veryfi also installs a veryfi console script (and the equivalent python -m veryfi). The CLI is a thin wrapper around the Python Client and exposes every supported resource as a subcommand. It is designed for shell users and AI agents that drive the SDK from a terminal.
Verify the install:
veryfi --help
# or, equivalently:
python -m veryfi --help
Authentication
Credentials are read from environment variables (preferred for agents) or equivalent flags:
| Env var | Flag | Description |
|---|---|---|
| VERYFI_CLIENT_ID | --client-id | Required |
| VERYFI_CLIENT_SECRET | --client-secret | Optional; enables HMAC request signing |
| VERYFI_USERNAME | --username | Required |
| VERYFI_API_KEY | --api-key | Required |
| VERYFI_BASE_URL | --base-url | Optional, defaults to https://api.veryfi.com/api/ |
| VERYFI_API_VERSION | --api-version | Optional, defaults to v8 |
| VERYFI_TIMEOUT | --timeout | Optional, defaults to 30 seconds |
If any required credential is missing, the CLI exits with code 2 and writes a JSON error to stderr.
Quick examples
export VERYFI_CLIENT_ID=... VERYFI_USERNAME=... VERYFI_API_KEY=...
# Optional:
export VERYFI_CLIENT_SECRET=...
# Documents
veryfi documents process --file /tmp/receipt.jpg --category Travel --category Meals
veryfi documents process-url --file-url https://cdn.example.com/x.pdf --boost-mode --external-id ref-1
veryfi documents list --q Walgreens --created-gt 2024-01-01+00:00:00
veryfi documents get 933760836
veryfi documents update 933760836 --field category="Meals & Entertainment" --field total=11.23
veryfi documents delete 933760836
# Nested line-items / tags
veryfi documents line-items add 933760836 --field description="Extra item" --field total=5.0
veryfi documents tags add-many 933760836 --tag q1 --tag travel
# Multi-page PDF splitting
veryfi documents set split --file /tmp/multi.pdf
veryfi documents set split-url --file-url https://cdn.example.com/multi.pdf --max-pages 5
# Other resources
veryfi bank-statements process --file /tmp/stmt.pdf --category Transfer
veryfi checks process-with-remittance --file /tmp/check.pdf
veryfi business-cards process-url --file-url https://cdn.example.com/card.jpg
veryfi w2s process --file /tmp/w2.pdf
veryfi w2s set split --file /tmp/multi_w2.pdf
veryfi w8s list --created-gt 2024-01-01+00:00:00
veryfi w9s get 33333
veryfi any-docs process --blueprint my_blueprint --file /tmp/custom.pdf
veryfi classify file --file /tmp/unknown.pdf --document-type receipt --document-type invoice
You can also pipe binary file data via stdin by passing --file -:
curl -s https://cdn.example.com/r.jpg | veryfi documents process --file -
Output and exit codes
Every command emits a JSON response on stdout. Use --output raw for single-line JSON (handy for piping into jq) or --output pretty for sorted keys. Errors are emitted as JSON on stderr and the process exits with a non-zero status:
| Exit code | Meaning |
|---|---|
| 0 | Success |
| 2 | Missing credentials or invalid CLI arguments |
| 1-255 | Veryfi API error; the exit code is the HTTP status (clipped to 255) |
| 70 | Unexpected error (treat as a bug) |
The exact HTTP status is always included in the stderr payload, e.g.:
{
    "error": "Document not found",
    "status": 404,
    "exception": "ResourceNotFound"
}
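A script that drives the CLI can rely on this contract: JSON on stdout for success, JSON on stderr for failure. The wrapper below is a sketch under that assumption. The names VeryfiCliError, decode_error, parse_cli_result and run_veryfi are hypothetical, and the only payload keys used are the ones in the example above.

```python
import json
import subprocess
from typing import Any, List


class VeryfiCliError(Exception):
    """Hypothetical wrapper exception carrying the CLI's JSON error payload."""

    def __init__(self, exit_code: int, payload: dict):
        super().__init__(payload.get("error", "unknown error"))
        self.exit_code = exit_code
        self.payload = payload


def decode_error(stderr: str) -> dict:
    """Parse the stderr payload, falling back to the raw text if it is not JSON."""
    try:
        payload = json.loads(stderr)
    except json.JSONDecodeError:
        payload = None
    return payload if isinstance(payload, dict) else {"error": stderr.strip()}


def parse_cli_result(exit_code: int, stdout: str, stderr: str) -> Any:
    """Return the decoded stdout on success, else raise VeryfiCliError."""
    if exit_code == 0:
        return json.loads(stdout)
    raise VeryfiCliError(exit_code, decode_error(stderr))


def run_veryfi(args: List[str]) -> Any:
    """Run the veryfi console script with the given arguments."""
    done = subprocess.run(["veryfi", *args], capture_output=True, text=True)
    return parse_cli_result(done.returncode, done.stdout, done.stderr)
```

Because the exit code is clipped to 255, callers that need the real HTTP status should read payload["status"] from the exception rather than exit_code.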
Passing arbitrary fields
For endpoints that accept **kwargs (e.g. update_document, add_line_item, update_check), use repeatable --field KEY=VALUE flags or --json-body '<json>'. --field values are JSON-decoded when possible (so total=11.23 becomes a number, enabled=true becomes a boolean, data='{"a":1}' becomes an object) and fall back to plain strings.
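The decoding rule described above can be sketched in a few lines. This is an illustration of the documented behavior, not the CLI's actual implementation, and parse_field is a hypothetical name.

```python
import json
from typing import Any, Tuple


def parse_field(argument: str) -> Tuple[str, Any]:
    """Split KEY=VALUE and JSON-decode VALUE when it is valid JSON."""
    key, sep, raw = argument.partition("=")
    if not sep or not key:
        raise ValueError(f"expected KEY=VALUE, got {argument!r}")
    try:
        return key, json.loads(raw)  # numbers, booleans, objects, ...
    except json.JSONDecodeError:
        return key, raw  # anything else stays a plain string


print(parse_field("total=11.23"))
print(parse_field("category=Meals & Entertainment"))
```

Only the first = splits the key from the value, so values that contain = themselves are preserved.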
Discovery
Every command at every level supports --help, which lists subcommands or options with their descriptions:
veryfi --help # top-level: lists all resource groups
veryfi documents --help # group: lists process, list, get, tags, line-items, set, …
veryfi documents process --help # leaf: lists every flag with its description
For AI agents and tooling that prefer a machine-readable contract, veryfi schema emits a JSON manifest of every command, its description, and every parameter (name, type, required, repeatable). Agents can ingest this once to register Veryfi as a tool surface without parsing --help text:
veryfi schema | jq '.commands[] | {name, help}'
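The same manifest can be consumed from Python. The sketch below assumes only the shape implied by the jq filter above: a top-level commands list whose entries carry name and help keys. The helper name command_help is hypothetical.

```python
import json
from typing import Dict


def command_help(manifest_text: str) -> Dict[str, str]:
    """Map each command name to its help text from `veryfi schema` output."""
    manifest = json.loads(manifest_text)
    return {cmd["name"]: cmd.get("help", "") for cmd in manifest.get("commands", [])}
```

The input text could come from subprocess.run(["veryfi", "schema"], capture_output=True, text=True, check=True).stdout.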
Contributing
Contributions are welcome! To get started:
- Fork the repository and create your branch from master.
- Install development dependencies:
pip install -r requirements.txt
pip install black pytest responses tox
requirements.txt already includes typer, which is required for the veryfi CLI and its tests.
- Make your changes, then run the test suite:
# Run all tests
pytest
# Run tests across all supported Python versions (3.9–3.12)
tox
# Check code formatting
black --check .
# Auto-format
black .
- Open a pull request against master.
All pull requests must pass the CI checks (tests + black formatting) before merging.
Need Help?
- API documentation: docs.veryfi.com
- SDK reference: veryfi.github.io/veryfi-python
- Support: support@veryfi.com
- Bug reports / feature requests: open an issue
To learn more about Veryfi visit veryfi.com.
Changelog
See NEWS.md for a history of changes, or browse the GitHub Releases page.
License
MIT © Veryfi, Inc.