veryfi-python
veryfi is a Python SDK for communicating with the Veryfi OCR API.
Extract structured data from receipts, invoices, bank statements, checks, W-2s, W-8s, W-9s, business cards, and more with a single function call.
Full API reference: veryfi.github.io/veryfi-python
Veryfi API docs: docs.veryfi.com
Table of Contents
- Installation
- Getting Started
- Supported APIs
- Error Handling
- Contributing
- Need Help?
- Changelog
- License
Installation
Install from PyPI using pip:
pip install -U veryfi
Requires Python 3.9 or later.
Getting Started
Obtaining credentials
If you don't have a Veryfi account, register at app.veryfi.com/signup/api/.
Initialize the client
from veryfi import Client
client = Client(
client_id="your_client_id",
client_secret="your_client_secret",
username="your_username",
api_key="your_api_key",
)
Optional constructor parameters:
| Parameter | Default | Description |
|---|---|---|
| base_url | https://api.veryfi.com/api/ | Override the API base URL |
| api_version | v8 | API version string |
| timeout | 30 | Request timeout in seconds |
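To keep credentials out of source code, the constructor arguments can be assembled from environment variables. The following is a minimal sketch; the `VERYFI_*` variable names and the `client_kwargs_from_env` helper are assumptions made for illustration and are not part of the SDK.

```python
import os
from typing import Mapping, Optional


def client_kwargs_from_env(env: Optional[Mapping[str, str]] = None) -> dict:
    """Build keyword arguments for veryfi.Client from environment variables.

    The VERYFI_* names are hypothetical; the SDK itself does not read them.
    """
    env = os.environ if env is None else env
    required = {
        "client_id": "VERYFI_CLIENT_ID",
        "client_secret": "VERYFI_CLIENT_SECRET",
        "username": "VERYFI_USERNAME",
        "api_key": "VERYFI_API_KEY",
    }
    # Fail early with a clear message if any credential is absent or empty.
    missing = [var for var in required.values() if not env.get(var)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))

    kwargs = {name: env[var] for name, var in required.items()}
    # Optional override of the request timeout, in seconds.
    if env.get("VERYFI_TIMEOUT"):
        kwargs["timeout"] = int(env["VERYFI_TIMEOUT"])
    return kwargs
```

The returned dictionary can then be unpacked into the constructor, for example `Client(**client_kwargs_from_env())`.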
Supported APIs
Documents (Receipts & Invoices)
Process a receipt or invoice from a local file:
response = client.process_document(
file_path="/tmp/receipt.jpg",
categories=["Meals & Entertainment", "Travel"],
)
Process from a URL:
response = client.process_document_url(
file_url="https://cdn.example.com/invoice.pdf",
categories=["Office Supplies"],
boost_mode=True,
external_id="my-ref-001",
max_pages_to_process=5,
)
The response contains the extracted fields. A typical result looks like:
{
"id": 933760836,
"created_date": "2024-08-15 15:56:56",
"date": "2022-05-24 13:10:00",
"vendor": {"name": "Walgreens", "address": "191 E 3rd Ave, San Mateo, CA 94401, US"},
"total": 29.53,
"subtotal": 27.60,
"tax": 1.93,
"currency_code": "USD",
"category": "Personal Care",
"payment": {"type": "visa", "card_number": "1850", "display_name": "Visa ***1850"},
"line_items": [
{"description": "RED BULL ENRGY DRNK CNS 8.4OZ 6PK", "total": 8.79, "quantity": 1.0},
{"description": "COCA COLA MINICAN 7.5Z 6PK", "total": 4.99, "quantity": 1.0},
# ...
],
"status": "processed",
}
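Because the result is a plain dictionary, headline fields can be pulled out with ordinary dictionary access. The sketch below assumes a response shaped like the sample above; `summarize_document` is a hypothetical helper, not an SDK function, and it treats missing fields as absent rather than raising.

```python
def summarize_document(doc: dict) -> dict:
    """Pull a few headline fields out of a processed-document dict."""
    line_items = doc.get("line_items") or []
    # Sum the per-line totals, treating a missing or null total as zero.
    items_total = round(sum(item.get("total") or 0 for item in line_items), 2)
    return {
        "vendor": (doc.get("vendor") or {}).get("name"),
        "total": doc.get("total"),
        "currency": doc.get("currency_code"),
        "line_item_count": len(line_items),
        "line_items_total": items_total,
    }
```

Comparing `line_items_total` with `subtotal` is one way to spot documents whose line items were only partially extracted.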
Other document operations:
# List / search documents
documents = client.get_documents(q="Walgreens", created_date__gt="2024-01-01+00:00:00")
# Get a single document by ID
document = client.get_document(document_id=933760836)
# Update fields on a document
client.update_document(
document_id=933760836,
vendor={"name": "Starbucks", "address": "123 Easy St, San Francisco, CA 94158"},
category="Meals & Entertainment",
total=11.23,
)
# Delete a document
client.delete_document(document_id=933760836)
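The `created_date__gt` filter above takes a timestamp string with a `+` between the date and the time. A small helper can build that string from a `datetime`. The format is inferred from the examples in this README, and `to_date_filter` is a hypothetical name; time zone handling is not covered here.

```python
from datetime import datetime


def to_date_filter(moment: datetime) -> str:
    """Format a datetime as YYYY-MM-DD+HH:MM:SS, matching the README examples."""
    return moment.strftime("%Y-%m-%d+%H:%M:%S")
```

For example, `client.get_documents(created_date__gt=to_date_filter(datetime(2024, 1, 1)))` would be equivalent to passing the literal string used above.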
Line items
items = client.get_line_items(document_id=933760836)
client.add_line_item(document_id=933760836, payload={"description": "Extra item", "total": 5.00})
client.update_line_item(document_id=933760836, line_item_id=101, payload={"total": 6.00})
client.delete_line_item(document_id=933760836, line_item_id=101)
Tags
client.add_tag(document_id=933760836, tag_name="reimbursable")
client.add_tags(document_id=933760836, tags=["q1", "travel"])
client.get_tags(document_id=933760836)
client.delete_tags(document_id=933760836)
Split & process a multi-page PDF
response = client.split_and_process_pdf(file_path="/tmp/multi.pdf")
response = client.split_and_process_pdf_url(file_url="https://cdn.example.com/multi.pdf")
Bank Statements
Process a bank statement and extract transactions, balances, and account details:
# From a local file
response = client.process_bank_statement_document(
file_path="/tmp/statement.pdf",
categories=["Transfer", "Credit Card Payments", "Restaurants / Dining / Meals"],
)
# From a URL
response = client.process_bank_statement_document_url(
file_url="https://cdn.example.com/statement.pdf",
categories=["ATM Deposit", "Interest / Dividends", "Mortgage Payments"],
)
The categories parameter is an optional list of strings used to classify transactions. When provided, the API maps each transaction to the closest matching category.
# List statements
statements = client.get_bank_statements(
created_date__gt="2024-01-01+00:00:00",
created_date__lte="2024-12-31+23:59:59",
)
# Get a single statement
statement = client.get_bank_statement(document_id=4559568)
# Delete
client.delete_bank_statement(document_id=4559568)
Checks
# Process from file
response = client.process_check(file_path="/tmp/check.jpg")
# Process from URL
response = client.process_check_url(file_url="https://cdn.example.com/check.jpg")
# Check with remittance
response = client.process_check_with_remittance(file_path="/tmp/check_remittance.pdf")
response = client.process_check_with_remittance_url(file_url="https://cdn.example.com/check.pdf")
# List, get, update, delete
checks = client.get_checks(created_date__gt="2024-01-01+00:00:00")
check = client.get_check(document_id=12345)
client.update_check(document_id=12345, status="cleared")
client.delete_check(document_id=12345)
Business Cards
response = client.process_bussines_card_document(file_path="/tmp/card.jpg")
response = client.process_bussines_card_document_url(file_url="https://cdn.example.com/card.jpg")
cards = client.get_business_cards()
card = client.get_business_card(document_id=67890)
client.delete_business_card(document_id=67890)
W-2 Forms
response = client.process_w2_document(file_path="/tmp/w2.pdf")
response = client.process_w2_document_url(file_url="https://cdn.example.com/w2.pdf")
w2s = client.get_w2s(created_date_gt="2024-01-01+00:00:00")
w2 = client.get_w2(document_id=11111)
client.delete_w2(document_id=11111)
# Split & process a multi-W-2 PDF
response = client.split_and_process_w2(file_path="/tmp/multi_w2.pdf")
response = client.split_and_process_w2_url(file_url="https://cdn.example.com/multi_w2.pdf")
W-8 Forms
response = client.process_w8_document(file_path="/tmp/w8.pdf")
response = client.process_w8_document_url(file_url="https://cdn.example.com/w8.pdf")
w8s = client.get_w8s()
w8 = client.get_w8(document_id=22222)
client.delete_w8(document_id=22222)
W-9 Forms
response = client.process_w9_document(file_path="/tmp/w9.pdf")
response = client.process_w9_document_url(file_url="https://cdn.example.com/w9.pdf")
w9s = client.get_w9s()
w9 = client.get_w9(document_id=33333)
client.delete_w9(document_id=33333)
Any Document
Use a custom blueprint to extract fields from any document type:
response = client.process_any_document(
blueprint_name="my_custom_blueprint",
file_path="/tmp/custom_doc.pdf",
)
response = client.process_any_document_url(
blueprint_name="my_custom_blueprint",
file_url="https://cdn.example.com/custom_doc.pdf",
)
docs = client.get_any_documents(created_date__gt="2024-01-01+00:00:00")
doc = client.get_any_document(document_id=44444)
client.delete_any_document(document_id=44444)
Classify
Classify a document to determine its type before processing:
response = client.classify_document(
file_path="/tmp/unknown.pdf",
document_types=["receipt", "invoice", "bank_statement"],
)
response = client.classify_document_url(
file_url="https://cdn.example.com/unknown.pdf",
document_types=["w2", "w9"],
)
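Once a document type is known, it can be routed to the matching processing method. The sketch below maps type labels to the client method names used elsewhere in this README. The shape of the classify response is not shown here, so extracting the type label from it is left to the caller; `PROCESSORS` and `process_by_type` are hypothetical names, and the labels are assumed to match the `document_types` strings above.

```python
from typing import Any

# Type label -> name of a Client method shown in this README.
PROCESSORS = {
    "receipt": "process_document",
    "invoice": "process_document",
    "bank_statement": "process_bank_statement_document",
    "w2": "process_w2_document",
    "w9": "process_w9_document",
}


def process_by_type(client: Any, document_type: str, file_path: str) -> Any:
    """Call the processing method registered for document_type."""
    try:
        method_name = PROCESSORS[document_type]
    except KeyError:
        raise ValueError(f"No processor configured for {document_type!r}") from None
    # Look the method up by name so the mapping stays plain data.
    return getattr(client, method_name)(file_path=file_path)
```

Keeping the mapping as data makes it easy to add or remove document types without touching the dispatch logic.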
Error Handling
All API errors raise a VeryfiClientError (or a more specific subclass). Import the exceptions you need:
from veryfi.errors import (
VeryfiClientError,
UnauthorizedAccessToken,
BadRequest,
ResourceNotFound,
AccessLimitReached,
)
try:
    response = client.process_document(file_path="/tmp/receipt.jpg")
except UnauthorizedAccessToken:
    print("Check your client_id, username, and api_key.")
except ResourceNotFound:
    print("The requested document does not exist.")
except AccessLimitReached:
    print("API rate limit reached. Please wait before retrying.")
except BadRequest as e:
    print(f"Bad request: {e}")
except VeryfiClientError as e:
    print(f"Unexpected error (HTTP {e.status}): {e}")
| Exception | HTTP status | Cause |
|---|---|---|
| UnauthorizedAccessToken | 401 | Invalid or missing credentials |
| BadRequest | 400 | Malformed request or missing required fields |
| ResourceNotFound | 404 | Document ID does not exist |
| UnexpectedHTTPMethod | 405 | Wrong HTTP method used |
| AccessLimitReached | 409 | Rate limit exceeded |
| InternalError | 500 | Server-side error |
| ServiceUnavailable | 503 | Veryfi service is temporarily down |
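Rate-limit and temporary-outage errors are usually worth retrying after a pause. The following is a generic retry sketch with exponential backoff. It is not part of the SDK: `call_with_retry` is a hypothetical helper, and the exception classes to retry on are passed in by the caller.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retry(
    func: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...],
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, retrying on the given exceptions with exponential backoff."""
    for attempt in range(attempts):
        try:
            return func()
        except retry_on:
            # Out of attempts: let the last error propagate to the caller.
            if attempt == attempts - 1:
                raise
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * 2 ** attempt)
    raise ValueError("attempts must be at least 1")
```

With the SDK, a call might look like `call_with_retry(lambda: client.process_document(file_path="/tmp/receipt.jpg"), retry_on=(AccessLimitReached, ServiceUnavailable))`, assuming `ServiceUnavailable` can be imported from `veryfi.errors` in the same way as the exceptions shown above.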
Contributing
Contributions are welcome! To get started:
- Fork the repository and create your branch from master.
- Install development dependencies:
pip install -r requirements.txt
pip install black pytest responses tox
- Make your changes, then run the test suite:
# Run all tests
pytest
# Run tests across all supported Python versions (3.9–3.12)
tox
# Check code formatting
black --check .
# Auto-format
black .
- Open a pull request against master.
All pull requests must pass the CI checks (tests + black formatting) before merging.
Need Help?
- API documentation: docs.veryfi.com
- SDK reference: veryfi.github.io/veryfi-python
- Support: support@veryfi.com
- Bug reports / feature requests: open an issue
To learn more about Veryfi visit veryfi.com.
Changelog
See NEWS.md for a history of changes, or browse the GitHub Releases page.
License
MIT © Veryfi, Inc.