A Python package to interact with DocuProx API

DocuProx Package

A Python package to interact with the DocuProx API for processing documents using templates.

Installation

pip install docuprox

Configuration

Create a .env file in your project root with your API credentials:

DOCUPROX_API_URL=https://api.docuprox.com/v1
DOCUPROX_API_KEY=your-api-key-here

Or set environment variables directly:

export DOCUPROX_API_URL=https://api.docuprox.com/v1
export DOCUPROX_API_KEY=your-api-key-here
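
If you want to check these settings yourself before constructing the client, a minimal sketch follows. The helper name load_docuprox_settings is hypothetical and not part of the package. It reads process environment variables only (it does not parse a .env file), uses the two variable names and the default URL shown above, and assumes a missing key should be treated as an error.

```python
import os

DEFAULT_API_URL = "https://api.docuprox.com/v1"  # default URL documented above


def load_docuprox_settings(environ=None):
    """Return (api_url, api_key) from the environment (hypothetical helper)."""
    env = os.environ if environ is None else environ
    api_url = env.get("DOCUPROX_API_URL", DEFAULT_API_URL)
    api_key = env.get("DOCUPROX_API_KEY")
    if not api_key:
        # Assumption: a missing key is an error worth failing on early.
        raise ValueError("DOCUPROX_API_KEY is not set")
    return api_url, api_key
```

The returned pair can then be passed explicitly, for example Docuprox(api_url=api_url, api_key=api_key).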

Usage

from docuprox import Docuprox

# Initialize the client (API key required, can be set via DOCUPROX_API_KEY env var)
client = Docuprox(api_key="your-api-key-here")  # Uses default URL: https://api.docuprox.com/v1

# Or set custom URL and API key
client = Docuprox(api_url="https://your-custom-api.com/v1", api_key="your-api-key-here")

# Or use environment variables (recommended for production)
# Set DOCUPROX_API_URL and DOCUPROX_API_KEY environment variables
client = Docuprox()  # Will use env vars or defaults

# Process a file with a template (sends as multipart/form-data)
template_id = "your-template-uuid-here"
result = client.processfile("path/to/your/file.pdf", template_id)
print(result)

# Process base64 data with a template (sends as JSON)
base64_string = "your_base64_encoded_data_here"
result = client.processbase64(base64_string, template_id)
print(result)

# Process with static values (optional key-value pairs)
# Note: Static values are returned in the response. By default, each one takes the value
# set in the template UI; any static_values you pass here override those UI defaults.
static_values = {
    "company_name": "Acme Corp",
    "invoice_number": "INV-2024-001"
}
result = client.processfile("path/to/your/file.pdf", template_id, static_values=static_values)
print(result)
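
The base64_string placeholder above has to come from somewhere. One way to produce it from a local file is sketched below using only the standard library. The helper name file_to_base64 is hypothetical, and the sketch assumes the API expects plain standard base64 with no data-URI prefix, which this page does not state.

```python
import base64
from pathlib import Path


def file_to_base64(path):
    """Read a file and return its contents as a base64-encoded ASCII string."""
    return base64.b64encode(Path(path).read_bytes()).decode("ascii")
```

The result could then be passed as the first argument, for example client.processbase64(file_to_base64("path/to/your/file.pdf"), template_id).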

Batch Processing with Zip Files

You can process multiple documents at once by uploading a zip file. All images and PDFs must be placed in the root of the zip file (not in subdirectories).

Zip File Structure:

documents.zip
├── invoice1.pdf
├── invoice2.pdf
├── receipt1.jpg
├── receipt2.png
└── document.pdf

✓ Correct - All files in root

documents.zip
└── folder/
    ├── invoice1.pdf
    └── invoice2.pdf

✗ Incorrect - Files in subdirectory

Usage:

# Process a zip file containing multiple documents
result = client.processfile("documents.zip", template_id)

# Or with static values
result = client.processfile(
    "documents.zip",
    template_id,
    static_values={"batch_id": "BATCH-001"}
)

Important:

  • All image files (JPG, PNG, etc.) and PDF files must be in the root directory of the zip
  • Files in subdirectories will be ignored
  • Supported formats: PDF, JPG, JPEG, PNG, TIFF, etc.
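
A zip that satisfies the root-only rule can be built with the standard zipfile module. The following is a minimal sketch, not part of the package: build_flat_zip is a hypothetical helper that stores each file under its base name so nothing ends up in a subdirectory, and it rejects name collisions rather than silently overwriting.

```python
import zipfile
from pathlib import Path


def build_flat_zip(file_paths, zip_path):
    """Write the given files into the root of a new zip archive."""
    seen = set()
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for file_path in file_paths:
            name = Path(file_path).name  # drop any directory components
            if name in seen:
                raise ValueError(f"duplicate file name in zip root: {name}")
            seen.add(name)
            archive.write(file_path, arcname=name)
    return zip_path
```

The resulting archive path can be passed to client.processfile in the same way as documents.zip above.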

Static Values

Static values allow you to pass predefined key-value pairs to the processing API. These values will be included in the returned response.

Priority:

  • Default: If you do not provide static_values, the response includes the values set in the template UI
  • Override: If you provide the static_values parameter, its values override the UI defaults

Example:

Suppose company_name has the default value "Docuprox" in the template UI.

# Without static_values - uses UI default
result = client.processfile("file.pdf", template_id)
# Response will include: company_name = "Docuprox" (from UI)

# With static_values - overrides UI default
static_values = {
    "company_name": "Acme Corp",  # Overrides "Docuprox" from UI
    "invoice_number": "INV-2024-001",
    "date": "2024-01-15"
}
result = client.processfile("file.pdf", template_id, static_values=static_values)
# Response will include: company_name = "Acme Corp" (from static_values)

# Works with all processing methods
result = client.processbase64(base64_string, template_id, static_values=static_values)
result = client.processjobfile("file.pdf", template_id, static_values=static_values)
result = client.processjobbase64(base64_string, template_id, static_values=static_values)
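
The priority rule can be pictured as a dictionary merge. The sketch below is only a local model of the behavior described above, not the service's implementation. resolve_static_values is a hypothetical name, and it assumes the override applies per key, so UI defaults you do not override are kept.

```python
def resolve_static_values(ui_defaults, static_values=None):
    """Local model of the documented priority: request values win over UI defaults."""
    resolved = dict(ui_defaults)
    if static_values:
        resolved.update(static_values)  # assumption: override is applied per key
    return resolved


ui_defaults = {"company_name": "Docuprox"}
print(resolve_static_values(ui_defaults))
# {'company_name': 'Docuprox'}
print(resolve_static_values(ui_defaults, {"company_name": "Acme Corp"}))
# {'company_name': 'Acme Corp'}
```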

Async Job Processing

For long-running or batch processing tasks, use the async job workflow:

from docuprox import Docuprox

client = Docuprox(api_key="your-api-key-here")
template_id = "your-template-uuid-here"

# Step 1: Submit a job (returns immediately with job_id)
job_response = client.processjobfile("path/to/your/file.pdf", template_id)
job_id = job_response['job_id']
print(f"Job submitted: {job_id}")

# Or submit with base64 data
base64_string = "your_base64_encoded_data_here"
job_response = client.processjobbase64(base64_string, template_id)
job_id = job_response['job_id']

# Step 2: Check job status
status = client.getjobstatus(job_id)
print(f"Job status: {status}")

# Step 3: Retrieve results when job is complete (default: JSON format)
results = client.getjobresults(job_id)
print(results)

# Or get results in CSV format
results_csv = client.getjobresults(job_id, result_format="csv")
print(results_csv)
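
Steps 2 and 3 above are usually joined by a polling loop, since results are only available once the job has finished. A minimal sketch follows. wait_for_job is a hypothetical helper, not part of the package, and it assumes the status response is a dict with a "status" key holding one of the states listed in the API section (pending, processing, completed, failed). Adjust the key if the real response differs.

```python
import time


def wait_for_job(client, job_id, poll_seconds=2.0, timeout_seconds=300.0):
    """Poll getjobstatus until the job finishes, then return getjobresults."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        response = client.getjobstatus(job_id)
        # Assumption: the status response is a dict with a "status" key.
        state = str(response.get("status", "")).lower()
        if state == "completed":
            return client.getjobresults(job_id)
        if state == "failed":
            raise RuntimeError(f"Job {job_id} failed: {response}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Job {job_id} not finished after {timeout_seconds}s")
        time.sleep(poll_seconds)
```

With the client from the example above, this would be called as results = wait_for_job(client, job_id).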

AI Agent Processing

Use the AI agent endpoint for intelligent document processing with custom prompts:

from docuprox import Docuprox

client = Docuprox(api_key="your-api-key-here")

# Define your prompt configuration (field: instruction mapping)
prompt_json = {
    "invoice_number": "Extract the invoice number",
    "date": "Extract the invoice date",
    "total_amount": "Extract the total amount",
    "vendor_name": "Extract the vendor name"
}

# Process a file with AI agent
result = client.processagentfile(
    file_path="path/to/invoice.pdf",
    prompt_json=prompt_json,
    document_type="invoice",
    custom_instructions="Focus on itemized line items",  # Optional
    static_values={"company_name": "Acme Corp"}  # Optional
)
print(result)

# Or process base64 data with AI agent
base64_string = "your_base64_encoded_data_here"
result = client.processagentbase64(
    base64_data=base64_string,
    prompt_json=prompt_json,
    document_type="invoice",
    custom_instructions="Extract vendor details",  # Optional
    static_values={"invoice_prefix": "INV-2024"}  # Optional
)
print(result)

API

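Docuprox(api_url=None, api_key=None)

  • api_url: Optional. The base URL of the DocuProx API. Falls back to the DOCUPROX_API_URL environment variable, then to the default https://api.docuprox.com/v1.
  • api_key: Your DocuProx API key. Required, either passed here or set via the DOCUPROX_API_KEY environment variable.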

processfile(file_path, template_id, static_values=None)

Processes a file by reading it, encoding to base64, and sending to the /process endpoint with the specified template.

  • file_path: Path to the file to process.
  • template_id: UUID string of the template to use for processing.
  • static_values: Optional dictionary of static key-value pairs to include in processing.
  • Returns: JSON response from the API containing document data.
  • Raises: ValueError if file not found or API error.

processbase64(base64_data, template_id, static_values=None)

Processes a base64 encoded string by sending it to the /process endpoint with the specified template.

  • base64_data: Base64 encoded string of the image/document.
  • template_id: UUID string of the template to use for processing.
  • static_values: Optional dictionary of static key-value pairs to include in processing.
  • Returns: JSON response from the API containing document data.
  • Raises: ValueError if API error.

processjobfile(file_path, template_id, static_values=None)

Submits an async processing job by sending a file to the /process-job endpoint. Returns immediately with a job_id.

  • file_path: Path to the file to process.
  • template_id: UUID string of the template to use for processing.
  • static_values: Optional dictionary of static key-value pairs to include in processing.
  • Returns: JSON response from the API containing job_id and status.
  • Raises: ValueError if file not found or API error.

processjobbase64(base64_data, template_id, static_values=None)

Submits an async processing job with base64 encoded data to the /process-job endpoint. Returns immediately with a job_id.

  • base64_data: Base64 encoded string of the image/document.
  • template_id: UUID string of the template to use for processing.
  • static_values: Optional dictionary of static key-value pairs to include in processing.
  • Returns: JSON response from the API containing job_id and status.
  • Raises: ValueError if API error.

getjobstatus(job_id)

Checks the status of a processing job.

  • job_id: UUID string of the job to check.
  • Returns: JSON response from the API with job status information (e.g., pending, processing, completed, failed).
  • Raises: ValueError if job_id is invalid or API error.

getjobresults(job_id, result_format="json")

Retrieves the results of a completed processing job.

  • job_id: UUID string of the job to retrieve results for.
  • result_format: Format of results - 'json' or 'csv' (default: 'json').
  • Returns: JSON response from the API with job results in the specified format.
  • Raises: ValueError if job_id is invalid, format is invalid, or API error.

processagentfile(file_path, prompt_json, document_type, custom_instructions=None, static_values=None)

Processes a file using AI agent for intelligent document extraction.

  • file_path: Path to the file to process.
  • prompt_json: JSON object/dict containing the prompt configuration with fields and instructions.
  • document_type: Type of document being processed (e.g., "invoice", "receipt", "contract").
  • custom_instructions: Optional custom instructions for processing.
  • static_values: Optional dictionary of static key-value pairs to include in processing.
  • Returns: JSON response from the API with AI-extracted data.
  • Raises: ValueError if file not found or API error.

processagentbase64(base64_data, prompt_json, document_type, custom_instructions=None, static_values=None)

Processes base64 encoded data using AI agent for intelligent document extraction.

  • base64_data: Base64 encoded string of the image/document.
  • prompt_json: JSON object/dict containing the prompt configuration with fields and instructions.
  • document_type: Type of document being processed (e.g., "invoice", "receipt", "contract").
  • custom_instructions: Optional custom instructions for processing.
  • static_values: Optional dictionary of static key-value pairs to include in processing.
  • Returns: JSON response from the API with AI-extracted data.
  • Raises: ValueError if API error.
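
Every method above is documented as raising ValueError on failure, so callers may want a single place to catch it. The sketch below shows one possible pattern. safe_call is a hypothetical wrapper, not part of the package, and it deliberately catches only the documented ValueError so that unrelated exceptions still surface.

```python
def safe_call(func, *args, **kwargs):
    """Call a client method and return (result, error_message)."""
    try:
        return func(*args, **kwargs), None
    except ValueError as exc:  # documented error type for file and API failures
        return None, str(exc)
```

For example, result, error = safe_call(client.processfile, "file.pdf", template_id) leaves error as None on success and result as None on failure.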
