Python client for the Simba document processing API

Simba Client (formerly simba_sdk)

Python client for interacting with the Simba document processing API.

Installation

Using pip

pip install simba-client

Development Installation

For development, install the package from a clone of the repository:

# Clone the repository
git clone https://github.com/yourusername/simba-client.git
cd simba-client

# Install using Poetry
poetry install

# Alternatively, install in development mode with pip
pip install -e .

Quick Start

from simba_sdk import SimbaClient

# Initialize the client
client = SimbaClient(
    api_url="https://api.simba.example.com",
    api_key="your-api-key"
)

# Upload a document from a file path
# (create() takes a path; create_from_file() expects an open file object)
doc_result = client.document.create("path/to/your/document.pdf")
document_id = doc_result["id"]

# Parse the document
# Use synchronous parsing for immediate results
parse_result = client.parser.parse_document(document_id, sync=True)

# Or asynchronous parsing for background processing
async_result = client.parser.parse_document(document_id, sync=False)
task_id = async_result["task_id"]

# Check the status of an asynchronous task
task_status = client.parser.get_task_status(task_id)

# Extract tables from a document
tables = client.parser.extract_tables(document_id)
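
The Quick Start checks an asynchronous task only once. In practice the status usually has to be polled until the task finishes. The following is a minimal sketch of such a loop, not part of the client itself: wait_for_task is a hypothetical helper, and it assumes the status payload is a dict with a "status" key whose terminal values are "completed" or "failed". Those names are guesses and should be adjusted to the payload the API actually returns.

```python
import time


def wait_for_task(get_status, task_id, interval=2.0, timeout=120.0,
                  done_states=("completed", "failed"), sleep=time.sleep):
    """Poll get_status(task_id) until the task reaches a terminal state.

    Assumption: the status payload is a dict with a "status" key. The key
    and the state names in done_states are hypothetical placeholders.
    """
    deadline = time.monotonic() + timeout
    while True:
        status = get_status(task_id)
        if status.get("status") in done_states:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} still pending after {timeout} s")
        sleep(interval)
```

With a real client this would be called as wait_for_task(client.parser.get_task_status, task_id). Passing the status function in as an argument keeps the helper independent of the client and easy to test.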

Features

  • Document Management (upload, retrieve, list, delete)
  • Document Parsing (synchronous and asynchronous)
  • Feature Extraction (tables, entities, forms, text)
  • Natural Language Querying

Documentation

For detailed documentation, visit simba-client.readthedocs.io or see the docs directory in this repository.

API Reference

SimbaClient

The main client for interacting with the Simba API.

client = SimbaClient(
    api_url="https://api.simba.example.com",
    api_key="your-api-key",
    timeout=60
)
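
To keep the API key out of source code, the constructor arguments can be read from the environment. This is a sketch under assumptions: client_settings_from_env is a hypothetical helper, and the variable names SIMBA_API_URL, SIMBA_API_KEY, and SIMBA_TIMEOUT are invented for illustration; the client does not define them. The default of 60 mirrors the example above, and the timeout is presumably in seconds.

```python
import os


def client_settings_from_env(environ=None):
    """Build SimbaClient keyword arguments from environment variables.

    The variable names are hypothetical; pick whatever fits your deployment.
    """
    env = os.environ if environ is None else environ
    missing = [name for name in ("SIMBA_API_URL", "SIMBA_API_KEY") if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {
        "api_url": env["SIMBA_API_URL"],
        "api_key": env["SIMBA_API_KEY"],
        "timeout": int(env.get("SIMBA_TIMEOUT", "60")),
    }
```

The result can then be unpacked into the constructor: SimbaClient(**client_settings_from_env()).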

DocumentManager

Handles document operations (accessible via client.document).

  • create(file_path): Upload a document from a file path
  • create_from_file(file): Upload a document from a file object
  • get(document_id): Retrieve a document by ID
  • list(): List all documents
  • delete(document_id): Delete a document
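
As an illustration of how these methods combine, the sketch below uploads every PDF in a directory with create(file_path). upload_directory is a hypothetical helper, not part of the client. It assumes create returns a dict with an "id" key, as the upload result in the Quick Start does.

```python
from pathlib import Path


def upload_directory(documents, directory, pattern="*.pdf"):
    """Upload every matching file and return {file name: document id}.

    `documents` is any object with a create(file_path) method, for example
    client.document. Assumes create() returns a dict containing "id".
    """
    uploaded = {}
    for path in sorted(Path(directory).glob(pattern)):
        result = documents.create(str(path))
        uploaded[path.name] = result["id"]
    return uploaded
```

With a real client: ids = upload_directory(client.document, "invoices/"). The returned ids can later be passed to get or delete.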

ParserManager

Handles document parsing operations (accessible via client.parser).

  • parse_document(document_id, sync=True): Parse a document
  • extract_tables(document_id): Extract tables from a document
  • extract_entities(document_id): Extract entities from a document
  • extract_forms(document_id): Extract form fields from a document
  • extract_text(document_id): Extract text content from a document
  • parse_query(document_id, query): Extract information based on a natural language query
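
Since the four extract_* methods share the same signature, they can be called in a loop. The sketch below collects their results in one dict. extract_all is a hypothetical helper, and the shape of each result is whatever the API returns; nothing about it is assumed here.

```python
def extract_all(parser, document_id, kinds=("tables", "entities", "forms", "text")):
    """Call parser.extract_<kind>(document_id) for each kind and collect the results.

    `parser` is any object exposing the extract_* methods listed above,
    for example client.parser.
    """
    results = {}
    for kind in kinds:
        extractor = getattr(parser, f"extract_{kind}")
        results[kind] = extractor(document_id)
    return results
```

With a real client: features = extract_all(client.parser, document_id). Pass a shorter kinds tuple to skip extractors that are not needed.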

Development

Setup Development Environment

# Install Poetry if you don't have it
curl -sSL https://install.python-poetry.org | python3 -

# Install dependencies including development dependencies
poetry install

Running Tests

# Run all tests
poetry run pytest

# Run with coverage
poetry run pytest --cov=simba_sdk

# Run specific tests
poetry run pytest tests/test_document.py
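
Code that uses the client can be tested without network access by substituting a mock. The following sketch is not taken from the project's test suite: upload_and_parse is a hypothetical function under test, and the return values are placeholders rather than real API payloads.

```python
from unittest.mock import MagicMock


def upload_and_parse(client, file_path):
    """Upload a file, parse it synchronously, and return the parse result."""
    document_id = client.document.create(file_path)["id"]
    return client.parser.parse_document(document_id, sync=True)


def test_upload_and_parse_passes_document_id():
    client = MagicMock()  # stand-in for SimbaClient; no network access
    client.document.create.return_value = {"id": "doc-123"}  # placeholder payload
    client.parser.parse_document.return_value = {"status": "ok"}  # placeholder payload

    assert upload_and_parse(client, "report.pdf") == {"status": "ok"}
    client.document.create.assert_called_once_with("report.pdf")
    client.parser.parse_document.assert_called_once_with("doc-123", sync=True)
```

pytest collects the test_ function automatically when the file is placed under tests/.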

Building Documentation

# Build the documentation
poetry run mkdocs build

# Serve the documentation locally
poetry run mkdocs serve

License

This project is licensed under the MIT License - see the LICENSE file for details.
