Python client for the Simba document processing API
Simba Client (formerly simba_sdk)
Python client for interacting with the Simba document processing API.
Installation
Using pip
pip install simba-client
Development Installation
For development purposes, you can install the package directly from the repository:
# Clone the repository
git clone https://github.com/yourusername/simba-client.git
cd simba-client
# Install using Poetry
poetry install
# Alternatively, install in development mode with pip
pip install -e .
Quick Start
from simba_sdk import SimbaClient
# Initialize the client
client = SimbaClient(
    api_url="https://api.simba.example.com",
    api_key="your-api-key",
)
# Upload a document
doc_result = client.document.create("path/to/your/document.pdf")
document_id = doc_result["id"]
# Parse the document
# Use synchronous parsing for immediate results
parse_result = client.parser.parse_document(document_id, sync=True)
# Or asynchronous parsing for background processing
async_result = client.parser.parse_document(document_id, sync=False)
task_id = async_result["task_id"]
# Check the status of an asynchronous task
task_status = client.parser.get_task_status(task_id)
# Extract tables from a document
tables = client.parser.extract_tables(document_id)
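The asynchronous path above returns a task ID that has to be polled until the job finishes. A minimal polling sketch follows. It assumes the status payload is a dict with a "status" key whose terminal values are "completed" or "failed"; the README does not confirm either, so adjust them to whatever get_task_status actually returns. wait_for_task is a hypothetical helper, not part of the client.

```python
import time


def wait_for_task(get_status, task_id, interval=2.0, timeout=120.0,
                  done_states=("completed", "failed")):
    """Poll get_status(task_id) until it reports a terminal state.

    ASSUMPTION: the status payload is a dict with a "status" key and the
    terminal values listed in done_states. Neither is confirmed by the README.
    """
    deadline = time.monotonic() + timeout
    while True:
        status = get_status(task_id)
        if status.get("status") in done_states:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} did not finish within {timeout}s")
        # Wait before asking again so the API is not hammered.
        time.sleep(interval)
```

With a real client this might be called as wait_for_task(client.parser.get_task_status, task_id). Passing the status function in as an argument keeps the helper independent of the client and easy to exercise with a stub.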
Features
- Document Management (upload, retrieve, list, delete)
- Document Parsing (synchronous and asynchronous)
- Feature Extraction (tables, entities, forms, text)
- Natural Language Querying
Documentation
For detailed documentation, please visit simba-client.readthedocs.io or refer to the docs directory in this repository.
API Reference
SimbaClient
The main client for interacting with the Simba API.
client = SimbaClient(
    api_url="https://api.simba.example.com",
    api_key="your-api-key",
    timeout=60,
)
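Hard-coding the API key as shown is fine for a demo, but in practice it is usually read from the environment. The sketch below builds the three constructor arguments from environment variables. The variable names SIMBA_API_URL, SIMBA_API_KEY and SIMBA_TIMEOUT, and the helper simba_settings_from_env, are assumptions made for this example; the library is not known to read them on its own.

```python
import os


def simba_settings_from_env(env=None):
    """Build SimbaClient keyword arguments from environment variables.

    ASSUMPTION: SIMBA_API_URL, SIMBA_API_KEY and SIMBA_TIMEOUT are
    illustrative names chosen for this sketch, not ones the library defines.
    """
    env = os.environ if env is None else env
    api_key = env.get("SIMBA_API_KEY")
    if not api_key:
        raise RuntimeError("SIMBA_API_KEY is not set")
    return {
        # Fall back to the placeholder URL used elsewhere in this README.
        "api_url": env.get("SIMBA_API_URL", "https://api.simba.example.com"),
        "api_key": api_key,
        "timeout": int(env.get("SIMBA_TIMEOUT", "60")),
    }
```

The result can then be unpacked into the constructor, for example client = SimbaClient(**simba_settings_from_env()).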
DocumentManager
Handles document operations (accessible via client.document).
- create(file_path): Upload a document from a file path
- create_from_file(file): Upload a document from a file object
- get(document_id): Retrieve a document by ID
- list(): List all documents
- delete(document_id): Delete a document
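As an illustration of how these methods combine, the sketch below uploads several files and records the ID of each. It assumes create() returns a dict with an "id" key, as the Quick Start shows for its upload call; upload_all is a hypothetical helper, not part of the client.

```python
def upload_all(client, paths):
    """Upload each path and return a {path: document_id} mapping.

    ASSUMPTION: client.document.create(path) returns a dict with an "id"
    key, matching the upload result shown in the Quick Start.
    """
    ids = {}
    for path in paths:
        result = client.document.create(path)
        ids[path] = result["id"]
    return ids
```

The returned IDs can be passed to get(), delete(), or any of the parser methods.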
ParserManager
Handles document parsing operations (accessible via client.parser).
- parse_document(document_id, sync=True): Parse a document
- extract_tables(document_id): Extract tables from a document
- extract_entities(document_id): Extract entities from a document
- extract_forms(document_id): Extract form fields from a document
- extract_text(document_id): Extract text content from a document
- parse_query(document_id, query): Extract information based on a natural language query
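Since the four extract_* methods share the same signature, they can be driven from a single loop. The sketch below collects their results into one dict. It makes no assumption about what each method returns and stores the value as-is; extract_all and the EXTRACTORS tuple are names invented for this example.

```python
EXTRACTORS = ("tables", "entities", "forms", "text")


def extract_all(client, document_id, kinds=EXTRACTORS):
    """Call client.parser.extract_<kind>(document_id) for each kind.

    Returns a dict keyed by kind. Return values are stored untouched
    because the README does not document their shape.
    """
    results = {}
    for kind in kinds:
        method = getattr(client.parser, f"extract_{kind}")
        results[kind] = method(document_id)
    return results
```

Passing kinds=("tables", "text") restricts the run to a subset, which may matter if each extraction is billed or slow.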
Development
Setup Development Environment
# Install Poetry if you don't have it
curl -sSL https://install.python-poetry.org | python3 -
# Install dependencies including development dependencies
poetry install
Running Tests
# Run all tests
poetry run pytest
# Run with coverage
poetry run pytest --cov=simba_sdk
# Run specific tests
poetry run pytest tests/test_document.py
Building Documentation
# Build the documentation
poetry run mkdocs build
# Serve the documentation locally
poetry run mkdocs serve
License
This project is licensed under the MIT License - see the LICENSE file for details.