
Splore Python SDK

The Splore Python SDK simplifies the process of interacting with the Splore document processing platform. Use it to upload files, process documents, and retrieve extracted data with minimal setup.



🚀 Features

  • Agent Management: Create, update, retrieve, and delete agents.
  • File Upload: Upload documents for processing.
  • Extractions: Extract structured data from documents.
  • Search: Perform web searches and retrieve search history.
  • AWS S3 Integration: Process files directly from S3.
  • Task Monitoring: Track the progress of extraction jobs.
  • Error Handling: Provides meaningful errors and retry mechanisms.
  • Python 3.7+ Compatibility: supports Python 3.7 (tested from 3.7.17) and above.

📥 Installation

Install the SDK via pip:

pip install splore-sdk

For optional example dependencies:

pip install splore-sdk[examples]

🏁 Getting Started

Prerequisites

  1. API Key and Base ID: Obtain these from the Splore console.
  2. Python 3.7+: Ensure Python is installed.

Quick Start Example

from splore_sdk import SploreSDK

# Initialize SDK
sdk = SploreSDK(api_key="YOUR_API_KEY", base_id="YOUR_BASE_ID")

# Initialize Agent for extraction
extraction_agent = sdk.init_agent(agent_id="YOUR_AGENT_ID")

# Basic extraction flow
extracted_response = extraction_agent.extract(file_path="absolute_file_path")
print(extracted_response)

📦 Modules Overview

🔹 Agent Management

Manage agents for document processing.

Example Usage

from splore_sdk import SploreSDK

# Initialize SDK
sdk = SploreSDK(api_key="YOUR_API_KEY", base_id="YOUR_BASE_ID")

# Create an agent
agent_payload = {"name": "Test Agent", "config": {"key": "value"}}
create_response = sdk.create_agent(agent_payload)
print("Create Agent Response:", create_response)

# Get agent details
agent_id = create_response.get("id")
get_response = sdk.agents.get_agents(agentId=agent_id)
print("Get Agent Response:", get_response)

# Get all agents
all_agents = sdk.agents.get_agents()
print("All Agents:", all_agents)

# Update the agent
update_payload = {"name": "Updated Agent Name"}
update_response = sdk.agents.update_agent(agent_payload=update_payload)
print("Update Agent Response:", update_response)

# Delete the agent
delete_response = sdk.agents.delete_agents(agentId=agent_id)
print("Delete Agent Response:", delete_response)

🔹 Extractions

Handle document processing and extraction.

Example Usage

from splore_sdk import SploreSDK
from time import sleep

# Initialize SDK
sdk = SploreSDK(api_key="YOUR_API_KEY", base_id="YOUR_BASE_ID")

# Get all agents
agents = sdk.agents.get_agents()
agent_id = agents[0]["id"]  # Adjust as needed

# Initialize agent
extraction_agent = sdk.init_agent(agent_id=agent_id)

# Upload file
upload_response = extraction_agent.file_uploader.upload_file(file_path="path/to/file.pdf")
file_id = upload_response
print("File uploaded with ID:", file_id)

# Start extraction
extraction_agent.extractions.start(file_id=file_id)

# Monitor extraction status
while True:
    status = extraction_agent.extractions.processing_status(file_id=file_id)
    if status.get("fileProcessingStatus") == "COMPLETED":
        break
    sleep(10)  # Wait before checking again

# Retrieve extracted data
extracted_data = extraction_agent.extractions.extracted_response(file_id=file_id)
print("Extracted Data:", extracted_data)

🔹 Search

Perform web searches and manage search history.

Example Usage

from splore_sdk import SploreSDK

# Initialize SDK
sdk = SploreSDK(api_key="YOUR_API_KEY", base_id="YOUR_BASE_ID")

# Initialize agent
agent_id = "YOUR_AGENT_ID"
search_agent = sdk.init_agent(agent_id=agent_id)

# Perform a search
search_results = search_agent.search.search(query="artificial intelligence", count=5, engine="google")
print("Search Results:", search_results)

# Get search history
history = search_agent.search.get_history(page=0, size=10)
print("Search History:", history)

🔹 File Upload

Upload files to Splore for processing.

Example Usage

from splore_sdk import SploreSDK

# Initialize SDK
sdk = SploreSDK(api_key="YOUR_API_KEY", base_id="YOUR_BASE_ID")

# Upload file with metadata
metadata = {
    "file_name": "document.pdf",
    "custom_extraction": "false",
    "is_data_file": "true"
}

with open("path/to/file.pdf", "rb") as file:
    response = sdk.file_uploader.upload_file(file_stream=file, metadata=metadata)
    print("Upload Response:", response)

🔹 AWS Integration

Download files from AWS S3 for extraction.

Example Usage

from splore_sdk import SploreSDK
from examples.aws import download_from_s3

# Initialize SDK
sdk = SploreSDK(api_key="YOUR_API_KEY", base_id="YOUR_BASE_ID")

# Initialize extraction agent
extraction_agent = sdk.init_agent(agent_id="YOUR_AGENT_ID")

# Create a temporary file destination
file_ref = sdk.file_uploader.create_temp_file_destination(file_extension=".pdf")
s3_uri = "s3://abc/def/abc.pdf"

# Download file from S3
download_from_s3(s3_uri, file_ref)

# Start extraction
response = extraction_agent.extract(file_path=file_ref)
print("Extraction Response:", response)

⚙️ Advanced Usage

🔸 Polling Interval Configuration

Customize the polling interval for extraction status checks.

from time import sleep

while True:
    status = extraction_agent.extractions.processing_status(file_id=file_id)
    if status.get("fileProcessingStatus") == "COMPLETED":
        break
    sleep(5)  # Set custom polling interval
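A fixed `while True` loop never gives up if a job stalls. A minimal sketch of a reusable polling helper with a timeout and exponential backoff is shown below; `poll_until` and its parameters are illustrative and not part of the SDK, while `processing_status` and the `"COMPLETED"` status value come from the examples above.

```python
import time

def poll_until(check, interval=5.0, timeout=300.0, backoff=1.5, max_interval=60.0):
    """Call `check()` until it returns True, sleeping between attempts.

    The sleep grows by `backoff` each round (capped at `max_interval`)
    so long-running jobs are not polled too aggressively. Raises
    TimeoutError if `timeout` seconds elapse without success.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if check():
            return True
        time.sleep(min(interval, max(0.0, deadline - time.monotonic())))
        interval = min(interval * backoff, max_interval)
    raise TimeoutError("polling timed out")

# Usage with the SDK objects from the examples above:
# poll_until(
#     lambda: extraction_agent.extractions.processing_status(file_id=file_id)
#                 .get("fileProcessingStatus") == "COMPLETED",
#     interval=5, timeout=600,
# )
```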

🔸 Error Handling

Handle errors gracefully for better debugging.

try:
    sdk.file_uploader.upload_file(file_path="path/to/file.pdf")
except Exception as e:
    print("Error uploading file:", str(e))
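For transient failures (network errors, rate limits), a retry wrapper pairs well with the try/except above. The decorator below is an illustrative pattern, not an SDK feature; wrap any SDK call you expect to fail intermittently.

```python
import functools
import time

def with_retries(max_attempts=3, base_delay=1.0):
    """Retry a function with exponential backoff between attempts.

    Re-raises the last exception once `max_attempts` is exhausted.
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except Exception:
                    if attempt == max_attempts:
                        raise
                    # Delay doubles each attempt: base, 2*base, 4*base, ...
                    time.sleep(base_delay * 2 ** (attempt - 1))
        return wrapper
    return decorator

# Usage (hypothetical wrapper around an SDK call):
# @with_retries(max_attempts=3)
# def upload(path):
#     return sdk.file_uploader.upload_file(file_path=path)
```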

🔸 Python 3.7 Compatibility

The SDK now supports Python 3.7 and above.


❓ FAQ

1️⃣ How do I get an API Key?

Sign up on the Splore console and navigate to the API section to generate a key.

2️⃣ Can I use this SDK asynchronously?

Asynchronous support will be added in a future release.
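Until native async support lands, the synchronous SDK can be used from async code by offloading calls to a worker thread. The sketch below assumes an agent object with the `extract()` method shown in the Quick Start; note that `asyncio.to_thread` requires Python 3.9+ (on 3.7/3.8, use `loop.run_in_executor` instead).

```python
import asyncio

async def extract_async(agent, file_path):
    """Run the synchronous extract() call in a worker thread so it
    does not block the event loop. A stopgap until the SDK ships
    native async support."""
    return await asyncio.to_thread(agent.extract, file_path=file_path)
```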

3️⃣ Which file formats are supported?

Currently, only PDF files are supported.
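Since only PDF is accepted today, a cheap client-side check avoids a failed upload. This helper is illustrative, not part of the SDK; extend `ALLOWED_EXTENSIONS` if the platform adds formats later.

```python
from pathlib import Path

ALLOWED_EXTENSIONS = {".pdf"}  # per the FAQ, only PDF is supported today

def check_supported(file_path):
    """Raise ValueError before uploading an unsupported file type."""
    suffix = Path(file_path).suffix.lower()
    if suffix not in ALLOWED_EXTENSIONS:
        raise ValueError(
            f"unsupported file type {suffix!r}; "
            f"expected one of {sorted(ALLOWED_EXTENSIONS)}"
        )
    return file_path

# Usage: validate before handing the path to the SDK, e.g.
# extraction_agent.extract(file_path=check_supported("report.pdf"))
```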

4️⃣ How do I handle search functionality?

The SDK provides a dedicated search capability that allows you to perform web searches and manage search history. Use the search.search() method to perform searches and search.get_history() to retrieve search history.

5️⃣ How do I check the SDK version?

from splore_sdk import __version__
print("Splore SDK Version:", __version__)

🔗 Support

For any questions or issues, please contact the Splore team.


📜 License

This SDK is licensed under the MIT License. See LICENSE for details.

Download files

Download the file for your platform.

Source Distribution

splore_sdk-0.1.20.tar.gz (25.6 kB)

Built Distribution

splore_sdk-0.1.20-py3-none-any.whl (31.7 kB)

File details

Details for the file splore_sdk-0.1.20.tar.gz.

File metadata

  • Filename: splore_sdk-0.1.20.tar.gz
  • Size: 25.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.2

File hashes

  • SHA256: 9eb50ec16532e46e87293201469f9007e6d864da3c03e583af4c0f1d309d3e1c
  • MD5: 91e6e9462a8398714bfaff4b6e42bd63
  • BLAKE2b-256: 4a259cff8b5ae12aff91aadb5974f9844acb286ffe28e3baba121d160678333f
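To verify a downloaded distribution against the SHA256 digest above, the standard-library `hashlib` module suffices; `sha256_of` below is an illustrative helper, not part of the SDK.

```python
import hashlib

def sha256_of(path, chunk_size=65536):
    """Compute the SHA256 hex digest of a file, reading in chunks so
    large archives are not loaded into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Usage against the digest published above:
# expected = "9eb50ec16532e46e87293201469f9007e6d864da3c03e583af4c0f1d309d3e1c"
# assert sha256_of("splore_sdk-0.1.20.tar.gz") == expected
```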

File details

Details for the file splore_sdk-0.1.20-py3-none-any.whl.

File metadata

  • Filename: splore_sdk-0.1.20-py3-none-any.whl
  • Size: 31.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.2

File hashes

  • SHA256: 8470232f6cb113a21388e67a1bdb930a164701c7a47fcf1015beb9220631b996
  • MD5: 7ca1013331e8e156f58ca2b93b9b1bed
  • BLAKE2b-256: 967e28b33153c3944e65353b9bac9c9b5f1a918ebc6939ac5c19c3357013e96b
