Tensorlake SDK for Document Ingestion API and Serverless Applications

Get high-quality data from documents fast, and deploy scalable serverless Data Processor APIs.

Tensorlake is the platform for agentic applications. Build and deploy high-throughput, durable agentic applications and workflows in minutes, using our best-in-class Document Ingestion API and compute platform for applications.

Animation showing the Tensorlake Document Ingestion UI parsing an ACORD doc into Markdown

Features

  • Document Ingestion - Parse documents (PDFs, DOCX, spreadsheets, presentations, images, and raw text) to Markdown or extract structured data with schemas. This is powered by Tensorlake's state-of-the-art layout detection and table recognition models. Review our benchmarks here.

  • Agentic Applications - Deploy Agentic Applications and AI Workflows using durable functions, with sandboxed and managed compute infrastructure that scales your agents with usage.


Document Ingestion Quickstart

Installation

Install the SDK and get an API Key.

pip install tensorlake

Sign up at cloud.tensorlake.ai and get your API key.
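
The snippets in this quickstart hard-code the key for brevity, which makes it easy to leak into version control. A minimal sketch of reading it from the environment instead, assuming the variable name TENSORLAKE_API_KEY used in the deployment section; the helper load_api_key is hypothetical and not part of the SDK:

```python
import os
from typing import Mapping, Optional


def load_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the Tensorlake API key from the environment, or raise if unset.

    Hypothetical helper for illustration; not part of the tensorlake SDK.
    """
    env = os.environ if env is None else env
    key = env.get("TENSORLAKE_API_KEY", "").strip()
    if not key:
        raise RuntimeError("Set TENSORLAKE_API_KEY before creating a client")
    return key
```

The returned value can then be passed wherever the examples use api_key="your-api-key".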

Parse Documents

from tensorlake.documentai import DocumentAI, ParseStatus

doc_ai = DocumentAI(api_key="your-api-key")

# Upload and parse document
file_id = doc_ai.upload("/path/to/document.pdf")

# Get parse ID
parse_id = doc_ai.parse(file_id)

# Wait for completion and get results
result = doc_ai.result(parse_id)

if result.status == ParseStatus.SUCCESSFUL:
    for chunk in result.chunks:
        print(chunk.content)  # Clean markdown output
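
Printing each chunk is fine for a first look, but downstream code usually wants a single Markdown string. A small sketch of that step, assuming only what the snippet above shows (a result with chunks, each carrying a content string). The helper chunks_to_markdown is hypothetical, and a SimpleNamespace stands in for a real parse result so the example runs without an API call:

```python
from types import SimpleNamespace
from typing import Any, Iterable


def chunks_to_markdown(chunks: Iterable[Any], separator: str = "\n\n") -> str:
    """Join the non-empty `content` of each chunk into one Markdown string."""
    parts = [c.content.strip() for c in chunks if c.content and c.content.strip()]
    return separator.join(parts)


# Stand-in for a parse result (assumption: chunks expose `.content`).
fake_result = SimpleNamespace(
    chunks=[
        SimpleNamespace(content="# Title\n"),
        SimpleNamespace(content=""),  # empty chunks are skipped
        SimpleNamespace(content="Body text."),
    ]
)

markdown = chunks_to_markdown(fake_result.chunks)
```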

Customize Parsing

Configure chunking, table output, figure summarization, and more. See all options.

from tensorlake.documentai import DocumentAI, ParsingOptions, EnrichmentOptions, ChunkingStrategy, TableOutputMode

doc_ai = DocumentAI(api_key="your-api-key")
file_id = doc_ai.upload("/path/to/document.pdf")

result = doc_ai.parse_and_wait(
    file_id,
    parsing_options=ParsingOptions(
        chunking_strategy=ChunkingStrategy.SECTION,
        table_output_mode=TableOutputMode.HTML,
        signature_detection=True
    ),
    enrichment_options=EnrichmentOptions(
        figure_summarization=True,
        table_summarization=True
    )
)

Structured Extraction

Extract specific data fields using Pydantic models or JSON schemas. See docs.

from tensorlake.documentai import DocumentAI, StructuredExtractionOptions
from pydantic import BaseModel, Field

class InvoiceData(BaseModel):
    invoice_number: str = Field(description="Invoice number")
    total_amount: float = Field(description="Total amount due")
    due_date: str = Field(description="Payment due date")
    vendor_name: str = Field(description="Vendor company name")

doc_ai = DocumentAI(api_key="your-api-key")

result = doc_ai.parse_and_wait(
    "https://example.com/invoice.pdf",  # Or use file_id from upload()
    structured_extraction_options=[StructuredExtractionOptions(
        schema_name="Invoice Data",
        json_schema=InvoiceData
    )]
)
print(result.structured_data)
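
The text above mentions JSON schemas as an alternative to Pydantic models. As a rough illustration, the dictionary below is a hand-written JSON Schema mirroring the InvoiceData model, followed by a tiny checker for extracted records. Whether json_schema accepts a plain dict in exactly this form is an assumption to verify against the docs. The shape of result.structured_data is not shown here, so the checker works on ordinary dictionaries, and schema_problems is a hypothetical helper:

```python
from typing import Any, Dict, List

# Hand-written JSON Schema mirroring the InvoiceData model (illustrative).
INVOICE_SCHEMA: Dict[str, Any] = {
    "title": "InvoiceData",
    "type": "object",
    "properties": {
        "invoice_number": {"type": "string", "description": "Invoice number"},
        "total_amount": {"type": "number", "description": "Total amount due"},
        "due_date": {"type": "string", "description": "Payment due date"},
        "vendor_name": {"type": "string", "description": "Vendor company name"},
    },
    "required": ["invoice_number", "total_amount", "due_date", "vendor_name"],
}

# Only the two JSON types used by the schema above are handled.
_JSON_TYPES = {"string": str, "number": (int, float)}


def schema_problems(record: Dict[str, Any], schema: Dict[str, Any]) -> List[str]:
    """List missing required fields and type mismatches (flat schemas only)."""
    problems = []
    for name in schema.get("required", []):
        if name not in record:
            problems.append(f"missing: {name}")
    for name, spec in schema["properties"].items():
        if name in record:
            value = record[name]
            expected = _JSON_TYPES[spec["type"]]
            # bool is a subclass of int in Python, so reject it explicitly.
            if isinstance(value, bool) or not isinstance(value, expected):
                problems.append(f"wrong type: {name}")
    return problems
```

A check like this is a cheap guard before extracted fields are written to a database; a full validator would be the better choice for nested schemas.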

Build Durable Agentic Applications in Python

Deploy agentic applications on a distributed runtime with automatic scaling and durable execution: after a crash, an application restarts automatically from where it stopped. You can build with any Python framework. Agents are exposed as HTTP APIs, like web applications.

  • No Queues: We manage state and orchestration
  • Zero Infra: Write Python, deploy to Tensorlake
  • Progress Updates: Applications can run for any amount of time and stream updates to users.
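
To make "restart from where they crashed" concrete, here is a toy sketch of the general checkpointing idea: each step's result is persisted, so a retry skips steps that already finished. It only illustrates the concept with a local JSON file; it is not how Tensorlake implements durability, and every name in it (CheckpointStore, run_step, workflow) is hypothetical:

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Callable, Dict


class CheckpointStore:
    """Persist step results to a JSON file so a rerun can skip finished steps."""

    def __init__(self, path: Path) -> None:
        self.path = path
        self.results: Dict[str, Any] = (
            json.loads(path.read_text()) if path.exists() else {}
        )

    def run_step(self, name: str, fn: Callable[[], Any]) -> Any:
        if name in self.results:  # finished in an earlier attempt
            return self.results[name]
        value = fn()  # if this raises, nothing is recorded for the step
        self.results[name] = value
        self.path.write_text(json.dumps(self.results))
        return value


calls = []  # records which steps actually executed


def fetch() -> int:
    calls.append("fetch")
    return 72


def convert(fahrenheit: float, fail: bool) -> float:
    calls.append("convert")
    if fail:
        raise RuntimeError("simulated crash")
    return round((fahrenheit - 32) * 5 / 9, 1)


def workflow(store: CheckpointStore, fail: bool) -> float:
    temp_f = store.run_step("fetch", fetch)
    return store.run_step("convert", lambda: convert(temp_f, fail))


checkpoint_file = Path(tempfile.mkdtemp()) / "checkpoints.json"

try:
    workflow(CheckpointStore(checkpoint_file), fail=True)  # first attempt crashes
except RuntimeError:
    pass

# The retry loads the checkpoint, skips "fetch", and reruns only "convert".
result = workflow(CheckpointStore(checkpoint_file), fail=False)
```

On the platform, this bookkeeping is what "We manage state and orchestration" refers to; the application code only declares the functions.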

Quickstart

Decorate your entrypoint with @application() and functions with @function() for checkpointing and sandboxed execution. Each function runs in its own isolated sandbox.

Example: City guide using OpenAI Agents with web search and code execution:

from agents import Agent, Runner
from agents.tool import WebSearchTool, function_tool
from tensorlake.applications import application, function, Image

# Define the image with necessary dependencies
FUNCTION_CONTAINER_IMAGE = Image(base_image="python:3.11-slim", name="city_guide_image").run(
    "pip install openai openai-agents"
)

@function_tool
@function(
    description="Gets the weather for a city using an OpenAI Agent with web search",
    secrets=["OPENAI_API_KEY"],
    image=FUNCTION_CONTAINER_IMAGE,
)
def get_weather_tool(city: str) -> str:
    """Uses an OpenAI Agent with WebSearchTool to find current weather."""
    agent = Agent(
        name="Weather Reporter",
        instructions="Use web search to find current weather in Fahrenheit for the city.",
        tools=[WebSearchTool()],  # Agent can search the web
    )
    result = Runner.run_sync(agent, f"City: {city}")
    return result.final_output.strip()

@application(tags={"type": "example", "use_case": "city_guide"})
@function(
    description="Creates a guide with temperature conversion using function_tool",
    secrets=["OPENAI_API_KEY"],
    image=FUNCTION_CONTAINER_IMAGE,
)
def city_guide_app(city: str) -> str:
    """Uses an OpenAI Agent with function_tool to run Python code for conversion."""
    
    @function_tool
    def convert_to_celsius_tool(python_code: str) -> float:
        """Converts Fahrenheit to Celsius - runs as Python code via Agent."""
        return float(eval(python_code))
    
    agent = Agent(
        name="Guide Creator",
        instructions="Using the appropriate tools, get the weather for the purposes of the guide. If the city uses Celsius, call convert_to_celsius_tool to convert the temperature, passing in the code needed to convert the temperature to Celsius. Create a friendly guide that references the temperature of the city in Celsius if the city typically uses Celsius, otherwise reference the temperature in Fahrenheit. Only reference Celsius or Fahrenheit, not both.",
        tools=[get_weather_tool, convert_to_celsius_tool],  # Agent can execute this Python function
    )
    result = Runner.run_sync(agent, f"City: {city}")
    return result.final_output.strip()

Note: This is a simplified version. See the complete example at examples/readme_example/city_guide.py for the full implementation including activity suggestions and agent orchestration.
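
In the example, convert_to_celsius_tool passes model-generated text to eval, which executes arbitrary Python. The sandbox limits the damage, but when only arithmetic is expected, a restricted evaluator is a common alternative. The sketch below is one way to do that with the standard ast module; eval_arithmetic is a hypothetical helper, not part of the example repository or the SDK:

```python
import ast
import operator

# Only these operators are allowed; anything else is rejected.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def eval_arithmetic(expression: str) -> float:
    """Evaluate +, -, *, / over numeric literals; reject everything else."""

    def walk(node: ast.AST) -> float:
        if isinstance(node, ast.Expression):
            return walk(node.body)
        # type(...) check (not isinstance) keeps booleans out.
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            return _BINARY_OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](walk(node.operand))
        raise ValueError(f"unsupported expression element: {type(node).__name__}")

    return float(walk(ast.parse(expression, mode="eval")))
```

Swapping it in would mean returning eval_arithmetic(python_code) from the tool, so an input such as "(72 - 32) * 5 / 9" still works while function calls and attribute access raise ValueError.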

Deploy to Tensorlake Cloud

  1. Set your API keys:
export TENSORLAKE_API_KEY="your-api-key"
tensorlake secrets set OPENAI_API_KEY "your-openai-key"
  2. Deploy:
tensorlake deploy examples/readme_example/city_guide.py

Call via HTTP

# Invoke the application
curl https://api.tensorlake.ai/applications/city_guide_app \
  -H "Authorization: Bearer $TENSORLAKE_API_KEY" \
  --json '"San Francisco"'
# Returns: {"request_id": "beae8736ece31ef9"}

# Get the result
curl https://api.tensorlake.ai/applications/city_guide_app/requests/{request_id}/output \
  -H "Authorization: Bearer $TENSORLAKE_API_KEY"

# Stream results with SSE
curl https://api.tensorlake.ai/applications/city_guide_app \
  -H "Authorization: Bearer $TENSORLAKE_API_KEY" \
  -H "Accept: text/event-stream" \
  --json '"San Francisco"'

# Send files
curl https://api.tensorlake.ai/applications/my_pdf_processor \
  -H "Authorization: Bearer $TENSORLAKE_API_KEY" \
  -H "Content-Type: application/pdf" \
  --data-binary @document.pdf
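
The same invocation can be expressed in Python. The sketch below uses only the standard library and mirrors the curl commands above: the URLs and headers come from those commands, while the helper names build_invoke_request and output_url are hypothetical. It builds the request without sending it, so it runs offline:

```python
import json
import urllib.request
from typing import Any

API_BASE = "https://api.tensorlake.ai/applications"


def build_invoke_request(
    app_name: str, payload: Any, api_key: str, stream: bool = False
) -> urllib.request.Request:
    """Build (but do not send) the POST that invokes an application."""
    headers = {
        "Authorization": f"Bearer {api_key}",
        # curl's --json flag sets both of these to application/json.
        "Content-Type": "application/json",
        "Accept": "text/event-stream" if stream else "application/json",
    }
    return urllib.request.Request(
        f"{API_BASE}/{app_name}",
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


def output_url(app_name: str, request_id: str) -> str:
    """URL from which the output of a request is fetched."""
    return f"{API_BASE}/{app_name}/requests/{request_id}/output"


request = build_invoke_request("city_guide_app", "San Francisco", "your-api-key")
```

Passing the request to urllib.request.urlopen would send it; based on the curl example, the response should carry a request_id that can be fed to output_url.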
