# inference.sh Python SDK

Official Python SDK for inference.sh, the AI agent runtime for serverless AI inference.

Run AI models, build AI agents, and deploy generative AI applications. Access 150+ models including FLUX, Stable Diffusion, LLMs (Claude, GPT, Gemini), video generation (Veo, Seedance), and more.
## Installation

```bash
pip install inferencesh
```
## Client usage

```python
from inferencesh import inference, TaskStatus

# Create client
client = inference(api_key="your-api-key")

# Simple synchronous usage - waits for completion by default
result = client.tasks.run({
    "app": "your-app",
    "input": {"key": "value"},
    "infra": "cloud",
    "variant": "default"
})
print(f"Task ID: {result.get('id')}")
print(f"Output: {result.get('output')}")
```
### With setup parameters

Setup parameters configure the app instance (e.g., model selection). Workers with matching setup are "warm" and skip the setup phase:

```python
result = client.tasks.run({
    "app": "your-app",
    "setup": {"model": "schnell"},  # Setup parameters
    "input": {"prompt": "hello"}
})
```
### Run options

```python
# Wait for completion (default behavior)
result = client.tasks.run(params)  # wait=True is default

# Return immediately without waiting
task = client.tasks.run(params, wait=False)
task_id = task["id"]  # Use this to check status later

# Stream updates as they happen
for update in client.tasks.run(params, stream=True):
    print(f"Status: {TaskStatus(update['status']).name}")
    if update.get("status") == TaskStatus.COMPLETED:
        print(f"Output: {update.get('output')}")
```
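With `wait=False` you typically poll until the task reaches a terminal status. A stdlib-only sketch of that loop, where the hypothetical `fake_get` stands in for `client.tasks.get(task_id)` and the numeric codes match the task status values documented below:

```python
import time

def fake_get(task_id, _state={"n": 0}):
    # Stand-in for client.tasks.get: reports RUNNING (7) twice,
    # then COMPLETED (9) on the third poll.
    _state["n"] += 1
    return {"id": task_id, "status": 9 if _state["n"] >= 3 else 7}

def poll_until_done(task_id, interval=0.01):
    # Poll until the task reaches a terminal status:
    # COMPLETED (9), FAILED (10), or CANCELLED (11).
    while True:
        task = fake_get(task_id)
        if task["status"] in (9, 10, 11):
            return task
        time.sleep(interval)

print(poll_until_done("t1")["status"])  # 9
```

In real code, swap `fake_get` for `client.tasks.get` (or just use `client.tasks.wait_for_completion`, which does this for you).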
## Task management

```python
# Get current task state
task = client.tasks.get(task_id)
print(f"Status: {TaskStatus(task['status']).name}")

# Cancel a running task
client.tasks.cancel(task_id)

# Wait for a task to complete
result = client.tasks.wait_for_completion(task_id)

# Stream updates for an existing task
with client.tasks.stream(task_id) as stream:
    for update in stream:
        print(f"Status: {TaskStatus(update['status']).name}")
        if update.get("status") == TaskStatus.COMPLETED:
            print(f"Result: {update.get('output')}")
            break

# Access final result after streaming
print(f"Final result: {stream.result}")
```
## Task status values

```python
from inferencesh import TaskStatus

TaskStatus.RECEIVED    # 1  - Task received by server
TaskStatus.QUEUED      # 2  - Task queued for processing
TaskStatus.SCHEDULED   # 3  - Task scheduled to a worker
TaskStatus.PREPARING   # 4  - Worker preparing environment
TaskStatus.SERVING     # 5  - Model being loaded
TaskStatus.SETTING_UP  # 6  - Task setup in progress
TaskStatus.RUNNING     # 7  - Task actively running
TaskStatus.UPLOADING   # 8  - Uploading results
TaskStatus.COMPLETED   # 9  - Task completed successfully
TaskStatus.FAILED      # 10 - Task failed
TaskStatus.CANCELLED   # 11 - Task was cancelled
```
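Only the last three statuses are terminal; everything before them means the task is still in flight. A minimal sketch of a terminal-status check (the enum here is a local mirror of the values above for illustration; real code should import `TaskStatus` from `inferencesh`):

```python
from enum import IntEnum

# Local mirror of the TaskStatus values above — illustration only.
class TaskStatus(IntEnum):
    RECEIVED = 1
    QUEUED = 2
    SCHEDULED = 3
    PREPARING = 4
    SERVING = 5
    SETTING_UP = 6
    RUNNING = 7
    UPLOADING = 8
    COMPLETED = 9
    FAILED = 10
    CANCELLED = 11

# Terminal states: once reached, the status will not change again.
TERMINAL = {TaskStatus.COMPLETED, TaskStatus.FAILED, TaskStatus.CANCELLED}

def is_terminal(status: int) -> bool:
    return TaskStatus(status) in TERMINAL

print(is_terminal(7))  # False - RUNNING
print(is_terminal(9))  # True - COMPLETED
```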
## Sessions (stateful execution)

Sessions allow you to maintain state across multiple task invocations. The worker stays warm between calls, preserving loaded models and in-memory state.

```python
# Start a new session
result = client.tasks.run({
    "app": "my-stateful-app",
    "input": {"prompt": "hello"},
    "session": "new"
})
session_id = result.get("session_id")
print(f"Session ID: {session_id}")

# Continue the session with another call
result2 = client.tasks.run({
    "app": "my-stateful-app",
    "input": {"prompt": "remember what I said?"},
    "session": session_id
})
```
### Custom session timeout

By default, sessions expire after 60 seconds of inactivity. You can customize this with `session_timeout` (1-3600 seconds):

```python
# Create a session with a 5-minute idle timeout
result = client.tasks.run({
    "app": "my-stateful-app",
    "input": {"prompt": "hello"},
    "session": "new",
    "session_timeout": 300  # 5 minutes
})
# Session stays alive for 5 minutes after each call
```

Notes:

- `session_timeout` is only valid when `session: "new"`
- Minimum timeout: 1 second
- Maximum timeout: 3600 seconds (1 hour)
- Each successful call resets the idle timer
For complete session documentation including error handling, best practices, and advanced patterns, see the Sessions Developer Guide.
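The idle-timer semantics above can be pictured with a toy model (this is not SDK code, just an illustration of "each successful call resets the idle timer"):

```python
# Toy model of the session idle timer: each successful call
# resets the expiry to now + timeout.
class SessionTimer:
    def __init__(self, timeout_s: int):
        if not (1 <= timeout_s <= 3600):
            raise ValueError("session_timeout must be 1-3600 seconds")
        self.timeout_s = timeout_s
        self.expires_at = None

    def on_call(self, now: float):
        # A successful call resets the idle timer.
        self.expires_at = now + self.timeout_s

    def is_expired(self, now: float) -> bool:
        return self.expires_at is not None and now >= self.expires_at

t = SessionTimer(300)
t.on_call(now=0)
print(t.is_expired(now=299))  # False - still within the 300s window
t.on_call(now=299)            # this call resets the timer
print(t.is_expired(now=598))  # False - would have expired at 300 without the reset
```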
## File upload

```python
from inferencesh import UploadFileOptions

# Upload from file path
file_obj = client.files.upload("/path/to/image.png")
print(f"URI: {file_obj['uri']}")

# Upload from bytes
file_obj = client.files.upload(
    b"raw bytes data",
    UploadFileOptions(
        filename="data.bin",
        content_type="application/octet-stream"
    )
)

# Upload with options
file_obj = client.files.upload(
    "/path/to/image.png",
    UploadFileOptions(
        filename="custom_name.png",
        content_type="image/png",
        public=True  # Make publicly accessible
    )
)
```

Note: Files in task input are automatically uploaded. You only need `files.upload()` for manual uploads.
## Agent chat

Chat with AI agents using `client.agents.create()`.

### Using a template agent

Use an existing agent from your workspace by its `namespace/name@shortid`:

```python
from inferencesh import inference

client = inference(api_key="your-api-key")

# Create agent from template
agent = client.agents.create("my-org/assistant@abc123")

# Send a message with streaming
def on_message(msg):
    content = msg.get("content", [])
    for c in content:
        if c.get("type") == "text" and c.get("text"):
            print(c["text"], end="", flush=True)

response = agent.send_message("Hello!", on_message=on_message)
print(f"\nChat ID: {agent.chat_id}")
```
### Creating an ad-hoc agent

Create agents on the fly without saving to your workspace:

```python
from inferencesh import inference, AdHocAgentOptions
from inferencesh import tool, string

client = inference(api_key="your-api-key")

# Define a client tool
weather_tool = (
    tool("get_weather")
    .description("Get current weather")
    .params({"city": string("City name")})
    .handler(lambda args: '{"temp": 72, "conditions": "sunny"}')
    .build()
)

# Create ad-hoc agent
agent = client.agents.create(AdHocAgentOptions(
    core_app="infsh/claude-sonnet-4@abc123",  # LLM to use
    system_prompt="You are a helpful assistant.",
    tools=[weather_tool]
))

def on_tool_call(call):
    print(f"[Tool: {call.name}]")

# Tools with handlers are auto-executed
response = agent.send_message(
    "What's the weather in Paris?",
    on_message=on_message,
    on_tool_call=on_tool_call
)
```
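Conceptually, "tools with handlers are auto-executed" means the SDK looks up the handler registered for the requested tool name and feeds its string result back to the model. A stdlib-only toy dispatcher illustrating that flow (not the SDK's actual internals):

```python
import json

# Registry of client-tool handlers, keyed by tool name —
# mirrors the weather_tool handler defined above.
tools = {
    "get_weather": lambda args: json.dumps({"temp": 72, "conditions": "sunny"}),
}

def handle_tool_call(name: str, args: dict) -> str:
    # Look up and run the handler; the returned string goes back to the model.
    handler = tools.get(name)
    if handler is None:
        return json.dumps({"error": f"unknown tool: {name}"})
    return handler(args)

print(handle_tool_call("get_weather", {"city": "Paris"}))
# {"temp": 72, "conditions": "sunny"}
```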
agent methods
| Method | Description |
|---|---|
send_message(text, ...) |
Send a message to the agent |
get_chat(chat_id=None) |
Get chat history |
stop_chat(chat_id=None) |
Stop current generation |
submit_tool_result(tool_id, result_or_action) |
Submit result for a client tool (string or {action, form_data}) |
stream_messages(chat_id=None, ...) |
Stream message updates |
stream_chat(chat_id=None, ...) |
Stream chat updates |
reset() |
Start a new conversation |
### Async agent

```python
from inferencesh import async_inference

client = async_inference(api_key="your-api-key")
agent = client.agents.create("my-org/assistant@abc123")
response = await agent.send_message("Hello!")
```
## Async client

```python
from inferencesh import async_inference, TaskStatus

async def main():
    client = async_inference(api_key="your-api-key")

    # Simple usage - wait for completion
    result = await client.tasks.run({
        "app": "your-app",
        "input": {"key": "value"},
        "infra": "cloud",
        "variant": "default"
    })
    print(f"Output: {result.get('output')}")

    # Return immediately without waiting
    task = await client.tasks.run(params, wait=False)

    # Stream updates
    async for update in await client.tasks.run(params, stream=True):
        print(f"Status: {TaskStatus(update['status']).name}")
        if update.get("status") == TaskStatus.COMPLETED:
            print(f"Output: {update.get('output')}")

    # Task management
    task = await client.tasks.get(task_id)
    await client.tasks.cancel(task_id)
    result = await client.tasks.wait_for_completion(task_id)

    # Stream existing task
    async with client.tasks.stream(task_id) as stream:
        async for update in stream:
            print(f"Update: {update}")
```
## File handling

The `File` class provides a standardized way to handle files in the inference.sh ecosystem:

```python
from infsh import File

# Basic file creation
file = File(path="/path/to/file.png")

# File with explicit metadata
file = File(
    path="/path/to/file.png",
    content_type="image/png",
    filename="custom_name.png",
    size=1024  # in bytes
)

# Create from path (automatically populates metadata)
file = File.from_path("/path/to/file.png")

# Check if file exists
exists = file.exists()

# Access file metadata
print(file.content_type)  # automatically detected if not specified
print(file.size)          # file size in bytes
print(file.filename)      # basename of the file

# Refresh metadata (useful if the file has changed)
file.refresh_metadata()
```

The `File` class automatically handles:

- MIME type detection
- file size calculation
- filename extraction from path
- file existence checking
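The kind of detection listed above can be done with the standard library alone. A sketch of what automatic metadata population might look like (illustrative only, not the `File` class's actual implementation):

```python
import mimetypes
import os

def detect_metadata(path: str) -> dict:
    # Guess the MIME type from the extension, as mimetypes does.
    content_type, _ = mimetypes.guess_type(path)
    exists = os.path.exists(path)
    return {
        "filename": os.path.basename(path),          # filename extraction
        "content_type": content_type or "application/octet-stream",
        "size": os.path.getsize(path) if exists else None,  # size calculation
        "exists": exists,                             # existence check
    }

meta = detect_metadata("/tmp/example.png")
print(meta["filename"])      # example.png
print(meta["content_type"])  # image/png
```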
## Creating an app

To create an inference app, inherit from `BaseApp` and define your input/output types:

```python
from infsh import BaseApp, BaseAppInput, BaseAppOutput, File

class AppInput(BaseAppInput):
    image: str  # URL or file path to image
    mask: str   # URL or file path to mask

class AppOutput(BaseAppOutput):
    image: File

class MyApp(BaseApp):
    async def setup(self):
        # Initialize your model here
        pass

    async def run(self, app_input: AppInput) -> AppOutput:
        # Process input and return output
        result_path = "/tmp/result.png"
        return AppOutput(image=File(path=result_path))

    async def unload(self):
        # Clean up resources
        pass
```
The app lifecycle has three main methods:

- `setup()`: called when the app starts; use it to initialize models
- `run()`: called for each inference request
- `unload()`: called when shutting down; use it to free resources
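The lifecycle order can be sketched with a toy driver (illustrative only; the real runtime is more involved and the app class here is hypothetical):

```python
import asyncio

class EchoApp:
    async def setup(self):
        self.ready = True           # e.g. load a model here

    async def run(self, app_input):
        return {"echo": app_input}  # handle one request

    async def unload(self):
        self.ready = False          # free resources

async def serve(app, requests):
    # setup once, run per request, unload once at shutdown.
    await app.setup()
    outputs = [await app.run(r) for r in requests]
    await app.unload()
    return outputs

print(asyncio.run(serve(EchoApp(), ["a", "b"])))  # [{'echo': 'a'}, {'echo': 'b'}]
```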
## Resources

- Documentation — getting started guides and API reference
- Blog — tutorials on AI agents, image generation, and more
- App store — browse 150+ AI models
- Discord — community support
- GitHub — open source projects
## License

MIT © inference.sh