Python SDK for Vortelio — run LLMs, images, audio, video & 3D locally. OpenAI & Ollama API compatible.

Vortelio Python SDK

Requires Python 3.8+.

Official Python client for Vortelio — run LLMs, generate images, audio, video, and 3D models locally.

The synchronous client has zero external dependencies; async support adds aiohttp. Fully OpenAI API and Ollama API compatible.

pip install vortelio

For async support:

pip install "vortelio[async]"   # adds aiohttp

Prerequisites

Start the Vortelio server first:

vortelio serve          # default port 11500

Or let the SDK auto-start it:

from vortelio import ensure_server
ensure_server()          # finds and starts vortelio if installed
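
Before calling the SDK, it can be useful to check that something is listening on the server port. The sketch below uses only the standard library and no Vortelio API; `is_port_open` is a hypothetical helper name, and 11500 is the default port mentioned above.

```python
import socket


def is_port_open(host: str = "127.0.0.1", port: int = 11500, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        # create_connection raises OSError (refused, timed out, unreachable) on failure
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

A successful connection only shows that the port is open, not that the process behind it is Vortelio; `ai.version()` is the more direct check once a client exists.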

Quick Start

from vortelio import Vortelio

ai = Vortelio()          # connects to http://localhost:11500

# Download a model
ai.pull("llm/mistral:7b")

# Chat — streams tokens to stdout, returns full reply
reply = ai.chat("llm/mistral:7b", "What is quantum computing?")

# Generator streaming
for token in ai.chat_stream("llm/mistral:7b", "Tell me a story"):
    print(token, end="", flush=True)
print()

Chat & Conversations

# Simple chat
reply = ai.chat("llm/mistral:7b", "Hello!")

# With messages list (Ollama/OpenAI format)
reply = ai.chat("llm/mistral:7b", [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user",   "content": "What is 2 + 2?"},
])

# Stateful multi-turn conversation
conv = ai.conversation("llm/mistral:7b", system="You are a pirate.")
conv.say("What is your name?")
reply = conv.say("Where do you sail?")

# Streaming from a conversation
for tok in conv.stream("Tell me about treasure"):
    print(tok, end="", flush=True)
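
Conceptually, a stateful conversation keeps a growing messages list and resends it on every turn. The sketch below illustrates that idea only; `History` and `fake_send` are hypothetical names, and this is not the SDK's own `conversation` implementation.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]


class History:
    """Hypothetical helper: accumulates messages in the role/content format."""

    def __init__(self, send: Callable[[List[Message]], str], system: str = "") -> None:
        self.send = send
        self.messages: List[Message] = []
        if system:
            self.messages.append({"role": "system", "content": system})

    def say(self, text: str) -> str:
        self.messages.append({"role": "user", "content": text})
        reply = self.send(list(self.messages))  # pass a copy of the full history
        self.messages.append({"role": "assistant", "content": reply})
        return reply


def fake_send(messages: List[Message]) -> str:
    """Stand-in backend that reports how many messages it received."""
    return f"seen {len(messages)} messages"
```

Since `ai.chat` accepts a messages list, as shown above, `send` could plausibly be `lambda msgs: ai.chat("llm/mistral:7b", msgs)` when a server is running.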

Generate (Ollama-style)

# Non-streaming
result = ai.generate("llm/mistral:7b", "The capital of France is")
print(result["response"])

# Streaming generator
for tok in ai.generate_stream("llm/mistral:7b", "Count to 10"):
    print(tok, end="", flush=True)

# With options
result = ai.generate(
    "llm/mistral:7b",
    "Explain photosynthesis",
    system="You are a biology teacher.",
    options={"temperature": 0.7, "num_ctx": 4096},
    think=True,   # chain-of-thought with <think> models
)
print(result.get("thinking", ""))
print(result["response"])
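
Some reasoning models emit their chain of thought inline, between `<think>` and `</think>` tags. If you ever handle raw text in that form yourself, a split might look like the sketch below. `split_thinking` is a hypothetical helper; the sketch makes no claim about how the server fills the `thinking` field when `think=True` is set.

```python
import re
from typing import Tuple

# Non-greedy match so several <think> blocks are captured separately
_THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_thinking(text: str) -> Tuple[str, str]:
    """Return (thinking, answer) from text that may contain <think> blocks."""
    thinking = "\n".join(part.strip() for part in _THINK_RE.findall(text))
    answer = _THINK_RE.sub("", text).strip()
    return thinking, answer
```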

Embeddings

# Batch embeddings
vecs = ai.embed("llm/nomic-embed-text:latest", ["Hello", "World"])
# → [[0.1, 0.2, ...], [0.3, 0.4, ...]]

# Legacy single-prompt
vec = ai.embeddings("llm/nomic-embed-text:latest", "Hello world")
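
Embedding vectors are usually compared with cosine similarity. The following is a minimal pure-Python sketch, kept dependency-free to match the SDK; `cosine_similarity` is an illustrative helper, not part of the SDK.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)
```

With the batch result above, `cosine_similarity(vecs[0], vecs[1])` would score how close "Hello" and "World" are in the model's embedding space.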

RAG (Retrieval-Augmented Generation)

# Ingest documents
ai.rag_ingest(
    "llm/nomic-embed-text:latest",
    [
        {"text": "Paris is the capital of France.", "meta": {"source": "facts"}},
        {"text": "Berlin is the capital of Germany.", "meta": {"source": "facts"}},
    ],
    collection="my-docs",
)

# Query
hits = ai.rag_query("llm/nomic-embed-text:latest", "capital of France", collection="my-docs")
for h in hits["results"]:
    print(f"[{h['score']:.3f}] {h['text']}")
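
Long documents are typically split into overlapping chunks before ingestion so that each embedded passage stays focused. The sketch below produces dicts in the `{"text": ..., "meta": ...}` shape used above; `chunk_text`, the default sizes, and the `offset` meta key are assumptions for illustration.

```python
from typing import Dict, List


def chunk_text(text: str, source: str, size: int = 500, overlap: int = 50) -> List[Dict]:
    """Split text into windows of `size` characters, each sharing `overlap` characters with the previous one."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap
    docs = []
    for start in range(0, len(text), step):
        docs.append({
            "text": text[start:start + size],
            "meta": {"source": source, "offset": start},  # "offset" is a hypothetical key
        })
        if start + size >= len(text):
            break  # this window already reached the end of the text
    return docs
```

The returned list could then be passed as the documents argument of `ai.rag_ingest`.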

Model Management

ai.models()                         # list all downloaded models
ai.pull("llm/llama3:8b")            # download from Hugging Face
ai.show("llm/mistral:7b")           # model details, template, capabilities
ai.delete("llm/old-model:latest")   # remove a model
ai.copy("llm/mistral:7b", "llm/my-mistral:latest")  # duplicate
ai.quantize("llm/mistral:7b", "q4_k_m")             # quantize
ai.create("llm/my-model:latest", from_model="llm/mistral:7b",
          system="You are a helpful assistant.")
ai.ps()                             # currently loaded models
ai.version()                        # server version
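
The model references in this README follow a `category/name:tag` pattern (for example `llm/mistral:7b` or `image/sdxl:latest`), while the OpenAI-style calls omit the category. A small parser can make that structure explicit. This is a sketch inferred from the examples here, not a documented grammar; `parse_model_ref` is a hypothetical helper, and defaulting a missing tag to `latest` is an assumption.

```python
from typing import NamedTuple, Optional


class ModelRef(NamedTuple):
    category: Optional[str]
    name: str
    tag: str


def parse_model_ref(ref: str) -> ModelRef:
    """Split 'category/name:tag' into its parts; category and tag are optional."""
    category, sep, rest = ref.partition("/")
    if not sep:
        # No slash: the whole string is name[:tag]
        category, rest = None, ref
    name, _, tag = rest.partition(":")
    if not name:
        raise ValueError(f"invalid model reference: {ref!r}")
    return ModelRef(category, name, tag or "latest")  # assumed default tag
```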

Media Generation

# Image
ai.image("image/sdxl:latest", "a red panda on the moon", "panda.png")

# Or get bytes directly
png_bytes = ai.generate_image("image/sdxl:latest", "sunset over mountains")

# Audio (TTS / music)
wav_bytes = ai.generate_audio("audio/kokoro:latest", "Hello, this is a test.")

# Video
mp4_bytes = ai.generate_video("video/wan2-1:latest", "a cat playing piano")

# 3D
obj_bytes = ai.generate_3d("3d/triposr:latest", "a wooden chair")
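
The byte-returning methods leave saving to the caller. Below is a minimal sketch for writing the result to disk and sanity-checking a PNG by its signature; `save_bytes` and `looks_like_png` are hypothetical helper names, and only the standard library is used.

```python
from pathlib import Path

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # the fixed first 8 bytes of every PNG file


def looks_like_png(data: bytes) -> bool:
    """Cheap check that the payload starts with the PNG signature."""
    return data.startswith(PNG_SIGNATURE)


def save_bytes(data: bytes, path: str) -> Path:
    """Write raw bytes to path, creating parent directories if needed."""
    target = Path(path)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(data)
    return target
```

For example, `save_bytes(png_bytes, "out/sunset.png")` would persist the image generated above.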

Advanced API

# A/B compare models
result = ai.compare(
    ["llm/mistral:7b", "llm/llama3:8b"],
    "Explain gravity in one sentence.",
)
for r in result["results"]:
    print(f"{r['model']}: {r['response']}")

# Structured JSON output
result = ai.structured(
    "llm/mistral:7b",
    "List 3 programming languages",
    schema={"type": "array", "items": {"type": "string"}},
)
print(result["parsed"])

# Long-text summarization (map-reduce)
# very_long_text is any long string you supply, e.g. the contents of a document
summary = ai.summarize("llm/mistral:7b", very_long_text, style="bullets")
print(summary["summary"])

# Chain-of-thought
result = ai.think("llm/qwq:32b", "Is 97 a prime number?")
print("Reasoning:", result["thinking"])
print("Answer:", result["answer"])

# Smart model router
best = ai.route("code", prompt="Write a sorting algorithm")
print("Best model:", best["model"])
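
For readers unfamiliar with the term, map-reduce summarization splits a long text into chunks, summarizes each chunk (map), then summarizes the combined partial summaries (reduce). The sketch below shows the general pattern with a pluggable `summarize` callable. It is not the server's implementation, and `split_into_chunks` and `map_reduce_summarize` are hypothetical names.

```python
from typing import Callable, List


def split_into_chunks(text: str, max_chars: int) -> List[str]:
    """Greedily pack whole paragraphs into chunks of up to max_chars.

    A single paragraph longer than max_chars is kept whole rather than cut.
    """
    chunks: List[str] = []
    current = ""
    for para in (p.strip() for p in text.split("\n\n")):
        if not para:
            continue
        candidate = f"{current}\n\n{para}" if current else para
        if len(candidate) <= max_chars or not current:
            current = candidate
        else:
            chunks.append(current)
            current = para
    if current:
        chunks.append(current)
    return chunks


def map_reduce_summarize(text: str, summarize: Callable[[str], str], max_chars: int = 2000) -> str:
    chunks = split_into_chunks(text, max_chars)
    if not chunks:
        return ""
    partials = [summarize(chunk) for chunk in chunks]  # map step
    if len(partials) == 1:
        return partials[0]
    return summarize("\n".join(partials))  # reduce step
```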

OpenAI-Compatible API

# Drop-in OpenAI replacement
response = ai.openai_chat(
    "mistral:7b",
    [{"role": "user", "content": "Hello!"}],
    temperature=0.7,
)
print(response["choices"][0]["message"]["content"])

# Streaming
for tok in ai.openai_chat_stream("mistral:7b", [{"role": "user", "content": "Hi"}]):
    print(tok, end="", flush=True)

# Embeddings (OpenAI format)
result = ai.openai_embeddings("nomic-embed-text:latest", "Hello world")
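
Because the API is described as OpenAI-compatible, the same chat call can in principle be made without the SDK. The sketch below only builds the HTTP request. It assumes the server exposes the conventional OpenAI path `/v1/chat/completions` on the default port, which this README does not state explicitly, and `build_chat_request` is a hypothetical helper.

```python
import json
import urllib.request
from typing import Any, Dict, List


def build_chat_request(
    model: str,
    messages: List[Dict[str, str]],
    base_url: str = "http://localhost:11500/v1",  # assumed base path
    **params: Any,
) -> urllib.request.Request:
    """Build (but do not send) an OpenAI-style chat completion request."""
    body = {"model": model, "messages": messages, **params}
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Sending it with `urllib.request.urlopen(req)` requires a running server; the response would be expected to have the same `choices[0].message.content` shape read above.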

Async Client

import asyncio
from vortelio import AsyncVortelio

async def main():
    ai = AsyncVortelio()

    # All methods are async
    reply = await ai.chat("llm/mistral:7b", "Hello!")

    # Async streaming
    async for tok in ai.chat_stream("llm/mistral:7b", "Tell me a joke"):
        print(tok, end="", flush=True)

    # Async conversation
    conv = ai.conversation("llm/mistral:7b", system="You are helpful.")
    reply = await conv.say("My name is Alice.")

asyncio.run(main())
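
One reason to use the async client is to issue several requests concurrently. The sketch below shows the general `asyncio.gather` pattern with a concurrency cap. `ask_all` is a hypothetical helper, and `fake_chat` stands in for a real call so the example runs without a server.

```python
import asyncio
from typing import Awaitable, Callable, List


async def ask_all(
    chat: Callable[[str], Awaitable[str]],
    prompts: List[str],
    limit: int = 4,
) -> List[str]:
    """Run chat(prompt) for every prompt, at most `limit` at a time; results keep input order."""
    semaphore = asyncio.Semaphore(limit)

    async def one(prompt: str) -> str:
        async with semaphore:
            return await chat(prompt)

    return await asyncio.gather(*(one(p) for p in prompts))


async def fake_chat(prompt: str) -> str:
    """Stand-in for a real async chat call."""
    await asyncio.sleep(0)
    return prompt.upper()
```

Inside `main()` above, `chat` could be `lambda p: ai.chat("llm/mistral:7b", p)`. Whether the server actually processes requests in parallel is not covered here, so a low `limit` is a cautious default.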

Agents

# List available agents (Open WebUI, OpenClaw, CrewAI, AnythingLLM, ...)
catalog = ai.agents_catalog()

# Install and start an agent
ai.agents_install("open-webui")
ai.agents_start("open-webui")

# Stop an agent
ai.agents_stop("open-webui")

Webhooks & Audit

# Register a webhook
ai.hooks_create("https://my-server.com/webhook", event="generate")

# List webhooks
ai.hooks_list()

# Audit log
entries = ai.audit(limit=50)
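
A registered webhook needs an HTTP endpoint to receive events. The standard-library sketch below accepts POST requests and stores any JSON body. The payload format Vortelio sends is not documented in this README, so the handler assumes only "a JSON body"; `WebhookHandler`, `start_receiver`, and `post_json` are hypothetical names.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import List

received: List[dict] = []  # payloads collected by the handler


class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = json.loads(self.rfile.read(length) or b"{}")
        except ValueError:  # covers invalid JSON and undecodable bytes
            self.send_response(400)
            self.end_headers()
            return
        received.append(payload)
        self.send_response(204)  # success, no response body
        self.end_headers()

    def log_message(self, format: str, *args) -> None:
        pass  # silence per-request logging


def start_receiver(port: int = 0) -> HTTPServer:
    """Start the receiver in a background thread; port 0 picks a free port."""
    server = HTTPServer(("127.0.0.1", port), WebhookHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def post_json(url: str, payload: dict) -> int:
    """Send a test event, bypassing any configured proxy; returns the HTTP status."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(req, timeout=5) as resp:
        return resp.status
```

The receiver binds to 127.0.0.1 for local testing; a production endpoint would also need authentication and TLS.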

GGUF Inspect & Ollama Import

# Inspect a local GGUF file
info = ai.gguf_inspect("/path/to/model.gguf")

# Import models from a local Ollama installation
ai.import_ollama()   # imports all
ai.import_ollama(["mistral:7b", "llama3:8b"])  # selective
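
For context on what a GGUF inspection reads first: a GGUF file begins with the magic bytes `GGUF`, a version number, a tensor count, and a metadata key/value count. The sketch below parses just that fixed header, assuming the little-endian layout of GGUF version 2 or later. It is independent of `ai.gguf_inspect`, whose return format is not shown here, and `read_gguf_header` is a hypothetical helper.

```python
import struct
from typing import Dict

GGUF_MAGIC = b"GGUF"
_HEADER = struct.Struct("<4sIQQ")  # magic, uint32 version, uint64 tensors, uint64 metadata KVs


def read_gguf_header(data: bytes) -> Dict[str, int]:
    """Parse the fixed 24-byte GGUF header (v2+ layout assumed)."""
    if len(data) < _HEADER.size:
        raise ValueError(f"need at least {_HEADER.size} bytes")
    magic, version, tensor_count, kv_count = _HEADER.unpack(data[:_HEADER.size])
    if magic != GGUF_MAGIC:
        raise ValueError("not a GGUF file")
    return {
        "version": version,
        "tensor_count": tensor_count,
        "metadata_kv_count": kv_count,
    }
```

Usage would be along the lines of opening the file in binary mode and passing `f.read(24)` to the function.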

Custom Port / Remote Server

ai = Vortelio(host="http://192.168.1.100", port=11500)  # remote server
ai = Vortelio(port=8080)               # local custom port
ai = Vortelio(timeout=600)             # longer timeout for large models

Server Version Compatibility

SDK version 0.3.49 requires Vortelio server 0.3.38 or later.


License

Apache 2.0 — see LICENSE.
