
Open-source, OpenAI-compatible API server with pluggable providers for any model and any infrastructure

Project description

Llama Stack


Quick Start | Documentation | OpenAI API Compatibility | Discord

Open-source agentic API server for building AI applications. OpenAI-compatible. Any model, any infrastructure.

[Diagram: Llama Stack architecture]

Llama Stack is a drop-in replacement for the OpenAI API that you can run anywhere — your laptop, your datacenter, or the cloud. Use any OpenAI-compatible client or agentic framework. Swap between Llama, GPT, Gemini, Mistral, or any model without changing your application code.

from openai import OpenAI

# Point any OpenAI client at the local Llama Stack server;
# no real API key is required, so a placeholder value works.
client = OpenAI(base_url="http://localhost:8321/v1", api_key="fake")
response = client.chat.completions.create(
    model="llama-3.3-70b",
    messages=[{"role": "user", "content": "Hello"}],
)

What you get

  • Chat Completions & Embeddings — standard /v1/chat/completions, /v1/completions, and /v1/embeddings endpoints, compatible with any OpenAI client
  • Responses API — server-side agentic orchestration with tool calling, MCP server integration, and built-in file search (RAG) in a single API call (learn more; a short example follows this list)
  • Vector Stores & Files — /v1/vector_stores and /v1/files for managed document storage and search
  • Batches — /v1/batches for offline batch processing
  • Open Responses conformant — the Responses API implementation passes the Open Responses conformance test suite
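
For instance, a single Responses API call can combine inference with file search, as noted above. A minimal sketch against a local server; "vs_123" is a placeholder for a vector store id created via /v1/vector_stores:

from openai import OpenAI

client = OpenAI(base_url="http://localhost:8321/v1", api_key="fake")

# One call: the server orchestrates the model together with built-in
# file search (RAG). "vs_123" is a placeholder vector store id.
response = client.responses.create(
    model="llama-3.3-70b",
    input="Summarize the uploaded design docs.",
    tools=[{"type": "file_search", "vector_store_ids": ["vs_123"]}],
)
print(response.output_text)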

Use any model, use any infrastructure

Llama Stack has a pluggable provider architecture. Develop locally with Ollama, deploy to production with vLLM, or connect to a managed service — the API stays the same.
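
Concretely, the client code never changes; only the models the server exposes do. A minimal sketch, assuming the server from the example above is running:

from openai import OpenAI

# Identical client code whether the server is backed by Ollama, vLLM,
# or a managed service; /v1/models reflects whatever providers expose.
client = OpenAI(base_url="http://localhost:8321/v1", api_key="fake")
for model in client.models.list():
    print(model.id)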

See the provider documentation for the full list.

Get started

Install and run a Llama Stack server:

# One-line install
curl -LsSf https://github.com/llamastack/llama-stack/raw/main/scripts/install.sh | bash

# Or install via uv
uv pip install llama-stack

# Start the server (uses the starter distribution with Ollama)
llama stack run

Then connect with any OpenAI client — Python, TypeScript, curl, or any framework that speaks the OpenAI API.
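
For example, once the server is up, the standard embeddings endpoint works the same way as chat completions. A sketch; the model id is a placeholder for whatever embedding model your distribution serves:

from openai import OpenAI

client = OpenAI(base_url="http://localhost:8321/v1", api_key="fake")

# /v1/embeddings follows the standard OpenAI shape. The model id below is
# a placeholder; use client.models.list() to see what your server offers.
resp = client.embeddings.create(
    model="nomic-embed-text",
    input="Llama Stack runs anywhere",
)
print(len(resp.data[0].embedding))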

See the Quick Start guide for detailed setup.

Resources

Client SDKs:

Language     SDK
Python       llama-stack-client-python
TypeScript   llama-stack-client-typescript
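
The native Python SDK targets the same server. A minimal sketch, assuming the llama-stack-client package and its LlamaStackClient entry point; the exact surface may differ by version:

from llama_stack_client import LlamaStackClient

# Connect to a local Llama Stack server with the native SDK.
client = LlamaStackClient(base_url="http://localhost:8321")

# List the models registered with the server.
for model in client.models.list():
    print(model.identifier)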

Community

We hold regular community calls every Thursday at 09:00 AM PST — see the Community Event on Discord for details.


Thanks to all our amazing contributors!


Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

ogx-0.7.1.tar.gz (15.9 MB)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

ogx-0.7.1-py3-none-any.whl (782.6 kB)

Uploaded Python 3

File details

Details for the file ogx-0.7.1.tar.gz.

File metadata

  • Download URL: ogx-0.7.1.tar.gz
  • Upload date:
  • Size: 15.9 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.9.18 (publish command, macOS)

File hashes

Hashes for ogx-0.7.1.tar.gz
Algorithm    Hash digest
SHA256       96ecfb720da97eeb026ca2e6d01c972deddd646a2fff71e0703ff7e5820383ca
MD5          ae77c996f5f3d56163edaf489dc68315
BLAKE2b-256  26537d853e66f26db479504a2d0da3da6dd7960cb9e2eecb1cb2aa25c9ee3c3a

See more details on using hashes here.
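
As an illustration, a downloaded file can be checked against the published SHA256 digest with Python's standard hashlib; the filename and digest below are the ones listed above:

import hashlib

# Compare a local download against the published SHA256 digest.
expected = "96ecfb720da97eeb026ca2e6d01c972deddd646a2fff71e0703ff7e5820383ca"

with open("ogx-0.7.1.tar.gz", "rb") as f:
    digest = hashlib.sha256(f.read()).hexdigest()

print("OK" if digest == expected else "MISMATCH: possible corruption or tampering")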

File details

Details for the file ogx-0.7.1-py3-none-any.whl.

File metadata

  • Download URL: ogx-0.7.1-py3-none-any.whl
  • Upload date:
  • Size: 782.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.9.18 (publish command, macOS)

File hashes

Hashes for ogx-0.7.1-py3-none-any.whl
Algorithm    Hash digest
SHA256       b7cf5bc8e6554cee432195aeaa9dc57c740ceca49c16012b8232e14d3c15eeed
MD5          3903b947461cd3dbde45aff7579369af
BLAKE2b-256  ffdb3428a789ad4cc3c2618a9552294a11b89851233535c9d80cf0cdb3564e9f

See more details on using hashes here.
