llama-index fastapi server

These details have not been verified by PyPI

Project description

LlamaIndex Server

LlamaIndexServer is a FastAPI-based application that allows you to quickly launch your LlamaIndex Workflows and Agent Workflows as an API server with an optional chat UI. It provides a complete environment for running LlamaIndex workflows with both API endpoints and a user interface for interaction.

Features

Serving a workflow as a chatbot
Built on FastAPI for high performance and easy API development
Optional built-in chat UI with extendable UI components
Prebuilt development code

Installation

pip install llama-index-server

Quick Start

# main.py
from llama_index.core.agent.workflow import AgentWorkflow
from llama_index.core.workflow import Workflow
from llama_index.core.tools import FunctionTool
from llama_index.server import LlamaIndexServer


# Define a factory function that returns a Workflow or AgentWorkflow
def create_workflow() -> Workflow:
    def fetch_weather(city: str) -> str:
        return f"The weather in {city} is sunny"

    return AgentWorkflow.from_tools(
        tools=[
            FunctionTool.from_defaults(
                fn=fetch_weather,
            )
        ]
    )


# Create an API server for the workflow
app = LlamaIndexServer(
    workflow_factory=create_workflow,  # Supports Workflow or AgentWorkflow
    env="dev",  # Enable development mode
    ui_config={ # Configure the chat UI, optional
        "starter_questions": ["What is the weather in LA?", "Will it rain in SF?"],
    },
    verbose=True
)

Running the Server

In the same directory as main.py, run the following command to start the server:
```
fastapi dev
```

Making a request to the server:

curl -X POST "http://localhost:8000/api/chat" -H "Content-Type: application/json" -d '{"message": "What is the weather in Tokyo?"}'

See the API documentation at http://localhost:8000/docs
Access the chat UI at http://localhost:8000/ (Make sure you set the env="dev" or include_ui=True in the server configuration)

Configuration Options

The LlamaIndexServer accepts the following configuration parameters:

workflow_factory: A callable that creates a workflow instance for each request. See Workflow factory contract for more details.
logger: Optional logger instance (defaults to uvicorn logger)
use_default_routers: Whether to include default routers (chat, static file serving)
env: Environment setting ('dev' enables CORS and UI by default)
ui_config: UI configuration as a dictionary or UIConfig object with options:
- enabled: Whether to enable the chat UI (default: True)
- starter_questions: List of starter questions for the chat UI (default: None)
- ui_path: Path for downloaded UI static files (default: ".ui")
- component_dir: The directory for custom UI components rendering events emitted by the workflow. The default is None, which does not render custom UI components.
- layout_dir: The directory for custom layout sections. The default value is layout. See Custom Layout for more details.
- llamacloud_index_selector: Whether to show the LlamaCloud index selector in the chat UI (default: False). Requires LLAMA_CLOUD_API_KEY to be set.
- dev_mode: When enabled, you can update workflow code in the UI and see the changes immediately. It's currently in beta and only supports updating workflow code at app/workflow.py. You might also need to set env="dev" and start the server with the reload feature enabled.
suggest_next_questions: Whether to suggest next questions after the assistant's response (default: True). You can change the prompt for the next questions by setting the NEXT_QUESTION_PROMPT environment variable. The default prompt used is defined in llama_index.server.prompts.SUGGEST_NEXT_QUESTION_PROMPT.
verbose: Enable verbose logging
api_prefix: API route prefix (default: "/api")
server_url: The deployment URL of the server (default is None)

Workflow factory contract

The workflow_factory provided will be called for each chat request to initialize a new workflow instance. Additionally, we provide the ChatRequest object, which includes the request information that is helpful for initializing the workflow. For example:

def create_workflow(chat_request: ChatRequest) -> Workflow:
    # using messages from the chat request to initialize the workflow
    return MyCustomWorkflow(chat_request.messages)

Your workflow will be executed once for each chat request with the following input parameters are included in workflow's StartEvent:

user_msg [str]: The current user message
chat_history [list[ChatMessage]]: All the previous messages of the conversation

Example:

@step
def handle_start_event(ev: StartEvent) -> MyNextEvent:
    user_msg = ev.user_msg
    chat_history = ev.chat_history
    ...

Your workflows can emit UIEvent events to render Custom UI Components in the chat UI to improve the user experience. Furthermore, you can send ArtifactEvent events to render code or document Artifacts in a dedicated Canvas panel in the chat UI.

Default Routers and Features

Chat Router

The server includes a default chat router at /api/chat for handling chat interactions.

Static File Serving

The server automatically mounts the data and output folders at {server_url}{api_prefix}/files/data (default: /api/files/data) and {server_url}{api_prefix}/files/output (default: /api/files/output) respectively.
Your workflows can use both folders to store and access files. As a convention, the data folder is used for documents that are ingested and the output folder is used for documents that are generated by the workflow.
The example workflows from create-llama (see below) are following this pattern.

Chat UI

When enabled, the server provides a chat interface at the root path (/) with:

Configurable starter questions
Real-time chat interface
API endpoint integration

Development Mode

In development mode (env="dev"), the server:

Enables CORS for all origins
Automatically includes the chat UI
Provides more verbose logging

Workflow Editor (Beta)

In development mode, you can set dev_mode to True in the UI configuration to enable the workflow editor, which allows you to edit the workflow code directly in the browser.

app = LlamaIndexServer(
    workflow_factory=create_workflow,
    env="dev",
    ui_config={"dev_mode": True},
)

Note: The workflow editor is currently in beta and only supports updating LlamaIndexServer projects created with create-llama. You also need to start the server via fastapi dev so that the server can hot reload the workflow code.

API Endpoints

The server provides the following default endpoints:

/api/chat: Chat interaction endpoint
/api/files/data/*: Access to data directory files
/api/files/output/*: Access to output directory files

Best Practices

Use environment variables for sensitive configuration
Enable verbose logging during development
Configure CORS appropriately for your deployment environment
Use starter questions to guide users in the chat UI

Getting Started with a New Project

Want to start a new project with LlamaIndexServer? Check out our create-llama tool to quickly generate a new project with LlamaIndexServer.

Project details

These details have not been verified by PyPI

Release history Release notifications | RSS feed

0.1.27

Jul 15, 2025

0.1.26

Jul 10, 2025

0.1.25

Jul 1, 2025

0.1.24

Jun 19, 2025

0.1.23

Jun 13, 2025

0.1.22

Jun 12, 2025

0.1.21

Jun 6, 2025

0.1.20

Jun 2, 2025

0.1.19

May 29, 2025

This version

0.1.18

May 29, 2025

0.1.17

May 26, 2025

0.1.16

May 16, 2025

0.1.15

Apr 28, 2025

0.1.14

Apr 18, 2025

0.1.13

Apr 16, 2025

0.1.12

Apr 15, 2025

0.1.11

Apr 15, 2025

0.1.10

Apr 10, 2025

0.1.9

Apr 9, 2025

0.1.8

Apr 3, 2025

0.1.7

Apr 2, 2025

0.1.6

Apr 1, 2025

0.1.5

Mar 26, 2025

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

llama_index_server-0.1.18.tar.gz (3.5 MB view details)

Uploaded May 29, 2025 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

llama_index_server-0.1.18-py3-none-any.whl (3.6 MB view details)

Uploaded May 29, 2025 Python 3

File details

Details for the file llama_index_server-0.1.18.tar.gz.

File metadata

Download URL: llama_index_server-0.1.18.tar.gz
Upload date: May 29, 2025
Size: 3.5 MB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.7.8

File hashes

Hashes for llama_index_server-0.1.18.tar.gz
Algorithm	Hash digest
SHA256	`5a41f9d6bc4eb74e9be11ca7bffa00f0dde3ac697bd0e3806fa7027feb5016ac`
MD5	`8490ad39e0ababcda99a035a102f9fbd`
BLAKE2b-256	`336abf858e2ac9885f4a22f26aa8676fa6e3b41d85c90c30728f11055cd6c451`

See more details on using hashes here.

File details

Details for the file llama_index_server-0.1.18-py3-none-any.whl.

File metadata

Download URL: llama_index_server-0.1.18-py3-none-any.whl
Upload date: May 29, 2025
Size: 3.6 MB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.7.8

File hashes

Hashes for llama_index_server-0.1.18-py3-none-any.whl
Algorithm	Hash digest
SHA256	`67458ae5e6ffbaa2b003dfcf3bc310fda64ec7ed5207f0a449c514a4c5538343`
MD5	`5df30400ad7da2d393064d177de70dd2`
BLAKE2b-256	`eecf05705a05bb75504ea2743fcb69203ba298ea88c2bcfd916129ba2c7115bf`

See more details on using hashes here.

llama-index-server 0.1.18

Navigation

Verified details

Maintainers

Unverified details

Meta

Project description

LlamaIndex Server

Features

Installation

Quick Start

Running the Server

Configuration Options

Workflow factory contract

Default Routers and Features

Chat Router

Static File Serving

Chat UI

Development Mode

Workflow Editor (Beta)

API Endpoints

Best Practices

Getting Started with a New Project

Project details

Verified details

Maintainers

Unverified details

Meta

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes