Project description

Overview

This repository is based on langchain by Harrison Chase.

aibaba_ai_api helps developers deploy aibaba-ai runnables and chains as a REST API.

This library is integrated with FastAPI and uses pydantic for data validation.

Features

  • Input and Output schemas automatically inferred from your aibaba_ai object, and enforced on every API call, with rich error messages
  • API docs page with JSONSchema and Swagger
  • Efficient /invoke, /batch and /stream endpoints with support for many concurrent requests on a single server
  • /stream_log endpoint for streaming all (or some) intermediate steps from your chain/agent
  • New as of 0.0.40: a /stream_events endpoint that makes it easier to stream without needing to parse the output of /stream_log
  • Playground page at /playground/ with streaming output and intermediate steps
  • All built with battle-tested open-source Python libraries like FastAPI, Pydantic, uvloop and asyncio.
  • Use the client SDK to call an aibaba_ai_api server as if it were a Runnable running locally (or call the HTTP API directly)

Limitations

  • Client callbacks are not yet supported for events that originate on the server
  • Versions of aibaba_ai_api <= 0.2.0 will not generate OpenAPI docs properly when using Pydantic V2, because FastAPI does not support mixing pydantic v1 and v2 namespaces. See the section below for more details. Either upgrade to aibaba_ai_api>=0.3.0 or downgrade Pydantic to pydantic 1.
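
The version rule in the second limitation can be expressed as a small check. This is a minimal sketch, assuming purely numeric dotted version strings; the helper names are hypothetical and not part of aibaba_ai_api.

```python
from typing import Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '0.2.0' into (0, 2, 0). Assumes numeric dotted versions only."""
    return tuple(int(part) for part in text.split("."))


def openapi_docs_affected(api_version: str, pydantic_version: str) -> bool:
    """True when aibaba_ai_api <= 0.2.0 is combined with pydantic v2 or later."""
    old_api = parse_version(api_version) <= (0, 2, 0)
    pydantic_v2 = parse_version(pydantic_version)[0] >= 2
    return old_api and pydantic_v2


print(openapi_docs_affected("0.2.0", "2.5.3"))
print(openapi_docs_affected("0.3.3", "2.5.3"))
```

In a real project the version strings could be read with importlib.metadata.version rather than hard-coded.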

Security

  • Versions 0.0.13 through 0.0.15 have a vulnerability: the playground endpoint allows access to arbitrary files on the server. Resolved in 0.0.16.

Installation

For both client and server:

pip install "aibaba_ai_api[all]"

or pip install "aibaba_ai_api[client]" for client code, and pip install "aibaba_ai_api[server]" for server code.

aibaba_ai CLI 🛠️

Use the aibaba_ai CLI to bootstrap an aibaba_ai_api project quickly.

To use the aibaba_ai CLI, make sure that you have a recent version of aibaba_ai_cli installed. You can install it with pip install -U aibaba_ai_cli.

Setup

Note: We use poetry for dependency management. See the poetry documentation to learn more about it.

1. Create a new app using the aibaba_ai CLI

aibaba-ai app new my-app

2. Define the runnable in add_routes. Go to server.py and edit

add_routes(app, NotImplemented)

3. Use poetry to add 3rd party packages (e.g., aibaba_ai_openai, aibaba-ai-anthropic, aibaba-ai-mistral etc).

poetry add [package-name]  # e.g. `poetry add aibaba_ai_openai`

4. Set up relevant env variables. For example,

export OPENAI_API_KEY="sk-..."

5. Serve your app

poetry run aibaba_ai serve --port=8100
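
A server that depends on the variables from step 4 may benefit from failing early when one is missing. The following is a minimal sketch using only the standard library; the helper name is hypothetical and OPENAI_API_KEY is used only because it is the variable shown above.

```python
import os
from typing import Iterable, List, Mapping


def missing_env_vars(
    required: Iterable[str], environ: Mapping[str, str] = os.environ
) -> List[str]:
    """Return the names from `required` that are unset or empty in `environ`."""
    return [name for name in required if not environ.get(name)]


missing = missing_env_vars(["OPENAI_API_KEY"])
if missing:
    print("Missing environment variables:", ", ".join(missing))
```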

Examples

Get your aibaba_ai_api instances started quickly with the examples directory.

Sample Application

Server

Here's a server that deploys an OpenAI chat model, an Anthropic chat model, and a chain that uses the Anthropic model to tell a joke about a topic.

#!/usr/bin/env python
from fastapi import FastAPI
from aibaba_ai.prompts import ChatPromptTemplate
from aibaba_ai.chat_models import ChatAnthropic, ChatOpenAI
from aibaba_ai_api import add_routes

app = FastAPI(
    title="aibaba-ai Server",
    version="1.0",
    description="A simple api server using aibaba-ai's Runnable interfaces",
)

add_routes(
    app,
    ChatOpenAI(model="gpt-3.5-turbo-0125"),
    path="/openai",
)

add_routes(
    app,
    ChatAnthropic(model="claude-3-haiku-20240307"),
    path="/anthropic",
)

model = ChatAnthropic(model="claude-3-haiku-20240307")
prompt = ChatPromptTemplate.from_template("tell me a joke about {topic}")
add_routes(
    app,
    prompt | model,
    path="/joke",
)

if __name__ == "__main__":
    import uvicorn

    uvicorn.run(app, host="localhost", port=8000)

If you intend to call your endpoint from the browser, you will also need to set CORS headers. You can use FastAPI's built-in middleware for that:

from fastapi.middleware.cors import CORSMiddleware

# Set all CORS enabled origins
app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
    expose_headers=["*"],
)

Docs

If you've deployed the server above, you can view the generated OpenAPI docs using:

⚠️ If using aibaba_ai_api <= 0.2.0 and pydantic v2, docs will not be generated for invoke, batch, stream, stream_log. See Pydantic section below for more details. To resolve please upgrade to aibaba_ai_api 0.3.0.

curl localhost:8000/docs

Make sure to add the /docs suffix.

⚠️ Index page / is not defined by design, so curl localhost:8000 or visiting the URL will return a 404. If you want content at / define an endpoint @app.get("/").

Client

Python SDK

from aibaba_ai.schema import SystemMessage, HumanMessage
from aibaba_ai.prompts import ChatPromptTemplate
from aibaba_ai.schema.runnable import RunnableMap
from aibaba_ai_api import RemoteRunnable

openai = RemoteRunnable("http://localhost:8000/openai/")
anthropic = RemoteRunnable("http://localhost:8000/anthropic/")
joke_chain = RemoteRunnable("http://localhost:8000/joke/")

joke_chain.invoke({"topic": "parrots"})

# or async
await joke_chain.ainvoke({"topic": "parrots"})

prompt = [
    SystemMessage(content='Act like either a cat or a parrot.'),
    HumanMessage(content='Hello!')
]

# Supports astream
async for msg in anthropic.astream(prompt):
    print(msg, end="", flush=True)

prompt = ChatPromptTemplate.from_messages(
    [("system", "Tell me a long story about {topic}")]
)

# Can define custom chains
chain = prompt | RunnableMap({
    "openai": openai,
    "anthropic": anthropic,
})

chain.batch([{"topic": "parrots"}, {"topic": "cats"}])

Python using requests:

import requests

response = requests.post(
    "http://localhost:8000/joke/invoke",
    json={'input': {'topic': 'cats'}}
)
response.json()

You can also use curl:

curl --location --request POST 'http://localhost:8000/joke/invoke' \
    --header 'Content-Type: application/json' \
    --data-raw '{
        "input": {
            "topic": "cats"
        }
    }'
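
If neither requests nor curl is available, the same call can be built with the standard library. This is a sketch, assuming the sample server above is running on localhost:8000; build_invoke_request is a hypothetical helper, and the network call is left commented out so the snippet runs without a server.

```python
import json
import urllib.request


def build_invoke_request(base_url: str, path: str, payload: dict) -> urllib.request.Request:
    """Build a POST request for <base_url>/<path>/invoke with the input wrapped under 'input'."""
    url = f"{base_url.rstrip('/')}/{path.strip('/')}/invoke"
    body = json.dumps({"input": payload}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_invoke_request("http://localhost:8000", "/joke", {"topic": "cats"})
print(request.get_method(), request.full_url)

# To send it (requires the sample server to be running):
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```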

Endpoints

The following code:

...
add_routes(
    app,
    runnable,
    path="/my_runnable",
)

adds these endpoints to the server:

  • POST /my_runnable/invoke - invoke the runnable on a single input
  • POST /my_runnable/batch - invoke the runnable on a batch of inputs
  • POST /my_runnable/stream - invoke on a single input and stream the output
  • POST /my_runnable/stream_log - invoke on a single input and stream the output, including output of intermediate steps as it's generated
  • POST /my_runnable/stream_events - invoke on a single input and stream events as they are generated, including from intermediate steps
  • GET /my_runnable/input_schema - json schema for input to the runnable
  • GET /my_runnable/output_schema - json schema for output of the runnable
  • GET /my_runnable/config_schema - json schema for config of the runnable

These endpoints match the aibaba-ai Expression Language interface -- please reference this documentation for more details.
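
For scripts that need to address several of these routes, it can help to derive the URLs from the path passed to add_routes. The sketch below is illustrative only: the helper is hypothetical, and the events route uses the /stream_events name given in the Features list.

```python
from typing import List, Tuple

# Endpoint names taken from the list above; /stream_events as named under Features.
POST_ENDPOINTS = ("invoke", "batch", "stream", "stream_log", "stream_events")
GET_ENDPOINTS = ("input_schema", "output_schema", "config_schema")


def endpoint_urls(base_url: str, path: str) -> List[Tuple[str, str]]:
    """Return (HTTP method, URL) pairs for a runnable mounted at `path`."""
    prefix = f"{base_url.rstrip('/')}/{path.strip('/')}"
    routes = [("POST", f"{prefix}/{name}") for name in POST_ENDPOINTS]
    routes += [("GET", f"{prefix}/{name}") for name in GET_ENDPOINTS]
    return routes


for method, url in endpoint_urls("http://localhost:8000", "/my_runnable"):
    print(method, url)
```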

Playground

You can find a playground page for your runnable at /my_runnable/playground/. This exposes a simple UI to configure and invoke your runnable with streaming output and intermediate steps.

Widgets

The playground supports widgets and can be used to test your runnable with different inputs. See the widgets section below for more details.

Sharing

In addition, for configurable runnables, the playground will allow you to configure the runnable and share a link with the configuration.

Chat playground

aibaba_ai_api also supports a chat-focused playground that you can opt into and use under /my_runnable/playground/. Unlike the general playground, only certain types of runnables are supported: the runnable's input schema must be a dict with either:

  • a single key, and that key's value must be a list of chat messages.
  • two keys, one whose value is a list of messages, and the other representing the most recent message.

We recommend you use the first format.

The runnable must also return either an AIMessage or a string.
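
To make the two accepted input shapes concrete, here is a rough sketch that classifies an example input. It assumes, purely for illustration, that chat messages are represented as dicts with a "content" key; the function names are hypothetical and this is not how the library validates schemas.

```python
from typing import Any, Optional


def _is_message_list(value: Any) -> bool:
    """Assumption: a chat message is a dict that has a 'content' key."""
    return isinstance(value, list) and all(
        isinstance(item, dict) and "content" in item for item in value
    )


def chat_input_format(payload: dict) -> Optional[str]:
    """Classify a payload as one of the two shapes described above, or None."""
    values = list(payload.values())
    if len(values) == 1 and _is_message_list(values[0]):
        return "messages_only"
    if len(values) == 2 and sum(_is_message_list(v) for v in values) == 1:
        return "history_plus_latest"
    return None


print(chat_input_format({"messages": [{"type": "human", "content": "Hello!"}]}))
print(chat_input_format({"history": [], "question": "Hello!"}))
```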

To enable it, you must set playground_type="chat" when adding your route. Here's an example:

# Declare a chain
prompt = ChatPromptTemplate.from_messages(
    [
        ("system", "You are a helpful, professional assistant named Cob."),
        MessagesPlaceholder(variable_name="messages"),
    ]
)

chain = prompt | ChatAnthropic(model="claude-2.1")


class InputChat(BaseModel):
    """Input for the chat endpoint."""

    messages: List[Union[HumanMessage, AIMessage, SystemMessage]] = Field(
        ...,
        description="The chat messages representing the current conversation.",
    )


add_routes(
    app,
    chain.with_types(input_type=InputChat),
    enable_feedback_endpoint=True,
    enable_public_trace_link_endpoint=True,
    playground_type="chat",
)

If you are using LangSmith, you can also set enable_feedback_endpoint=True on your route to enable thumbs-up/thumbs-down buttons after each message, and enable_public_trace_link_endpoint=True to add a button that creates public traces for runs. Note that you will also need to set the following environment variables:

export AIAGENTSFORCE_TRACING_V2="true"
export AIAGENTSFORCE_PROJECT="YOUR_PROJECT_NAME"
export AIAGENTSFORCE_API_KEY="YOUR_API_KEY"

The add_routes call in the example above has both options turned on.

Note: If you enable public trace links, the internals of your chain will be exposed. We recommend only using this setting for demos or testing.

Legacy Chains

aibaba_ai_api works with both Runnables (constructed via aibaba_ai Expression Language) and legacy chains (inheriting from Chain). However, some of the input schemas for legacy chains may be incomplete/incorrect, leading to errors. This can be fixed by updating the input_schema property of those chains in aibaba-ai. If you encounter any errors, please open an issue on THIS repo, and we will work to address it.

Deployment

Deploy to AWS

You can deploy to AWS using the AWS Copilot CLI

copilot init --app [application-name] --name [service-name] --type 'Load Balanced Web Service' --dockerfile './Dockerfile' --deploy

Click here to learn more.

Deploy to Azure

You can deploy to Azure using Azure Container Apps (Serverless):

az containerapp up --name [container-app-name] --source . --resource-group [resource-group-name] --environment  [environment-name] --ingress external --target-port 8001 --env-vars=OPENAI_API_KEY=your_key

You can find more info here

Deploy to GCP

You can deploy to GCP Cloud Run using the following command:

gcloud run deploy [your-service-name] --source . --port 8001 --allow-unauthenticated --region us-central1 --set-env-vars=OPENAI_API_KEY=your_key

Community Contributed

Deploy to Railway

Example Railway Repo

Deploy on Railway

Advanced

Handling Authentication

If you need to add authentication to your server, please read FastAPI's documentation about dependencies and security.

The examples below show how to wire up authentication logic to aibaba_ai_api endpoints using FastAPI primitives.

You are responsible for providing the actual authentication logic, the users table etc.

If you're not sure what you're doing, you could try using an existing solution such as Auth0.

Using add_routes

If you're using add_routes, see examples here.

Per User

If you need authorization or logic that is user dependent, specify per_req_config_modifier when using add_routes. Use a callable that receives the raw Request object and can extract relevant information from it for authentication and authorization purposes.
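
A modifier of this kind might look like the sketch below. The exact signature is an assumption here (a config dict and the request in, an updated config dict out), the x-user-id header is hypothetical, and a SimpleNamespace stands in for the real Request object so the snippet runs without FastAPI. In a real server you would likely raise an HTTP error instead of PermissionError.

```python
from types import SimpleNamespace
from typing import Any, Dict


def per_req_config_modifier(config: Dict[str, Any], request: Any) -> Dict[str, Any]:
    """Copy a user id from the request headers into the runnable config."""
    user_id = request.headers.get("x-user-id")  # hypothetical header name
    if not user_id:
        # Placeholder for a proper 401/403 response in a real application.
        raise PermissionError("missing x-user-id header")
    updated = dict(config)  # shallow copy so the caller's config is not mutated
    updated["configurable"] = {**config.get("configurable", {}), "user_id": user_id}
    return updated


fake_request = SimpleNamespace(headers={"x-user-id": "alice"})
print(per_req_config_modifier({}, fake_request))
```

The function would then be passed as add_routes(..., per_req_config_modifier=per_req_config_modifier), following the parameter name given above.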

Using APIHandler

If you feel comfortable with FastAPI and python, you can use aibaba_ai_api's APIHandler.

Files

LLM applications often deal with files. There are different architectures that can be made to implement file processing; at a high level:

  1. The file may be uploaded to the server via a dedicated endpoint and processed using a separate endpoint
  2. The file may be uploaded by either value (bytes of file) or reference (e.g., s3 url to file content)
  3. The processing endpoint may be blocking or non-blocking
  4. If significant processing is required, the processing may be offloaded to a dedicated process pool

You should determine the appropriate architecture for your application.

Currently, to upload files by value to a runnable, use base64 encoding for the file (multipart/form-data is not supported yet).

Here's an example that shows how to use base64 encoding to send a file to a remote runnable.

Remember, you can always upload files by reference (e.g., s3 url) or upload them as multipart/form-data to a dedicated endpoint.
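
As a rough illustration of upload by value, the client can base64-encode the file bytes and place the resulting string in the request body. In this sketch the field names file and num_chars are assumptions borrowed from the File Upload Widget snippet in this document, and build_file_payload is a hypothetical helper.

```python
import base64
import json


def build_file_payload(data: bytes, num_chars: int = 100) -> str:
    """Return a JSON body carrying the file as a base64 string under 'input'."""
    encoded = base64.b64encode(data).decode("utf-8")
    return json.dumps({"input": {"file": encoded, "num_chars": num_chars}})


body = build_file_payload(b"hello world", num_chars=5)
print(body)
```

The body can then be sent with a POST to the runnable's /invoke endpoint, in the same way as the JSON payloads shown in the Client section.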

Custom Input and Output Types

Input and Output types are defined on all runnables.

You can access them via the input_schema and output_schema properties.

aibaba_ai_api uses these types for validation and documentation.

If you want to override the default inferred types, you can use the with_types method.

Here's a toy example to illustrate the idea:

from typing import Any

from fastapi import FastAPI
from langchain.schema.runnable import RunnableLambda

from aibaba_ai_api import add_routes

app = FastAPI()


def func(x: Any) -> int:
    """Mistyped function that should accept an int but accepts anything."""
    return x + 1


runnable = RunnableLambda(func).with_types(
    input_type=int,
)

add_routes(app, runnable)

Custom User Types

Inherit from CustomUserType if you want the data to de-serialize into a pydantic model rather than the equivalent dict representation.

At the moment, this type only works server side and is used to specify desired decoding behavior. If inheriting from this type the server will keep the decoded type as a pydantic model instead of converting it into a dict.

from fastapi import FastAPI
from langchain.schema.runnable import RunnableLambda

from aibaba_ai_api import add_routes
from aibaba_ai_api.schema import CustomUserType

app = FastAPI()


class Foo(CustomUserType):
    bar: int


def func(foo: Foo) -> int:
    """Sample function that expects a Foo type which is a pydantic model"""
    assert isinstance(foo, Foo)
    return foo.bar


# Note that the input and output type are automatically inferred!
# You do not need to specify them.
# runnable = RunnableLambda(func).with_types( # <-- Not needed in this case
#     input_type=Foo,
#     output_type=int,
# )
add_routes(app, RunnableLambda(func), path="/foo")

Playground Widgets

The playground allows you to define custom widgets for your runnable from the backend.

Schema

  • A widget is specified at the field level and shipped as part of the JSON schema of the input type
  • A widget must contain a key called type with the value being one of a well known list of widgets
  • Other widget keys will be associated with values that describe paths in a JSON object

type JsonPath = number | string | (number | string)[];
type NameSpacedPath = { title: string; path: JsonPath }; // Using title to mimic JSON Schema, but can use namespace
type OneOfPath = { oneOf: JsonPath[] };

type Widget = {
  type: string; // Some well known type (e.g., base64file, chat etc.)
  [key: string]: JsonPath | NameSpacedPath | OneOfPath;
};
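
The type definitions above can be read as a set of structural checks. The following is a minimal Python sketch of those checks, written only to illustrate the shape of a widget; the function names are hypothetical and this is not the validation the playground itself performs.

```python
from typing import Any


def _is_path_part(value: Any) -> bool:
    """number | string (bool is excluded even though it subclasses int)."""
    return isinstance(value, (int, float, str)) and not isinstance(value, bool)


def is_json_path(value: Any) -> bool:
    """JsonPath = number | string | (number | string)[]"""
    if _is_path_part(value):
        return True
    return isinstance(value, list) and all(_is_path_part(part) for part in value)


def is_widget_value(value: Any) -> bool:
    """JsonPath | NameSpacedPath | OneOfPath"""
    if is_json_path(value):
        return True
    if isinstance(value, dict):
        if set(value) == {"title", "path"}:
            return isinstance(value["title"], str) and is_json_path(value["path"])
        if set(value) == {"oneOf"}:
            paths = value["oneOf"]
            return isinstance(paths, list) and all(is_json_path(p) for p in paths)
    return False


def is_valid_widget(widget: dict) -> bool:
    """A widget needs a string 'type'; every other key must hold a path-like value."""
    if not isinstance(widget.get("type"), str):
        return False
    return all(is_widget_value(v) for k, v in widget.items() if k != "type")


print(is_valid_widget({"type": "base64file"}))
print(is_valid_widget({"type": "chat", "input": "question", "output": "answer"}))
```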

Available Widgets

There are only two widgets that the user can specify manually right now:

  1. File Upload Widget
  2. Chat History Widget

See below for more information about these widgets.

All other widgets on the playground UI are created and managed automatically by the UI based on the config schema of the Runnable. When you create Configurable Runnables, the playground should create appropriate widgets for you to control the behavior.

File Upload Widget

Allows creation of a file upload input in the UI playground for files that are uploaded as base64 encoded strings. Here's the full example.

Snippet:

try:
    from pydantic.v1 import Field
except ImportError:
    from pydantic import Field

from aibaba_ai_api import CustomUserType


# ATTENTION: Inherit from CustomUserType instead of BaseModel otherwise
#            the server will decode it into a dict instead of a pydantic model.
class FileProcessingRequest(CustomUserType):
    """Request including a base64 encoded file."""

    # The extra field is used to specify a widget for the playground UI.
    file: str = Field(..., extra={"widget": {"type": "base64file"}})
    num_chars: int = 100

Chat Widget

Look at the widget example.

To define a chat widget, make sure that you pass "type": "chat".

  • "input" is JSONPath to the field in the Request that has the new input message.
  • "output" is JSONPath to the field in the Response that has new output message(s).
  • Don't specify these fields if the entire input or output should be used as is (e.g., if the output is a list of chat messages).

Here's a snippet:

class ChatHistory(CustomUserType):
    chat_history: List[Tuple[str, str]] = Field(
        ...,
        examples=[[("human input", "ai response")]],
        extra={"widget": {"type": "chat", "input": "question", "output": "answer"}},
    )
    question: str


def _format_to_messages(input: ChatHistory) -> List[BaseMessage]:
    """Format the input to a list of messages."""
    history = input.chat_history
    user_input = input.question

    messages = []

    for human, ai in history:
        messages.append(HumanMessage(content=human))
        messages.append(AIMessage(content=ai))
    messages.append(HumanMessage(content=user_input))
    return messages


model = ChatOpenAI()
chat_model = RunnableParallel({"answer": (RunnableLambda(_format_to_messages) | model)})
add_routes(
    app,
    chat_model.with_types(input_type=ChatHistory),
    config_keys=["configurable"],
    path="/chat",
)

You can also specify a list of messages as a parameter directly, as shown in this snippet:

prompt = ChatPromptTemplate.from_messages(
    [
        ("system", "You are a helpful assistant named Cob."),
        MessagesPlaceholder(variable_name="messages"),
    ]
)

chain = prompt | ChatAnthropic(model="claude-2.1")


class MessageListInput(BaseModel):
    """Input for the chat endpoint."""
    messages: List[Union[HumanMessage, AIMessage]] = Field(
        ...,
        description="The chat messages representing the current conversation.",
        extra={"widget": {"type": "chat", "input": "messages"}},
    )


add_routes(
    app,
    chain.with_types(input_type=MessageListInput),
    path="/chat",
)

See this sample file for an example.

Enabling / Disabling Endpoints (aibaba_ai_api >=0.0.33)

You can enable / disable which endpoints are exposed when adding routes for a given chain.

Use enabled_endpoints if you want to make sure that you never get a new endpoint when upgrading aibaba_ai_api to a newer version.

Enable: The code below will only enable invoke, batch and the corresponding config_hash endpoint variants.

add_routes(app, chain, enabled_endpoints=["invoke", "batch", "config_hashes"], path="/mychain")

Disable: The code below will disable the playground for the chain

add_routes(app, chain, disabled_endpoints=["playground"], path="/mychain")
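
The effect of the two options can be pictured as a simple set computation. This is an illustrative sketch, not the library's implementation: the list of known endpoint names and the rule that only one of the two options may be given are assumptions, and resolve_endpoints is a hypothetical helper.

```python
from typing import Iterable, Optional, Set

# Assumed set of endpoint names, collected from the names used in this document.
KNOWN_ENDPOINTS = frozenset({
    "invoke", "batch", "stream", "stream_log", "stream_events",
    "playground", "input_schema", "output_schema", "config_schema",
    "config_hashes",
})


def resolve_endpoints(
    enabled: Optional[Iterable[str]] = None,
    disabled: Optional[Iterable[str]] = None,
) -> Set[str]:
    """Return the endpoint names that would be exposed under these assumptions."""
    if enabled is not None and disabled is not None:
        raise ValueError("pass either enabled or disabled, not both")
    requested = set(enabled if enabled is not None else (disabled or ()))
    unknown = requested - KNOWN_ENDPOINTS
    if unknown:
        raise ValueError(f"unknown endpoints: {sorted(unknown)}")
    if enabled is not None:
        return requested  # allow-list: nothing new appears after an upgrade
    return set(KNOWN_ENDPOINTS) - requested  # deny-list: everything else stays on


print(sorted(resolve_endpoints(enabled=["invoke", "batch", "config_hashes"])))
print("playground" in resolve_endpoints(disabled=["playground"]))
```

The allow-list form is the one that stays stable across upgrades, which is why enabled_endpoints is suggested above for that purpose.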
