
Friendli Suite Client

Supercharge Generative AI Serving with Friendli 🚀

Welcome to Friendli Suite, the ultimate solution for serving generative AI models. We offer three distinct options to cater to your specific needs, each designed to provide superior performance, cost-effectiveness, and ease of use.

Friendli Suite

1. Friendli Serverless Endpoints

Imagine a playground for your AI dreams. Friendli Serverless Endpoints is just that: a simple, click-and-play interface that lets you access popular open-source models like Llama-2 and Stable Diffusion without any heavy lifting. Choose your model, enter your prompt or upload an image, and get generated text, code, or image outputs. With pay-per-token billing, this is ideal for exploration and experimentation. You can think of it as an AI sampler.

2. Friendli Dedicated Endpoints

Ready to take the reins and unleash the full potential of your own models? Friendli Dedicated Endpoint is for you. This service provides dedicated GPU resources in the cloud platform of your choice (AWS, GCP, Azure), letting you upload and run your custom generative AI models. Reserve the exact GPU you need (A10, A100 40G, A100 80G, etc.) and enjoy fine-grained control over your model settings. Pay-per-second billing makes it perfect for regular or resource-intensive workloads.

3. Friendli Container

Do you prefer the comfort and security of your own data center? Friendli Container is the solution. We provide the Friendli Engine within Docker containers that can be installed on your on-premise GPUs so your data stays within your own secure cluster. This option offers maximum control and security, ideal for advanced users or those with specific data privacy requirements.

[!NOTE]

The Friendli Engine: The Powerhouse Behind the Suite

At the heart of each Friendli Suite offering lies the Friendli Engine, a patented, GPU-optimized serving engine. This technological marvel is what enables Friendli Suite's superior performance and cost-effectiveness, featuring innovations like continuous batching (iteration batching) that significantly improve resource utilization compared to traditional LLM serving solutions.
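To give a feel for why iteration-level batching can improve utilization, here is a toy simulation. It is only an illustrative sketch and not the Friendli Engine's actual scheduler: the function names, the fixed number of batch slots, and the assumption that every request needs a known, positive number of decoding iterations are all hypothetical.

```python
from collections import deque
from typing import List


def static_batching_steps(lengths: List[int], batch_size: int) -> int:
    """Requests are grouped in arrival order; each group runs until its longest request ends."""
    steps = 0
    for start in range(0, len(lengths), batch_size):
        steps += max(lengths[start:start + batch_size])
    return steps


def continuous_batching_steps(lengths: List[int], batch_size: int) -> int:
    """A finished request frees its slot, and a waiting request joins at the next iteration."""
    waiting = deque(lengths)
    active: List[int] = []  # remaining iterations for each in-flight request
    steps = 0
    while waiting or active:
        # Fill free slots before running the next iteration.
        while waiting and len(active) < batch_size:
            active.append(waiting.popleft())
        steps += 1
        # Each active request produces one token; drop the ones that just finished.
        active = [remaining - 1 for remaining in active if remaining > 1]
    return steps


lengths = [2, 8, 3, 7]  # hypothetical output lengths, in iterations
print(static_batching_steps(lengths, batch_size=2))      # 15
print(continuous_batching_steps(lengths, batch_size=2))  # 12
```

In this toy setting the short requests no longer wait for the longest request in their group, so the same work finishes in fewer iterations.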

🕹️ Friendli Client

Installation

pip install friendli-client

[!NOTE] If you have a Hugging Face checkpoint and want to convert it to a Friendli-compatible format and apply quantization, you need to install the package with the machine learning library (mllib) dependencies. In this case, install the package with the following command:

pip install "friendli-client[mllib]"

Python SDK Examples

[!IMPORTANT] You must set the FRIENDLI_TOKEN environment variable before initializing the client instance with client = Friendli(). Alternatively, you can pass your personal access token as the token argument when creating the client, like so:

from friendli import Friendli

client = Friendli(token="YOUR PERSONAL ACCESS TOKEN")
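If you want a clear error when neither source of the token is available, a small helper can check both before the client is created. This is a minimal sketch: `resolve_token` is a hypothetical helper and not part of the friendli package, and it relies only on the `FRIENDLI_TOKEN` variable and the `token` argument described above.

```python
import os
from typing import Mapping, Optional


def resolve_token(explicit: Optional[str] = None,
                  env: Mapping[str, str] = os.environ) -> str:
    """Return the explicit token if given, else FRIENDLI_TOKEN from the environment."""
    token = explicit or env.get("FRIENDLI_TOKEN")
    if not token:
        raise RuntimeError(
            "No token found: set FRIENDLI_TOKEN or pass token=... to Friendli()."
        )
    return token


# Intended usage (requires the friendli package):
#   from friendli import Friendli
#   client = Friendli(token=resolve_token())
```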

Default

from friendli import Friendli

client = Friendli()

chat_completion = client.chat.completions.create(
    model="llama-2-13b-chat",
    messages=[
        {
            "role": "user",
            "content": "Tell me how to make a delicious pancake"
        }
    ],
    stream=False,
)
print(chat_completion.choices[0].message.content)

Streaming

from friendli import Friendli

client = Friendli()

stream = client.chat.completions.create(
    model="llama-2-13b-chat",
    messages=[
        {
            "role": "user",
            "content": "Tell me how to make a delicious pancake"
        }
    ],
    stream=True,
)
for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="")
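If you need the full response text as well as the live output, the loop above can be wrapped in a small helper that accumulates the pieces. The sketch below assumes only the chunk shape used in the streaming example (`chunk.choices[0].delta.content`, which may be empty). `collect_stream` and `fake_chunk` are hypothetical names, and the fake chunks stand in for a real stream so the snippet runs without network access.

```python
from types import SimpleNamespace
from typing import Iterable


def collect_stream(stream: Iterable) -> str:
    """Print each streamed piece as it arrives and return the concatenated text."""
    parts = []
    for chunk in stream:
        piece = chunk.choices[0].delta.content or ""
        print(piece, end="", flush=True)
        parts.append(piece)
    print()  # finish the line once the stream ends
    return "".join(parts)


def fake_chunk(text):
    """Build a stand-in object with the same attribute path as a streamed chunk."""
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


full_text = collect_stream([fake_chunk("Mix "), fake_chunk(None), fake_chunk("flour.")])
```

With a real client, you would pass the `stream` object returned by `client.chat.completions.create(..., stream=True)` instead of the fake list.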

Async

import asyncio
from friendli import AsyncFriendli

client = AsyncFriendli()


async def main() -> None:
    chat_completion = await client.chat.completions.create(
        model="llama-2-13b-chat",
        messages=[
            {
                "role": "user",
                "content": "Tell me how to make a delicious pancake"
            }
        ],
        stream=False,
    )
    print(chat_completion.choices[0].message.content)


asyncio.run(main())
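The async client is most useful when several requests are in flight at once. The sketch below shows the general pattern with `asyncio.gather`. `ask_all` and `fake_ask` are hypothetical names, and `fake_ask` is a stand-in coroutine so the snippet runs on its own. In practice you would replace it with a coroutine that awaits `client.chat.completions.create(...)` and returns `choices[0].message.content`.

```python
import asyncio
from typing import Awaitable, Callable, List


async def ask_all(ask: Callable[[str], Awaitable[str]], prompts: List[str]) -> List[str]:
    """Run one request per prompt concurrently; results keep the order of the prompts."""
    return await asyncio.gather(*(ask(prompt) for prompt in prompts))


async def fake_ask(prompt: str) -> str:
    """Stand-in for a real API call; sleeps briefly to simulate network latency."""
    await asyncio.sleep(0.01)
    return f"answer to: {prompt}"


answers = asyncio.run(ask_all(fake_ask, ["pancakes", "waffles"]))
print(answers)
```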

Streaming (Async)

import asyncio
from friendli import AsyncFriendli

client = AsyncFriendli()


async def main() -> None:
    stream = await client.chat.completions.create(
        model="llama-2-13b-chat",
        messages=[
            {
                "role": "user",
                "content": "Tell me how to make a delicious pancake"
            }
        ],
        stream=True,
    )
    async for chunk in stream:
        print(chunk.choices[0].delta.content or "", end="")


asyncio.run(main())

CLI Examples

You can also call the generation APIs directly from the CLI.

friendli api chat-completions create \
  -g "user Tell me how to make a delicious pancake" \
  -m llama-2-13b-chat

For further information about the friendli command, run friendli --help in your terminal shell. This will provide you with a detailed list of available options and usage instructions.

[!TIP] Check out our official documentation to learn more!
