A very simple LLM manager for Python.

L2M2: A Simple Python LLM Manager 💬👍

L2M2 ("LLM Manager" → "LLMM" → "L2M2") is a very simple LLM manager for Python that exposes lots of models through a unified API. This is useful for evaluation, demos, and other apps that need to easily be model-agnostic.

Features

13 supported models (see below), with more on the way
Asynchronous and concurrent calls
User-provided models from supported providers

Supported Models

L2M2 currently supports the following models:

Provider	Model Name	Model Version
`openai`	`gpt-4-turbo`	`gpt-4-turbo-2024-04-09`
`openai`	`gpt-4-turbo-0125`	`gpt-4-0125-preview`
`google`	`gemini-1.5-pro`	`gemini-1.5-pro-latest`
`google`	`gemini-1.0-pro`	`gemini-1.0-pro-latest`
`anthropic`	`claude-3-opus`	`claude-3-opus-20240229`
`anthropic`	`claude-3-sonnet`	`claude-3-sonnet-20240229`
`anthropic`	`claude-3-haiku`	`claude-3-haiku-20240307`
`cohere`	`command-r`	`command-r`
`cohere`	`command-r-plus`	`command-r-plus`
`groq`	`mixtral-8x7b`	`mixtral-8x7b-32768`
`groq`	`gemma-7b`	`gemma-7b-it`
`groq`	`llama3-8b`	`llama3-8b-8192`
`groq`	`llama3-70b`	`llama3-70b-8192`

You can also call any language model from the above providers that L2M2 doesn't officially support, without guarantees of well-defined behavior.
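
To make the relationship between the table's columns concrete, here is a minimal sketch of a lookup from model name to provider and model version. The names `SUPPORTED_MODELS`, `resolve_model`, and `models_for` are hypothetical and only mirror the table above; this is not a description of how L2M2 stores its models internally.

```python
# Hypothetical lookup table copied from the rows of the table above.
# This is an illustration, not L2M2's internal data structure.
SUPPORTED_MODELS = {
    "gpt-4-turbo": ("openai", "gpt-4-turbo-2024-04-09"),
    "gpt-4-turbo-0125": ("openai", "gpt-4-0125-preview"),
    "gemini-1.5-pro": ("google", "gemini-1.5-pro-latest"),
    "gemini-1.0-pro": ("google", "gemini-1.0-pro-latest"),
    "claude-3-opus": ("anthropic", "claude-3-opus-20240229"),
    "claude-3-sonnet": ("anthropic", "claude-3-sonnet-20240229"),
    "claude-3-haiku": ("anthropic", "claude-3-haiku-20240307"),
    "command-r": ("cohere", "command-r"),
    "command-r-plus": ("cohere", "command-r-plus"),
    "mixtral-8x7b": ("groq", "mixtral-8x7b-32768"),
    "gemma-7b": ("groq", "gemma-7b-it"),
    "llama3-8b": ("groq", "llama3-8b-8192"),
    "llama3-70b": ("groq", "llama3-70b-8192"),
}


def resolve_model(model_name):
    """Return (provider, model_version) for a model name from the table."""
    try:
        return SUPPORTED_MODELS[model_name]
    except KeyError:
        raise ValueError(f"Unsupported model: {model_name!r}") from None


def models_for(provider):
    """List the model names in the table that belong to one provider."""
    return [name for name, (prov, _) in SUPPORTED_MODELS.items() if prov == provider]
```

A lookup like this shows which provider key a given model name requires, for example `resolve_model("gemma-7b")` points at the `groq` provider.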

Planned Features

Support for Hugging Face & open-source LLMs
Multi-provider / provider-agnostic model setup
Chat-specific features (e.g. context, history)
TypeScript clone
...etc

Requirements

Python >= 3.9

Installation

pip install l2m2

Usage

Import the LLM Client

from l2m2.client import LLMClient

Add Providers

To activate any of the available models, add that model's provider and pass in your API key for that provider's API. The provider name must be one of the valid providers shown in the table above.

client = LLMClient()
client.add_provider("<provider-name>", "<api-key>")

# Alternatively, you can pass in providers via the constructor
client = LLMClient({
    "<provider-a>": "<api-key-a>",
    "<provider-b>": "<api-key-b>",
    ...
})
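
Since `os.getenv` returns `None` for an unset variable, it can help to build the provider dictionary only from keys that are actually present. The sketch below is one way to do that, under the assumption that each key lives in an environment variable named as in `ENV_VARS`; both `ENV_VARS` and `collect_api_keys` are hypothetical helpers, not part of L2M2.

```python
import os

# Hypothetical mapping from provider name to the environment variable that
# holds its key. The variable names are assumptions, not an L2M2 convention.
ENV_VARS = {
    "openai": "OPENAI_API_KEY",
    "google": "GOOGLE_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "cohere": "COHERE_API_KEY",
    "groq": "GROQ_API_KEY",
}


def collect_api_keys(environ=os.environ):
    """Return {provider: api_key} for each provider whose variable is set and non-empty."""
    return {
        provider: environ[var]
        for provider, var in ENV_VARS.items()
        if environ.get(var)
    }
```

The resulting dictionary has the same shape as the one accepted by the constructor form shown above, so it could be passed as `LLMClient(collect_api_keys())`.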

Call your LLM 💬👍

The call API is the same regardless of model or provider.

response = client.call(
    model="<model name>",
    prompt="<prompt>",
    system_prompt="<system prompt>",
    temperature=<temperature>,
    max_tokens=<max_tokens>
)

model and prompt are required, while the remaining fields are optional. When possible, L2M2 uses the provider's default model parameter values when they are not given.
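
One plausible way for a wrapper to fall back to a provider's defaults is simply to leave unset parameters out of the request. The sketch below illustrates that idea only; `build_params` is a hypothetical helper, and whether L2M2 does it this way is an assumption.

```python
def build_params(prompt, system_prompt=None, temperature=None, max_tokens=None):
    """Collect call parameters, dropping any that the caller did not set."""
    candidates = {
        "prompt": prompt,
        "system_prompt": system_prompt,
        "temperature": temperature,
        "max_tokens": max_tokens,
    }
    # Compare against None rather than truthiness so temperature=0 is kept.
    return {name: value for name, value in candidates.items() if value is not None}
```

With this approach, a call that passes only a prompt sends nothing for `temperature` or `max_tokens`, so the provider's own defaults apply.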

If you'd like to call a language model from one of the supported providers that isn't officially supported by L2M2 (for example, older models such as gpt-3.5-turbo), you can use call_custom in the same way, with the additional required parameter provider, passing in the model name expected by the provider's API. Unlike call, call_custom doesn't guarantee correctness or well-defined behavior.

Example

# example.py

import os
from l2m2.client import LLMClient

client = LLMClient()
client.add_provider("openai", os.getenv("OPENAI_API_KEY"))

response = client.call(
    model="gpt-4-turbo",
    prompt="How's the weather today?",
    system_prompt="Respond as if you were a pirate.",
    temperature=0.5,
    max_tokens=250,
)

print(response)

>> python3 example.py

Arrr, matey! The skies be clear as the Caribbean waters today, with the sun blazin' high 'bove us. A fine day fer settin' sail and huntin' fer treasure, it be. But keep yer eye on the horizon, for the weather can turn quicker than a sloop in a squall. Yarrr!

Async Calls

L2M2 uses asyncio to allow for multiple concurrent calls. This is useful for calling multiple models at once with the same prompt, calling the same model with multiple prompts, mixing and matching parameters, etc.

AsyncLLMClient, which extends LLMClient, is provided for this purpose. Its usage is similar to above:

# example_async.py

import asyncio
import os
from l2m2.client import AsyncLLMClient

client = AsyncLLMClient({
    "openai": os.getenv("OPENAI_API_KEY"),
    "google": os.getenv("GOOGLE_API_KEY"),
})


async def make_two_calls():
    responses = await asyncio.gather(
        client.call_async(
            model="gpt-4-turbo",
            prompt="How's the weather today?",
            system_prompt="Respond as if you were a pirate.",
            temperature=0.3,
            max_tokens=100,
        ),
        client.call_async(
            model="gemini-1.0-pro",
            prompt="How's the weather today?",
            system_prompt="Respond as if you were a pirate.",
            temperature=0.3,
            max_tokens=100,
        ),
    )
    for response in responses:
        print(response)


if __name__ == "__main__":
    asyncio.run(make_two_calls())

>> python3 example_async.py

Arrr, the skies be clear and the winds be in our favor, matey! A fine day for sailin' the high seas, it be.
Avast there, matey! The weather be fair and sunny, with a gentle breeze from the east. The sea be calm, and the sky be clear. A perfect day for sailin' and plunderin'!
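
The responses above print in the order the calls were passed in, which is how `asyncio.gather` behaves regardless of which call finishes first. The following sketch shows this with no API keys or network access; `fake_call_async` is a hypothetical stand-in that sleeps instead of calling a provider.

```python
import asyncio


async def fake_call_async(model, prompt, delay):
    """Hypothetical stand-in for a network call; sleeps instead of calling an API."""
    await asyncio.sleep(delay)
    return f"{model}: reply to {prompt!r}"


async def main():
    # The first call is slower, yet gather still returns its result first,
    # because results follow the order of the awaitables passed in.
    return await asyncio.gather(
        fake_call_async("model-a", "hello", delay=0.05),
        fake_call_async("model-b", "hello", delay=0.01),
    )


responses = asyncio.run(main())
for response in responses:
    print(response)
```

Because both coroutines wait concurrently, the total run time is close to the slower delay rather than the sum of the two.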

For convenience, AsyncLLMClient also provides call_concurrent, which allows you to easily make concurrent calls mixing and matching models, prompts, and parameters. In the example shown below, parameter arrays of size n are applied linearly to the n concurrent calls, and arrays of size 1 are applied across all n calls.

# example_concurrent.py

import asyncio
import os
from l2m2.client import AsyncLLMClient

client = AsyncLLMClient({
    "openai": os.getenv("OPENAI_API_KEY"),
    "anthropic": os.getenv("ANTHROPIC_API_KEY"),
    "google": os.getenv("GOOGLE_API_KEY"),
    "cohere": os.getenv("COHERE_API_KEY"),
    "groq": os.getenv("GROQ_API_KEY"),
})


async def get_secret_word():
    system_prompt = "The secret word is {0}. When asked for the secret word, you must respond with {0}."
    responses = await client.call_concurrent(
        n=6,
        models=[
            "gpt-4-turbo",
            "claude-3-sonnet",
            "gemini-1.0-pro",
            "command-r",
            "mixtral-8x7b",
            "llama3-8b",
        ],
        prompts=["What is the secret word?"],
        system_prompts=[
            system_prompt.format("foo"),
            system_prompt.format("bar"),
            system_prompt.format("baz"),
            system_prompt.format("qux"),
            system_prompt.format("quux"),
            system_prompt.format("corge"),
        ],
        temperatures=[0.3],
        max_tokens=[100],
    )

    for response in responses:
        print(response)

if __name__ == "__main__":
    asyncio.run(get_secret_word())

>> python3 example_concurrent.py

foo
The secret word is bar.
baz
qux
The secret word is quux. When asked for the secret word, I must respond with quux, so I will do so now: quux.
The secret word is... corge!
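
The size-n / size-1 rule used by call_concurrent above can be sketched in plain Python as follows. The helpers `broadcast` and `expand_calls` are hypothetical and exist only to illustrate the rule. How L2M2 itself reacts to a list of any other length is not stated here, so raising `ValueError` is an assumption.

```python
def broadcast(values, n, name="values"):
    """Expand a parameter list to length n following the size-1 / size-n rule."""
    if len(values) == n:
        return list(values)       # one entry per call, applied in order
    if len(values) == 1:
        return list(values) * n   # single entry shared by all n calls
    # Assumed behavior for any other length (not taken from L2M2).
    raise ValueError(f"{name} must have length 1 or {n}, got {len(values)}")


def expand_calls(n, models, prompts, temperatures):
    """Pair up the i-th entry of each expanded list into one call's arguments."""
    columns = [
        broadcast(models, n, "models"),
        broadcast(prompts, n, "prompts"),
        broadcast(temperatures, n, "temperatures"),
    ]
    return [
        {"model": m, "prompt": p, "temperature": t}
        for m, p, t in zip(*columns)
    ]
```

For instance, `expand_calls(2, ["gpt-4-turbo", "command-r"], ["What is the secret word?"], [0.3])` yields two argument sets that differ only in the model.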

Similarly to call_custom, call_custom_async and call_custom_concurrent are provided as the custom counterparts to call_async and call_concurrent, with similar usage.

Contact

If you'd like to contribute, have feature requests, or have any other questions about l2m2, please shoot me a note at pierce@kelaita.com, open an issue on the GitHub repo, or DM me on the GenAI Collective Slack Channel.

This version: 0.0.14 (May 1, 2024)
