
SDK for using LLMs

Project description

LLM-Client-SDK

License: MIT

LLM-Client-SDK is an SDK for communicating with generative AI large language models. We currently support OpenAI, AI21, Hugging Face Hub, Aleph Alpha, and local models via transformers, with many more coming soon.

Our vision is to provide an async-native, production-ready SDK that offers powerful, fast integration with different LLMs without costing the user any flexibility (API params, endpoints, etc.). We also provide a sync version; see the Usage section below for more details.

Base Interface

The package exposes two simple interfaces for communicating with LLMs. (In the future we will extend them to support more tasks, such as embeddings, listing models, and edits, and we will add standardized LLM params like max_tokens, temperature, etc.):

from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Any
from aiohttp import ClientSession


class BaseLLMClient(ABC):
    @abstractmethod
    async def text_completion(self, prompt: str, **kwargs) -> list[str]:
        raise NotImplementedError()

    async def get_tokens_count(self, text: str, **kwargs) -> int:
        raise NotImplementedError()


@dataclass
class LLMAPIClientConfig:
    api_key: str
    session: ClientSession
    base_url: str | None = None
    default_model: str | None = None
    headers: dict[str, Any] = field(default_factory=dict)


class BaseLLMAPIClient(BaseLLMClient, ABC):
    def __init__(self, config: LLMAPIClientConfig):
        ...

    @abstractmethod
    async def text_completion(self, prompt: str, model: str | None = None, **kwargs) -> list[str]:
        raise NotImplementedError()

    async def embedding(self, text: str, model: str | None = None, **kwargs) -> list[float]:
        raise NotImplementedError()
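
For example, a client could implement this interface like so (EchoClient is a hypothetical toy client for illustration only; it is not part of the package, and this assumes BaseLLMClient is importable from llm_client):

from llm_client import BaseLLMClient


class EchoClient(BaseLLMClient):
    async def text_completion(self, prompt: str, **kwargs) -> list[str]:
        # Return the prompt itself as the single "completion"
        return [prompt]

    async def get_tokens_count(self, text: str, **kwargs) -> int:
        # Naive token count: whitespace-separated words
        return len(text.split())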

Requirements

Python 3.10+

Installation

If you are worried about the size of the package, you can install only the clients you need; by default, none of the clients are installed.

For all current client support

$ pip install llm-client[all]

For only the base interface and some lightweight LLM clients (AI21 and Aleph Alpha)

$ pip install llm-client

Optional Dependencies

For all current API client support

$ pip install llm-client[api]

For only local client support

$ pip install llm-client[local]

For sync support

$ pip install llm-client[sync]

For only OpenAI support

$ pip install llm-client[openai]

For only HuggingFace support

$ pip install llm-client[huggingface]

Usage

Using OpenAI directly through OpenAIClient - maximum control and best practice for production

import os
from aiohttp import ClientSession
from llm_client import ChatMessage, Role, OpenAIClient, LLMAPIClientConfig

OPENAI_API_KEY = os.environ["API_KEY"]
OPENAI_ORG_ID = os.getenv("ORG_ID")


async def main():
    async with ClientSession() as session:
        llm_client = OpenAIClient(LLMAPIClientConfig(OPENAI_API_KEY, session, default_model="text-davinci-003",
                                                     headers={"OpenAI-Organization": OPENAI_ORG_ID}))  # The headers are optional
        text = "This is indeed a test"

        print("number of tokens:", await llm_client.get_tokens_count(text))  # 5
        print("generated chat:", await llm_client.chat_completion(  
            messages=[ChatMessage(role=Role.USER, content="Hello!")], model="gpt-3.5-turbo"))  # ['Hi there! How can I assist you today?']
        print("generated text:", await llm_client.text_completion(text))  # [' string\n\nYes, this is a test string. Test strings are used to']
        print("generated embedding:", await llm_client.embedding(text))  # [0.0023064255, -0.009327292, ...]

Using LLMAPIClientFactory - perfect if you want to move fast without handling the client session yourself

import os
from llm_client import LLMAPIClientFactory, LLMAPIClientType

OPENAI_API_KEY = os.environ["API_KEY"]


async def main():
    async with LLMAPIClientFactory() as llm_api_client_factory:
        llm_client = llm_api_client_factory.get_llm_api_client(LLMAPIClientType.OPEN_AI,
                                                               api_key=OPENAI_API_KEY)

        await llm_client.text_completion(prompt="This is indeed a test")
        await llm_client.text_completion(prompt="This is indeed a test", max_length=50)

        
# Or if you don't want to use async
from llm_client import SyncLLMAPIClientFactory

with SyncLLMAPIClientFactory() as llm_api_client_factory:
    llm_client = llm_api_client_factory.get_llm_api_client(LLMAPIClientType.OPEN_AI,
                                                           api_key=OPENAI_API_KEY)

    llm_client.text_completion(prompt="This is indeed a test")
    llm_client.text_completion(prompt="This is indeed a test", max_length=50)
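
The factory is not tied to OpenAI; the same pattern should work for the other supported API clients. For example, with AI21 (assuming an AI21_API_KEY environment variable, and that LLMAPIClientType.AI21 is the matching enum member):

import os
from llm_client import LLMAPIClientFactory, LLMAPIClientType

AI21_API_KEY = os.environ["AI21_API_KEY"]


async def main():
    async with LLMAPIClientFactory() as llm_api_client_factory:
        llm_client = llm_api_client_factory.get_llm_api_client(LLMAPIClientType.AI21,
                                                               api_key=AI21_API_KEY)
        await llm_client.text_completion(prompt="This is indeed a test")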

Local model

import os
from transformers import AutoModelForCausalLM, AutoModelForSeq2SeqLM, AutoTokenizer
from llm_client import LocalClientConfig, LocalClient

async def main():
    try:
        model = AutoModelForCausalLM.from_pretrained(os.environ["MODEL_NAME_OR_PATH"])
    except ValueError:
        model = AutoModelForSeq2SeqLM.from_pretrained(os.environ["MODEL_NAME_OR_PATH"])
    tokenizer = AutoTokenizer.from_pretrained(os.environ["MODEL_NAME_OR_PATH"])
    llm_client = LocalClient(LocalClientConfig(model, tokenizer, os.environ["TENSORS_TYPE"], os.environ["DEVICE"]))

    await llm_client.text_completion(prompt="This is indeed a test")
    await llm_client.text_completion(prompt="This is indeed a test", max_length=50)


# Or if you don't want to use async
import async_to_sync

try:
    model = AutoModelForCausalLM.from_pretrained(os.environ["MODEL_NAME_OR_PATH"])
except ValueError:
    model = AutoModelForSeq2SeqLM.from_pretrained(os.environ["MODEL_NAME_OR_PATH"])
tokenizer = AutoTokenizer.from_pretrained(os.environ["MODEL_NAME_OR_PATH"])
llm_client = LocalClient(LocalClientConfig(model, tokenizer, os.environ["TENSORS_TYPE"], os.environ["DEVICE"]))

llm_client = async_to_sync.methods(llm_client)

llm_client.text_completion(prompt="This is indeed a test")
llm_client.text_completion(prompt="This is indeed a test", max_length=50)
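
The environment variables above are placeholders; for a small PyTorch model on CPU, a setup might look like this (example values, not requirements):

$ export MODEL_NAME_OR_PATH=gpt2
$ export TENSORS_TYPE=pt  # PyTorch tensors, as in transformers' return_tensors="pt"
$ export DEVICE=cpu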

Contributing

Contributions are welcome! Please check out the todos below, and feel free to open an issue or a pull request.

Todo

The list is unordered

  • Add support for more LLMs
  • Add support for more LLM tasks
    • embeddings
    • list models
    • edits
    • more
  • Add contributing guidelines
  • Create an easy way to run multiple LLMs in parallel with the same prompts
  • Standardize common model parameters (e.g. temperature, max_tokens, etc.) across clients

Development

To install the package in development mode, run the following command:

$ pip install -e ".[all,test]"

To run the tests, run the following command:

$ pytest tests

If you want to add a new LLMClient, you need to implement BaseLLMClient or BaseLLMAPIClient.

If you are adding a BaseLLMAPIClient, you also need to register it in LLMAPIClientFactory.

You can add dependencies for your LLMClient in pyproject.toml; also make sure you add a matrix.flavor entry in test.yml.
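
As a rough sketch, a new API client could start from a skeleton like this (the provider name, endpoint, and response shape below are hypothetical placeholders, and the exact base-class internals may differ):

from llm_client import BaseLLMAPIClient, LLMAPIClientConfig


class MyProviderClient(BaseLLMAPIClient):
    # Hypothetical client for a fictional "MyProvider" API - illustration only
    def __init__(self, config: LLMAPIClientConfig):
        super().__init__(config)
        self._config = config  # keep a reference; the real base class may already store it

    async def text_completion(self, prompt: str, model: str | None = None, **kwargs) -> list[str]:
        async with self._config.session.post(
                f"{self._config.base_url}/complete",  # placeholder endpoint
                json={"prompt": prompt, "model": model or self._config.default_model, **kwargs},
                headers={"Authorization": f"Bearer {self._config.api_key}", **self._config.headers},
        ) as response:
            response.raise_for_status()
            data = await response.json()
            return data["completions"]  # placeholder response shape

To make such a client available through the factory, it would also need to be registered in LLMAPIClientFactory, as described above.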

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

llm_client-0.1.0.tar.gz (490.2 kB)

Built Distribution

llm_client-0.1.0-py3-none-any.whl (12.5 kB)

File details

Details for the file llm_client-0.1.0.tar.gz.

File metadata

  • Download URL: llm_client-0.1.0.tar.gz
  • Size: 490.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.11.3

File hashes

Hashes for llm_client-0.1.0.tar.gz:

  • SHA256: df622b7e3d5e83f9a5d376eaf96e127ad15a8fa08d5c365a331db4e506f8d7db
  • MD5: 9cc0af796bceea7f60b443684f3fa3a6
  • BLAKE2b-256: 7c93bc146203891dfc58601950a2888ca6f9608742b045259afbdd88b661b523

File details

Details for the file llm_client-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: llm_client-0.1.0-py3-none-any.whl
  • Size: 12.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.11.3

File hashes

Hashes for llm_client-0.1.0-py3-none-any.whl:

  • SHA256: 63f385f2992ae6e711bbd2f46fbfa2a9b8bca01274194d77d45e2770dbcd1eb5
  • MD5: 6efcc146f2d1749c1a1375df1f15fa95
  • BLAKE2b-256: c642351c5f085f3d2060e63273a207165770570033718d825f42fb0e14641904
