
PyLLMs


PyLLMs is a minimal Python library to connect to various Language Models (LLMs) with a built-in model performance benchmark.


Features

  • Connect to top LLMs in a few lines of code
  • Response meta includes tokens processed, cost, and latency standardized across models
  • Multi-model support: Get completions from different models simultaneously
  • LLM benchmark: Evaluate models on quality, speed, and cost
  • Async and streaming support for compatible models

Installation

Install the package using pip:

pip install pyllms

Quick Start

import llms

model = llms.init('gpt-4o')
result = model.complete("What is 5+5?")

print(result.text)

Usage

Basic Usage

import llms

model = llms.init('gpt-4o')
result = model.complete(
    "What is the capital of the country where Mozart was born?",
    temperature=0.1,
    max_tokens=200
)

print(result.text)
print(result.meta)
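
Per the feature list, result.meta standardizes token counts, cost, and latency across providers. A minimal sketch of inspecting it, assuming result.meta behaves like a dict (the exact key names vary by provider and are not guaranteed here):

# Walk the standardized response metadata (tokens, cost, latency);
# treating result.meta as a dict is an assumption based on the examples above
for key, value in result.meta.items():
    print(f"{key}: {value}")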

Multi-model Usage

models = llms.init(model=['gpt-3.5-turbo', 'claude-instant-v1'])
result = models.complete('What is the capital of the country where Mozart was born?')

print(result.text)
print(result.meta)

Async Support

result = await model.acomplete("What is the capital of the country where Mozart was born?")
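
acomplete must be awaited from inside a coroutine. A minimal sketch of driving it with asyncio (model initialization mirrors the earlier examples):

import asyncio

import llms

model = llms.init('gpt-4o')

async def main():
    # Await the async completion instead of blocking the event loop
    result = await model.acomplete("What is the capital of the country where Mozart was born?")
    print(result.text)

asyncio.run(main())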

Streaming Support

model = llms.init('claude-v1')
result = model.complete_stream("Write an essay on the Civil War")
for chunk in result.stream:
    if chunk is not None:
        print(chunk, end='')

Chat History and System Message

# user_input, prompt, and system below are plain strings supplied by your application
history = []
history.append({"role": "user", "content": user_input})
history.append({"role": "assistant", "content": result.text})

result = model.complete(prompt=prompt, history=history)

# For OpenAI chat models, a system message can also be passed
result = model.complete(prompt=prompt, system_message=system, history=history)

Other Methods

count = model.count_tokens('The quick brown fox jumped over the lazy dog')
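
count_tokens can be used, for example, to guard against prompts that exceed a model's context window. A small sketch (the 4096-token budget is an illustrative assumption, not a library constant):

prompt = "The quick brown fox jumped over the lazy dog"
# 4096 is an assumed budget for illustration; check your model's actual limit
if model.count_tokens(prompt) > 4096:
    raise ValueError("Prompt exceeds the assumed token budget")
result = model.complete(prompt)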

Configuration

PyLLMs will attempt to read API keys and the default model from environment variables. You can set them like this:

export OPENAI_API_KEY="your_api_key_here"
export ANTHROPIC_API_KEY="your_api_key_here"
export AI21_API_KEY="your_api_key_here"
export COHERE_API_KEY="your_api_key_here"
export ALEPHALPHA_API_KEY="your_api_key_here"
export HUGGINFACEHUB_API_KEY="your_api_key_here"
export GOOGLE_API_KEY="your_api_key_here"
export MISTRAL_API_KEY="your_api_key_here"
export REKA_API_KEY="your_api_key_here"
export TOGETHER_API_KEY="your_api_key_here"
export GROQ_API_KEY="your_api_key_here"
export DEEPSEEK_API_KEY="your_api_key_here"

export LLMS_DEFAULT_MODEL="gpt-3.5-turbo"
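
The same configuration can be applied from Python before initialization; a minimal sketch (assigning os.environ is equivalent to the shell exports above):

import os

# Set credentials and the default model programmatically before init
os.environ["OPENAI_API_KEY"] = "your_api_key_here"
os.environ["LLMS_DEFAULT_MODEL"] = "gpt-3.5-turbo"

import llms

model = llms.init()  # picks up the key and default model from the environment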

Alternatively, you can pass initialization values to the init() method:

model = llms.init(openai_api_key='your_api_key_here', model='gpt-4')

Model Benchmarks

PyLLMs includes an automated benchmark system. Model quality is evaluated by a powerful evaluator model (e.g., GPT-4) on a range of predefined questions, or you can supply your own.

models = llms.init(model=['claude-3-haiku-20240307', 'gpt-4o-mini', 'claude-3-5-sonnet-20240620', 'gpt-4o', 'mistral-large-latest', 'open-mistral-nemo', 'gpt-4', 'gpt-3.5-turbo', 'deepseek-coder', 'deepseek-chat', 'llama-3.1-8b-instant', 'llama-3.1-70b-versatile'])

gpt4 = llms.init('gpt-4o')

models.benchmark(evaluator=gpt4)

Check the Kagi LLM Benchmarking Project for the latest benchmarks!

To evaluate models on your own prompts:

models.benchmark(prompts=[("What is the capital of Finland?", "Helsinki")], evaluator=gpt4)

Supported Models

To get a full list of supported models:

model = llms.init()
model.list() # list all models

model.list("gpt")  # lists only models with 'gpt' in name/provider name

Currently supported models (may be outdated):

  • OpenAIProvider: gpt-3.5-turbo, gpt-3.5-turbo-1106, gpt-3.5-turbo-instruct, gpt-4, gpt-4-1106-preview, gpt-4-turbo-preview, gpt-4-turbo, gpt-4o, gpt-4o-mini, gpt-4o-2024-08-06, gpt-4.1, gpt-4.1-mini, gpt-4.1-nano, gpt-4.5-preview, chatgpt-4o-latest, o1-preview, o1-mini, o1, o1-pro, o3-mini, o3, o3-pro, o4-mini
  • AnthropicProvider: claude-2.1, claude-3-5-sonnet-20240620, claude-3-5-sonnet-20241022, claude-3-5-haiku-20241022, claude-3-7-sonnet-20250219, claude-sonnet-4-20250514, claude-opus-4-20250514
  • BedrockAnthropicProvider: anthropic.claude-instant-v1, anthropic.claude-v1, anthropic.claude-v2, anthropic.claude-3-haiku-20240307-v1:0, anthropic.claude-3-sonnet-20240229-v1:0, anthropic.claude-3-5-sonnet-20240620-v1:0
  • AI21Provider: j2-grande-instruct, j2-jumbo-instruct
  • CohereProvider: command, command-nightly
  • AlephAlphaProvider: luminous-base, luminous-extended, luminous-supreme, luminous-supreme-control
  • HuggingfaceHubProvider: hf_pythia, hf_falcon40b, hf_falcon7b, hf_mptinstruct, hf_mptchat, hf_llava, hf_dolly, hf_vicuna
  • GoogleGenAIProvider: gemini-2.5-pro, gemini-2.5-flash, gemini-2.5-flash-lite-preview-06-17, gemini-2.0-flash, gemini-2.0-flash-lite, gemini-1.5-pro, gemini-1.5-flash, gemini-1.5-flash-8b
  • GoogleVertexAIProvider: gemini-2.5-pro, gemini-2.5-flash, gemini-2.5-flash-lite-preview-06-17, gemini-2.0-flash, gemini-2.0-flash-lite, gemini-1.5-pro, gemini-1.5-flash, gemini-1.5-flash-8b
  • OllamaProvider: vanilj/Phi-4:latest, falcon3:10b, smollm2:latest, llama3.2:3b-instruct-q8_0, qwen2:1.5b, mistral:7b-instruct-v0.2-q4_K_S, phi3:latest, phi3:3.8b, phi:latest, tinyllama:latest, magicoder:latest, deepseek-coder:6.7b, deepseek-coder:latest, dolphin-phi:latest, stablelm-zephyr:latest
  • DeepSeekProvider: deepseek-chat, deepseek-coder
  • GroqProvider: llama-3.1-405b-reasoning, llama-3.1-70b-versatile, llama-3.1-8b-instant, gemma2-9b-it
  • RekaProvider: reka-edge, reka-flash, reka-core
  • TogetherProvider: meta-llama/Meta-Llama-3.1-405B-Instruct-Turbo
  • OpenRouterProvider: nvidia/llama-3.1-nemotron-70b-instruct, x-ai/grok-2, nousresearch/hermes-3-llama-3.1-405b:free, google/gemini-flash-1.5-exp, liquid/lfm-40b, mistralai/ministral-8b, qwen/qwen-2.5-72b-instruct
  • MistralProvider: mistral-tiny, open-mistral-7b, mistral-small, open-mixtral-8x7b, mistral-small-latest, mistral-medium-latest, mistral-large-latest, open-mistral-nemo

Advanced Usage

Using OpenAI API on Azure

import llms
AZURE_API_BASE = "{insert here}"
AZURE_API_KEY = "{insert here}"

model = llms.init('gpt-4')

azure_args = {
    "engine": "gpt-4",  # Azure deployment_id
    "api_base": AZURE_API_BASE,
    "api_type": "azure",
    "api_version": "2023-05-15",
    "api_key": AZURE_API_KEY,
}

azure_result = model.complete("What is 5+5?", **azure_args)

Using Google AI Models

PyLLMs supports Google's AI models through two providers:

Option 1: Gemini API (GoogleGenAI)

Uses direct Gemini API with API key authentication:

# Set your API key
export GOOGLE_API_KEY="your_api_key_here"

# Use any Gemini model
model = llms.init('gemini-2.5-flash')
result = model.complete("Hello!")

Option 2: Vertex AI (GoogleVertexAI)

Uses Google Cloud Vertex AI with Application Default Credentials:

  1. Set up a GCP account and create a project
  2. Enable Vertex AI APIs in your GCP project
  3. Install gcloud CLI tool
  4. Set up Application Default Credentials:
    gcloud auth application-default login
    gcloud config set project YOUR_PROJECT_ID
    

Then use models through Vertex AI:

# Option A: Direct provider usage for Vertex AI
from llms.providers.google_genai import GoogleVertexAIProvider
provider = GoogleVertexAIProvider()
result = provider.complete("Hello!")

# Option B: Unified provider with Vertex AI flag
from llms.providers.google_genai import GoogleGenAIProvider
provider = GoogleGenAIProvider(use_vertexai=True)
result = provider.complete("Hello!")

Note: Both providers support the same model names. If both GOOGLE_API_KEY and gcloud credentials are configured, llms.init('gemini-2.5-flash') will use both providers simultaneously.

Using Local Ollama LLM models

  1. Ensure Ollama is running and you've pulled the desired model (a pull command is shown below)
  2. Get the name of the LLM you want to use
  3. Initialize PyLLMs:
model = llms.init("tinyllama:latest")
result = model.complete("Hello!")
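
For step 1, a model can be pulled with the standard Ollama CLI before initializing PyLLMs:

# Download the model locally so Ollama can serve it
ollama pull tinyllama:latest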

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

This project is licensed under the MIT License. See the LICENSE file for details.
