Model Calling

A unified API for calling multiple LLM providers through a consistent, OpenAI-compatible interface.

Key Features

  • 🔄 OpenAI-compatible API: Uses the familiar chat completions format
  • ☎️ Multiple Backends: Support for Ollama, vLLM, OpenAI, Anthropic, Cohere, and more
  • 🛠️ Function Calling: Unified support for tools/function calling across models
  • 📊 Streaming Support: Efficient streaming for all supported models
  • 🔧 Runtime Configuration: Adjust model settings without restarting
  • 📦 Flexible Deployment: Use it as an imported Python library or run it as a standalone service

Installation

pip install model-calling

Quick Example

from model_calling.client import SyncModelCallingClient

client = SyncModelCallingClient()

try:
    response = client.chat_completion(
        model="ollama/mistral-small3.1:24b",  # Use any model from any provider
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "What is machine learning?"}
        ]
    )
    
    print(response["choices"][0]["message"]["content"])
finally:
    client.close()
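The feature list above mentions streaming. The package claims OpenAI compatibility, so streamed responses would arrive as chunks carrying a `delta` fragment; the sketch below consumes mocked chunks in that format (the chunk shapes, and any `stream=True` parameter on the client, are assumptions based on the OpenAI format, not confirmed by this README):

```python
# Mocked OpenAI-style streaming chunks; with a real client these would be
# yielded incrementally by a streaming chat-completion call.
chunks = [
    {"choices": [{"delta": {"content": "Machine "}}]},
    {"choices": [{"delta": {"content": "learning..."}}]},
    {"choices": [{"delta": {}}]},  # final chunk typically carries no content
]

# Concatenate the content fragments as they arrive.
text = "".join(c["choices"][0]["delta"].get("content", "") for c in chunks)
print(text)  # Machine learning...
```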

Supported Providers

| Provider        | Prefix       | Example Models                   |
|-----------------|--------------|----------------------------------|
| Ollama (local)  | `ollama/`    | mistral-small3.1, llama3, qwen   |
| vLLM (cluster)  | `vllm/`      | Any model deployed with vLLM     |
| OpenAI          | `openai/`    | gpt-4, gpt-3.5-turbo             |
| Anthropic       | `anthropic/` | claude-3-opus, claude-3-sonnet   |
| Cohere          | `cohere/`    | command, command-r               |
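The `provider/model` naming convention in the table can be split with a one-liner. The helper below is illustrative, not part of the package's API:

```python
def split_model_id(model_id: str) -> tuple[str, str]:
    """Split a 'provider/model' identifier into its provider prefix and model name."""
    provider, _, name = model_id.partition("/")
    return provider, name

print(split_model_id("ollama/mistral-small3.1:24b"))  # ('ollama', 'mistral-small3.1:24b')
```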

Function Calling

Model Calling provides a consistent interface for function calling (tools) across all supported providers:

import json
from model_calling.client import SyncModelCallingClient

# Placeholder implementation; replace with a real weather lookup
def get_weather(location):
    return {"location": location, "temperature": "18C", "condition": "sunny"}

client = SyncModelCallingClient()

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather in a location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city and state, e.g. San Francisco, CA"
                    }
                },
                "required": ["location"]
            }
        }
    }
]

try:
    # Initial request with tools
    response = client.chat_completion(
        model="ollama/mistral-small3.1:24b",
        messages=[
            {"role": "user", "content": "What's the weather like in Paris?"}
        ],
        tools=tools
    )
    
    # Check if function call was requested
    message = response["choices"][0]["message"]
    if "function_call" in message:
        function_name = message["function_call"]["name"]
        arguments = json.loads(message["function_call"]["arguments"])
        
        # Call your function with the arguments
        weather_data = get_weather(arguments["location"])
        
        # Continue the conversation with the function result,
        # echoing back the assistant's function_call turn
        final_response = client.chat_completion(
            model="ollama/mistral-small3.1:24b",
            messages=[
                {"role": "user", "content": "What's the weather like in Paris?"},
                message,
                {
                    "role": "function",
                    "name": function_name,
                    "content": json.dumps(weather_data)
                }
            ]
        )
        
        print(final_response["choices"][0]["message"]["content"])
    else:
        print(message["content"])
finally:
    client.close()
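When more than one tool is registered, a small dispatch table keeps the handling code flat. The pattern below is built on the standard library only and is an illustrative sketch, not part of model-calling itself; `get_weather` is a placeholder:

```python
import json

def get_weather(location):
    # Placeholder; replace with a real lookup
    return {"location": location, "condition": "sunny"}

# Map tool names to their Python handlers.
TOOL_REGISTRY = {
    "get_weather": get_weather,
}

def dispatch(function_call):
    """Execute the tool named in an OpenAI-style function_call payload."""
    handler = TOOL_REGISTRY[function_call["name"]]
    arguments = json.loads(function_call["arguments"])
    return handler(**arguments)

result = dispatch({
    "name": "get_weather",
    "arguments": json.dumps({"location": "Paris, France"}),
})
print(result)  # {'location': 'Paris, France', 'condition': 'sunny'}
```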

Using as a Service

Model Calling can be run as a service to provide a unified API for all your applications:

# Start the service
python -m model_calling

Then make API calls to the service:

curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "ollama/mistral-small3.1:24b",
    "messages": [
      {"role": "user", "content": "What is machine learning?"}
    ]
  }'
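The same request can be issued from Python with the standard library alone. This sketch builds the payload shown in the curl example; the endpoint URL is taken from the docs above, and the network call is left commented out so the snippet stands on its own:

```python
import json
import urllib.request

# Same request body as the curl example above.
payload = {
    "model": "ollama/mistral-small3.1:24b",
    "messages": [
        {"role": "user", "content": "What is machine learning?"}
    ],
}
body = json.dumps(payload).encode("utf-8")

req = urllib.request.Request(
    "http://localhost:8000/v1/chat/completions",
    data=body,
    headers={"Content-Type": "application/json"},
)

# Uncomment once the service is running:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```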

Using Hosted Providers

To use hosted providers like OpenAI and Anthropic, set your API keys in environment variables or a .env file:

# Create a .env file with your API keys
cp .env.example .env
# Edit .env with your API keys
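A quick sanity check before calling hosted models can save a confusing failure later. The environment-variable names below follow each provider's common convention and are assumptions; check the project's .env.example for the exact names it expects:

```python
import os

# Conventional provider key names (assumed, not taken from this package).
PROVIDER_KEYS = {
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "cohere": "COHERE_API_KEY",
}

def missing_keys(providers):
    """Return the providers whose API key is not set in the environment."""
    return [p for p in providers if not os.environ.get(PROVIDER_KEYS[p])]

print(missing_keys(["openai", "anthropic"]))
```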

Then you can use the hosted models:

from model_calling.client import SyncModelCallingClient

client = SyncModelCallingClient()

try:
    # OpenAI
    response = client.chat_completion(
        model="openai/gpt-4",
        messages=[
            {"role": "user", "content": "What is quantum computing?"}
        ]
    )
    
    # Anthropic
    response = client.chat_completion(
        model="anthropic/claude-3-sonnet-20240229",
        messages=[
            {"role": "user", "content": "What is quantum computing?"}
        ]
    )
finally:
    client.close()

Documentation

For complete documentation, visit the Model Calling Documentation.

Examples

Check out the examples directory for more examples of how to use Model Calling, including:

  • Basic chat completions
  • Function calling
  • Streaming responses
  • Building agents
  • Working with different providers

License

MIT License
