
Project description

api4all

Easy-to-use LLM API from state-of-the-art providers and comparison.

Features

  • Easy-to-use: A simple, uniform API for state-of-the-art language models from different providers.
  • Comparison: Compare the cost and performance of different providers and models, so you can choose the best one for your use case.
  • Logging: Log the response and cost of each request to a log file.
  • Providers: Support for many providers, both open-source and closed-source.
  • Results: See the actual time taken by each request, which is especially useful when you don't trust published benchmarks.

Installation

1. Install the package

pip3 install api4all

2. Optional - Create and activate a virtual environment

  • Unix / macOS
python3 -m venv venv
source venv/bin/activate
  • Windows
python3 -m venv venv
.\venv\Scripts\activate

Quick Start

1. Put the API keys for the providers you want to test in a .env file.

TOGETHER_API_KEY=xxx
OPENAI_API_KEY=xxx
MISTRAL_API_KEY=xxx
ANTHROPIC_API_KEY=xxx

or set the environment variables directly:

export TOGETHER_API_KEY=xxx
export OPENAI_API_KEY=xxx

2. Run the code

from api4all import EngineFactory

messages = [
    {"role": "system",
    "content": "You are a helpful assistant for my Calculus class."},
    {"role": "user",
    "content": "What is the current status of the economy?"}
]


engine = EngineFactory.create_engine(provider="together", 
                                    model="google/gemma-7b-it", 
                                    messages=messages, 
                                    temperature=0.9, 
                                    max_tokens=1028, 
                                    )

response = engine.generate_response()

print(response)
  • There are examples in the examples folder; you can also open them in Google Colab.
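Because every engine exposes the same generate_response() interface, comparing providers is a simple loop. A minimal sketch (`compare_providers` is a hypothetical helper, not part of api4all; it accepts any factory with the signature of EngineFactory.create_engine):

```python
import time

def compare_providers(create_engine, configs):
    """Time generate_response() across several provider/model configs.

    create_engine: any factory with EngineFactory.create_engine's signature.
    configs: list of keyword-argument dicts, each including "provider".
    Returns one result dict per config, sorted fastest-first.
    """
    results = []
    for cfg in configs:
        engine = create_engine(**cfg)
        start = time.perf_counter()
        response = engine.generate_response()
        elapsed = time.perf_counter() - start
        results.append({"provider": cfg["provider"],
                        "seconds": elapsed,
                        "response": response})
    return sorted(results, key=lambda r: r["seconds"])
```

For example: `compare_providers(EngineFactory.create_engine, [{"provider": "together", "model": "google/gemma-7b-it", "messages": messages}, {"provider": "groq", "model": "google/gemma-7b-it", "messages": messages}])`.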

3. Check the log file for the response and the cost of the request.

Request ID - fa8cebd0-265a-44b2-95d7-6ff1588d2c87
	create at: 2024-03-15 16:38:18,129
	INFO - SUCCESS
	
    Response:
		I am not able to provide information about the current status of the economy, as I do not have access to real-time information. Therefore, I recommend checking a reliable source for the latest economic news and data.
	
    Cost: $0.0000154    # Cost of this provider for this request
    Provider: together  # Provider used for this request
    Execution-time: Execution time not provided by the provider
    Actual-time: 0.9448428153991699 # Actual time taken by the request
    Input-token: 33     # Number of tokens used for the input
    Output-token: 44    # Number of tokens used for the output
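The Cost line can be reproduced from the token counts and the provider's per-million-token prices. For this request (Gemma 7B on Together AI, $0.2 input / $0.2 output per 1M tokens), a quick check:

```python
def request_cost(input_tokens, output_tokens, input_price, output_price):
    """Compute request cost; price arguments are USD per 1M tokens."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000

# 33 input and 44 output tokens at $0.2/$0.2 per 1M tokens:
cost = request_cost(33, 44, 0.2, 0.2)  # about $0.0000154, matching the log above
```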

Providers and Models

Providers

| Provider | Free Credit | Rate Limit | API Key name | Provider string name |
|---|---|---|---|---|
| Groq | Unlimited | 30 Requests / Minute | GROQ_API_KEY | "groq" |
| Anyscale | $10 | 30 Requests / Second | ANYSCALE_API_KEY | "anyscale" |
| Together AI | $25 | 1 Request / Second | TOGETHER_API_KEY | "together" |
| Replicate | Free to try | 50 Requests / Second | REPLICATE_API_KEY | "replicate" |
| Fireworks | $1 | 600 Requests / Minute | FIREWORKS_API_KEY | "fireworks" |
| Deepinfra | Free to try | 200 Concurrent requests | DEEPINFRA_API_KEY | "deepinfra" |
| Lepton | $10 | 10 Requests / Minute | LEPTON_API_KEY | "lepton" |
| Google AI (Vertex AI) | Unlimited | 60 Requests / Minute | GOOGLE_API_KEY | "google" |
| OpenAI | | 60 Requests / Minute | OPENAI_API_KEY | "openai" |
| Mistral AI | Free to try | 5 Requests / Second | MISTRAL_API_KEY | "mistral" |
| Anthropic | Free to try | 5 Requests / Minute | ANTHROPIC_API_KEY | "anthropic" |
  • Free to try: no credit card required, but limited to a certain number of tokens.
  • Rate limit is based on the free plan of the provider. The actual rate limit may be different based on the plan you choose.

Open-source models

| Model | API string name | Context Length | Developer |
|---|---|---|---|
| Mixtral-8x7b-Instruct-v0.1 | "mistralai/Mixtral-8x7B-Instruct-v0.1" | 32,768 | Mistral AI |
| Gemma 7B it | "google/gemma-7b-it" | 8,192 | Google |
| Mistral-7B-Instruct-v0.1 | "mistralai/Mistral-7B-Instruct-v0.1" | 4,096 | Mistral AI |
| LLaMA2-70b | "meta/Llama-2-70b-chat" | 4,096 | Meta |
| Mistral-7B-Instruct-v0.2 | "mistralai/Mistral-7B-Instruct-v0.2" | 32,768 | Mistral AI |
| CodeLlama-70b-Instruct | "meta/CodeLlama-2-70b-intruct" | 16,384 | Meta |
| LLaMA3-8b-Instruct | "meta/Llama-3-8b-Instruct" | 8,192 | Meta |
| LLaMA3-80b | "meta/Llama-3-80b" | 8,192 | Meta |

Cost (Input - Output, $ / 1M tokens) by provider; each provider hosts only a subset of the models above:

  • Groq: $0-$0 $0-$0 $0-$0 $0-$0 $0-$0
  • Anyscale: $0.5-$0.5 $0.15-$0.15 $0.05-$0.25 $1.0-$1.0 $1.0-$1.0 $0.15-$0.15 $1.0-$1.0
  • Together AI: $0.6-$0.6 $0.2-$0.2 $0.2-$0.2 $0.9-$0.9 $0.05-$0.25 $0.9-$0.9 $0.2-$0.2 $0.9-$0.9
  • Replicate: $0.3-$1 $0.05-$0.25 $0.65-$2.75 $0.2-$0.2 $0.65-$2.75 $0.05-$0.25 $0.65-$2.75
  • Fireworks: $0.5-$0.5 $0.2-$0.2 $0.9-$0.9 $0.2-$0.2 $0.9-$0.9 $0.2-$0.2 $0.9-$0.9
  • Deepinfra: $0.27-$0.27 $0.13-$0.13 $0.13-$0.13 $0.7-$0.9 $0.7-$0.9 $0.08-$0.08 $0.59-$0.79
  • Lepton: $0.5-$0.5 $0.8-$0.8 $0.07-$0.07 $0.8-$0.8

Closed-source models

1. Mistral AI

| Model | Input Pricing ($/1M Tokens) | Output Pricing ($/1M Tokens) | Context Length | API string name |
|---|---|---|---|---|
| Mistral-7B-Instruct-v0.1 | $0.25 | $0.25 | 8,192 | "mistral/open-mistral-7b" |
| Mixtral-8x7b-Instruct-v0.1 | $0.7 | $0.7 | 8,192 | "mistral/open-mixtral-8x7b" |
| Mistral Small | $2 | $6 | | "mistral/mistral-small-latest" |
| Mistral Medium | $2.7 | $8.1 | | "mistral/mistral-medium-latest" |
| Mistral Large | $8 | $24 | | "mistral/mistral-large-latest" |

2. OpenAI

| Model | Input Pricing ($/1M Tokens) | Output Pricing ($/1M Tokens) | Context Length | API string name |
|---|---|---|---|---|
| GPT-3.5-0125 | $0.5 | $1.5 | 16,385 | "openai/gpt-3.5-turbo-0125" |
| GPT-3.5 | $0.5 | $1.5 | 16,385 | "openai/gpt-3.5-turbo" |
| GPT-4 | $30 | $60 | 8,192 | "openai/gpt-4" |
| GPT-4-32k | $60 | $120 | 32,768 | "openai/gpt-4-32k" |

3. Anthropic

| Model | Input Pricing ($/1M Tokens) | Output Pricing ($/1M Tokens) | Context Length | API string name |
|---|---|---|---|---|
| Claude 3 Opus | $15 | $75 | 200,000 | "anthropic/claude-3-opus" |
| Claude 3 Sonnet | $3 | $15 | 200,000 | "anthropic/claude-3-sonnet" |
| Claude 3 Haiku | $0.25 | $1.25 | 200,000 | "anthropic/claude-3-haiku" |
| Claude 2.1 | $8 | $24 | 200,000 | "anthropic/claude-2.1" |
| Claude 2.0 | $8 | $24 | 100,000 | "anthropic/claude-2.0" |
| Claude Instant 1.2 | $0.8 | $2.4 | 100,000 | "anthropic/claude-instant-1.2" |

4. Google

| Model | Input Pricing ($/1M Tokens) | Output Pricing ($/1M Tokens) | Context Length | API string name |
|---|---|---|---|---|
| Google Gemini 1.0 Pro | $0 | $0 | 32,768 | "google/gemini-1.0-pro" |

Contributing

Contributions are welcome. If you see updated pricing, new models, new providers, or any other changes, feel free to open an issue or a pull request.

Problems from the providers and Solutions

Error with Gemini Pro 1.0

ValueError: The `response.text` quick accessor only works when the response contains a valid `Part`, but none was returned. Check the `candidate.safety_ratings` to see if the response was blocked.

Solution: the output exceeded your maximum token budget, so the response was cut off before a valid Part was returned. Increase max_tokens.
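One way to apply that fix automatically is to retry with a doubled token budget. A hypothetical sketch (`generate_with_retry` is illustrative, not part of api4all; it assumes the factory signature from the Quick Start and that the truncation surfaces as the ValueError above):

```python
def generate_with_retry(create_engine, messages, max_tokens=1024, limit=8192):
    """Retry Gemini requests with a doubled max_tokens until one succeeds."""
    while max_tokens <= limit:
        engine = create_engine(provider="google",
                               model="google/gemini-1.0-pro",
                               messages=messages,
                               max_tokens=max_tokens)
        try:
            return engine.generate_response()
        except ValueError:
            max_tokens *= 2  # output was truncated: give the model more room
    raise RuntimeError(f"response still blocked at max_tokens={limit}")
```

Note the same ValueError can also mean the response was blocked by safety filters, in which case no token budget will help; check candidate.safety_ratings as the error message suggests.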
