Easy-to-use LLM APIs from state-of-the-art providers, with cost and performance comparison
Project description
api4all
Easy-to-use LLM APIs from state-of-the-art providers, with cost and performance comparison.
Features
- Easy-to-use: A simple, unified API for state-of-the-art language models from different providers, all called the same way.
- Comparison: Compare the cost and performance of different providers and models, so you can choose the best one for your use case.
- Log: Logs the response and cost of each request to a log file.
- Providers: Support for a wide range of providers serving both open-source and closed-source models.
- Result: See the actual time taken by each request, useful when you don't trust published benchmarks.
Installation
1. Install the package
pip3 install api4all
2. Optional - Create and activate a virtual environment
- Unix / macOS
python3 -m venv venv
source venv/bin/activate
- Windows
python3 -m venv venv
.\venv\Scripts\activate
Quick Start
1. Put the API keys of the providers you want to test in a .env file.
TOGETHER_API_KEY=xxx
OPENAI_API_KEY=xxx
MISTRAL_API_KEY=xxx
ANTHROPIC_API_KEY=xxx
or set the environment variables directly.
export TOGETHER_API_KEY=xxx
export OPENAI_API_KEY=xxx
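If the keys in the .env file are not picked up automatically, a small loader such as python-dotenv (an assumption here, not a documented api4all dependency) can export them before an engine is created. A minimal sketch:

# Minimal sketch: load provider keys from a local .env file into the environment.
# python-dotenv is an assumed helper, not part of api4all itself.
import os
from dotenv import load_dotenv

load_dotenv()  # reads the .env file in the current working directory

# Sanity-check that the key for the provider you plan to use is present.
assert os.getenv("TOGETHER_API_KEY"), "TOGETHER_API_KEY is not set"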
2. Run the code
from api4all import EngineFactory
messages = [
    {"role": "system",
     "content": "You are a helpful assistant for my Calculus class."},
    {"role": "user",
     "content": "What is the current status of the economy?"}
]
engine = EngineFactory.create_engine(provider="together",
                                     model="google/gemma-7b-it",
                                     messages=messages,
                                     temperature=0.9,
                                     max_tokens=1028,
                                     )
response = engine.generate_response()
print(response)
- There are more examples in the examples folder, which you can also try in Google Colab.
3. Check the log file for the response and the cost of the request.
Request ID - fa8cebd0-265a-44b2-95d7-6ff1588d2c87
create at: 2024-03-15 16:38:18,129
INFO - SUCCESS
Response:
I am not able to provide information about the current status of the economy, as I do not have access to real-time information. Therefore, I recommend checking a reliable source for the latest economic news and data.
Cost: $0.0000154 # Cost of this provider for this request
Provider: together # Provider used for this request
Execution-time: Execution time not provided by the provider
Actual-time: 0.9448428153991699 # Actual time taken by the request
Input-token: 33 # Number of tokens used for the input
Output-token: 44 # Number of tokens used for the output
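Because every request appends a Cost line in the format shown above, total spend can be tallied straight from the log file. A minimal sketch; the log file name below is a hypothetical placeholder, so point it at wherever api4all actually writes its logs:

# Sum the per-request costs recorded in the api4all log file.
import re

total = 0.0
with open("api4all.log") as log_file:  # hypothetical path; use your actual log file
    for line in log_file:
        match = re.search(r"Cost:\s*\$([0-9.]+)", line)
        if match:
            total += float(match.group(1))

print(f"Total cost logged so far: ${total:.7f}")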
Providers and Models
Providers
Provider | Free Credit | Rate Limit | API Key name | Provider string name |
---|---|---|---|---|
Groq | Unlimited | 30 Requests / Minute | GROQ_API_KEY | "groq" |
Anyscale | $10 | 30 Requests / Second | ANYSCALE_API_KEY | "anyscale" |
Together AI | $25 | 1 Request / Second | TOGETHER_API_KEY | "together" |
Replicate | Free to try | 50 Requests / Second | REPLICATE_API_KEY | "replicate" |
Fireworks | $1 | 600 Requests / Minute | FIREWORKS_API_KEY | "fireworks" |
Deepinfra | Free to try | 200 Concurrent requests | DEEPINFRA_API_KEY | "deepinfra" |
Lepton | $10 | 10 Requests / Minute | LEPTON_API_KEY | "lepton" |
------ | ------ | ------ | ------ | ------ |
Google AI (Vertex AI) | Unlimited | 60 Requests / Minute | GOOGLE_API_KEY | "google" |
OpenAI | ✕ | 60 Requests / Minute | OPENAI_API_KEY | "openai" |
Mistral AI | Free to try | 5 Requests / Second | MISTRAL_API_KEY | "mistral" |
Anthropic | Free to try | 5 Requests / Minute | ANTHROPIC_API_KEY | "anthropic" |
- Free to try: No credit card required, but limited to a certain number of tokens.
- Rate limits are based on each provider's free plan; the actual limit may differ depending on the plan you choose.
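Switching providers only requires changing the provider string name from the table above; the rest of the call is identical to the Quick Start. A minimal sketch (the prompt is just an example):

# Same model string as the Quick Start, different provider string name.
from api4all import EngineFactory

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Explain rate limits in one sentence."}
]

engine = EngineFactory.create_engine(provider="groq",            # provider string name from the table
                                     model="google/gemma-7b-it", # same model as the Quick Start example
                                     messages=messages,
                                     temperature=0.9,
                                     max_tokens=256,
                                     )
print(engine.generate_response())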
Open-source models
-- | Mixtral-8x7b-Instruct-v0.1 | Gemma 7B it | Mistral-7B-Instruct-v0.1 | LLaMA2-70b | Mistral-7B-Instruct-v0.2 | CodeLlama-70b-Instruct |
---|---|---|---|---|---|---|
API string name | "mistralai/Mixtral-8x7B-Instruct-v0.1" | "google/gemma-7b-it" | "mistralai/Mistral-7B-Instruct-v0.1" | "meta/Llama-2-70b-chat" | "mistralai/Mistral-7B-Instruct-v0.2" | "meta/CodeLlama-2-70b-intruct" |
Context Length | 32,768 | 8,192 | 4,096 | 4,096 | 32,768 | |
Developer | Mistral AI | Google | Mistral AI | Meta | Mistral AI | Meta |
Cost (Input - Output per 1M Tokens) | ------ | ------ | ------ | ------ | ------ | ------ |
Groq | $0-$0 | $0-$0 | ✕ | $0-$0 | ✕ | ✕ |
Anyscale | $0.5-$0.5 | $0.15-$0.15 | $0.05-$0.25 | $1.0-$1.0 | ✕ | $1.0-$1.0 |
Together AI | $0.6-$0.6 | $0.2-$0.2 | $0.2-$0.2 | $0.9-$0.9 | $0.05-$0.25 | $0.9-$0.9 |
Replicate | $0.3-$1 | ✕ | $0.05-$0.25 | $0.65-$2.75 | $0.2-$0.2 | $0.65-$2.75 |
Fireworks | $0.5-$0.5 | ✕ | $0.2-$0.2 | $0.9-$0.9 | $0.2-$0.2 | $0.9-$0.9 |
Deepinfra | $0.27-$0.27 | $0.13-$0.13 | $0.13-$0.13 | $0.7-$0.9 | ✕ | $0.7-$0.9 |
Lepton | $0.5-$0.5 | ✕ | ✕ | $0.8-$0.8 | ✕ | ✕ |
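Since the same API string name works with every provider that hosts a given model, a simple client-side timing loop is enough to compare real latencies against the prices above. A minimal sketch; the provider list and prompt are only examples, and the timing is plain wall-clock time measured by the caller:

# Compare wall-clock latency of the same open-source model across providers.
import time
from api4all import EngineFactory

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Give me one fun fact about prime numbers."}
]

for provider in ("together", "fireworks", "deepinfra"):  # rows from the pricing table above
    engine = EngineFactory.create_engine(provider=provider,
                                         model="mistralai/Mixtral-8x7B-Instruct-v0.1",
                                         messages=messages,
                                         temperature=0.7,
                                         max_tokens=256,
                                         )
    start = time.perf_counter()
    engine.generate_response()
    print(f"{provider}: {time.perf_counter() - start:.2f}s")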
Closed-source models
1. Mistral AI
Model | Input Pricing ($/1M Tokens) | Output Pricing ($/1M Tokens) | Context Length | API string name |
---|---|---|---|---|
Mistral-7B-Instruct-v0.1 | $0.25 | $0.25 | 8,192 | "mistral/open-mistral-7b" |
Mixtral-8x7b-Instruct-v0.1 | $0.7 | $0.7 | 8,192 | "mistral/open-mixtral-8x7b" |
Mistral Small | $2 | $6 | ✕ | "mistral/mistral-small-latest" |
Mistral Medium | $2.7 | $8.1 | ✕ | "mistral/mistral-medium-latest" |
Mistral Large | $8 | $24 | ✕ | "mistral/mistral-large-latest" |
2. OpenAI
Model | Input Pricing ($/1M Tokens) | Output Pricing ($/1M Tokens) | Context Length | API string name |
---|---|---|---|---|
GPT-3.5-0125 | $0.5 | $1.5 | 16,385 | "openai/gpt-3.5-turbo-0125" |
GPT-3.5 | $0.5 | $1.5 | 16,385 | "openai/gpt-3.5-turbo" |
GPT-4 | $30 | $60 | 8,192 | "openai/gpt-4" |
GPT-4-32k | $60 | $120 | 32,768 | "openai/gpt-4-32k" |
3. Anthropic
Model | Input Pricing ($/1M Tokens) | Output Pricing ($/1M Tokens) | Context Length | API string name |
---|---|---|---|---|
Claude 3 Opus | $15 | $75 | 200,000 | "anthropic/claude-3-opus" |
Claude 3 Sonnet | $3 | $15 | 200,000 | "anthropic/claude-3-sonnet" |
Claude 3 Haiku | $0.25 | $1.25 | 200,000 | "anthropic/claude-3-haiku" |
Claude 2.1 | $8 | $24 | 200,000 | "anthropic/claude-2.1" |
Claude 2.0 | $8 | $24 | 100,000 | "anthropic/claude-2.0" |
Claude Instant 1.2 | $0.8 | $2.4 | 100,000 | "anthropic/claude-instant-1.2" |
4. Google
Model | Input Pricing ($/1M Tokens) | Output Pricing ($/1M Tokens) | Context Length | API string name |
---|---|---|---|---|
Google Gemini 1.0 Pro | $0 | $0 | 32,768 | "google/gemini-1.0-pro" |
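Closed-source models are called through exactly the same interface; only the API string name and the matching API key change. A minimal sketch using an entry from the Anthropic table (the prompt is just an example):

# Same EngineFactory call, pointed at a closed-source model.
from api4all import EngineFactory

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is the capital of France?"}
]

engine = EngineFactory.create_engine(provider="anthropic",             # needs ANTHROPIC_API_KEY set
                                     model="anthropic/claude-3-haiku", # API string name from the table
                                     messages=messages,
                                     temperature=0.5,
                                     max_tokens=256,
                                     )
print(engine.generate_response())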
Contributing
Contributions are welcome. If you notice updated pricing, new models, new providers, or any other changes, feel free to open an issue or a pull request.
Problems from the providers and Solutions
Error with Gemini pro 1.0
ValueError: The `response.text` quick accessor only works when the response contains a valid `Part`, but none was returned. Check the `candidate.safety_ratings` to see if the response was blocked.
Solution: The output is longer than your maximum token limit. Increase max_tokens.
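In practice this means passing a larger max_tokens when creating the engine, for example:

# Give Gemini enough room to finish its answer by raising max_tokens.
from api4all import EngineFactory

engine = EngineFactory.create_engine(provider="google",
                                     model="google/gemini-1.0-pro",
                                     messages=[{"role": "user",
                                                "content": "Write a long essay about calculus."}],
                                     temperature=0.7,
                                     max_tokens=4096,  # raise this if the ValueError above appears
                                     )
print(engine.generate_response())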
Project details
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
api4all-0.3.1.tar.gz (18.2 kB)
Built Distribution
api4all-0.3.1-py3-none-any.whl (18.6 kB)
File details
Details for the file api4all-0.3.1.tar.gz.
File metadata
- Download URL: api4all-0.3.1.tar.gz
- Upload date:
- Size: 18.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.0.0 CPython/3.11.1
File hashes
Algorithm | Hash digest |
---|---|
SHA256 | 922a05c8a2ee10f90742a1e15f789b5be56f53dd63beef6e6ebe6bfffd557524 |
MD5 | 07049ddb77455e6ff9112750db2b40e8 |
BLAKE2b-256 | 39ad410d4b8935a2e1c8f4bfe4a3652d76290edadd6f481957d65a288700f122 |
File details
Details for the file api4all-0.3.1-py3-none-any.whl.
File metadata
- Download URL: api4all-0.3.1-py3-none-any.whl
- Upload date:
- Size: 18.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.0.0 CPython/3.11.1
File hashes
Algorithm | Hash digest |
---|---|
SHA256 | 3af1612ee80a04eb4c39c704fd7a0aff5baac8cf313b9567f1b892969ef32c39 |
MD5 | 022308a28e88e19b49daa9b9c6401426 |
BLAKE2b-256 | b19f12c1657200f3dc74d8bed7bb0a59c7c703e9e02cb31c74a0f75b40cc41bb |