One master key for all LLM/GenAI endpoints
Project description
LlaMaKey: one master key for accessing all cloud LLM/GenAI APIs
Introducing LlaMa(ster)Key, a simplified and secure way to manage API keys and control access to various cloud LLM/GenAI APIs for multiple users. LlaMaKey lets a user access multiple cloud AI APIs through a single, user-unique master key, instead of a separate API key for each platform. As a proxy, it eases key management for both users and administrators by consolidating the keys distributed to a user into just one, while protecting the actual API keys by hiding them from users. Major cloud AI APIs (OpenAI, Cohere, AnyScale, etc.) can be called seamlessly through their official Python SDKs without any code changes. In addition, administrators can control individual users in detail through rate throttling, API/endpoint whitelisting, budget capping, etc. LlaMaKey is open source under the MIT license and is ready for private, on-premises deployment.
graph TD
subgraph Your team
A[User A] -- Local pass A --> L[LlaMasterKey server]
B[User B] -- Local pass B --> L["LlaMasterKey server<br> (rate throttling, API/endpoint whitelisting, <br> logging, budgeting, etc.)"]
end
L -- Actual OPENAI_API_KEY--> O[OpenAI API server]
L -- Actual COHERE_API_KEY--> C[Cohere API server]
L -- Actual VECTARA_API_KEY--> V[Vectara API server]
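Conceptually, the proxy in the diagram authenticates each user's local pass and swaps it for the actual upstream key before forwarding the request. A minimal Python sketch of that swap (all names and keys here are hypothetical, not LlaMaKey's actual implementation):

```python
# Illustrative sketch of the header rewrite a key proxy performs.
# The real keys live only in the server's environment; users hold
# nothing but their own local pass.
REAL_KEYS = {"openai": "sk-actual-openai-key"}     # hypothetical server-side store
LOCAL_PASSES = {"pass-user-a", "pass-user-b"}      # hypothetical per-user passes


def rewrite_auth(service: str, headers: dict) -> dict:
    """Validate the user's local pass, then substitute the real API key."""
    token = headers.get("Authorization", "").removeprefix("Bearer ")
    if token not in LOCAL_PASSES:
        raise PermissionError("unknown local pass")
    out = dict(headers)
    out["Authorization"] = f"Bearer {REAL_KEYS[service]}"
    return out
```

Revoking one user then amounts to removing their pass from the allow list; the real keys never change.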
The pain and the solution
How do you manage API keys in a team that needs to access an array of cloud LLM/GenAI APIs? If you issue one key per user per API, you have too many keys to manage. But if you share one key per API across the team, it is too risky. What if a careless intern accidentally pushes it to a public GitHub repo?
This is where LlaMaKey comes into play. It sits as a proxy between your users and the actual cloud AI APIs. Only one key is needed to authenticate a team member's code to your LlaMaKey server. If a user misbehaves, just revoke that one key to cut off their access without interrupting anyone else.
A user does not need to change a single line of code to use LlaMaKey. LlaMaKey takes advantage of a feature in the official Python SDKs of most cloud LLM/GenAI APIs: each has a base URL that is configurable via an environment variable:
- OPENAI_BASE_URL for OpenAI
- CO_API_URL for Cohere
- ANYSCALE_BASE_URL for AnyScale
So the user only needs to set the respective base URL to the LlaMaKey server. Each request is then made first to the LlaMaKey server, which forwards it to the real cloud LLM/GenAI endpoint.
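To see why no client code changes are needed: the SDKs compose request URLs from a configurable base. A hedged illustration (the URL shape is simplified and not the SDK's exact internals):

```python
import os

# The OpenAI SDK reads OPENAI_BASE_URL from the environment and falls
# back to the official endpoint when it is unset.
os.environ["OPENAI_BASE_URL"] = "http://127.0.0.1:8000/openai"

base = os.environ.get("OPENAI_BASE_URL", "https://api.openai.com/v1")
url = f"{base}/chat/completions"  # the request now targets the LlaMaKey server
```

The same code, run without the environment variable, would target the vendor's endpoint directly.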
Roadmap
- Currently, authentication with the LlaMaKey server is not enabled. If you want us to support it, please open an issue on GitHub. We will see it as a demand and prioritize it accordingly.
- Supported APIs:
- OpenAI (all endpoints)
- Cohere (all endpoints)
- AnyScale
- HuggingFace Inference API (free tier)
- HuggingFace EndPoint API
- Anthropic
- Google Vertex AI
- Vectara AI
Installation
Stable version:
pip install LlaMasterKey
Nightly version:
You can manually install the nightly version at:
https://github.com/TexteaInc/LlaMasterKey/releases/tag/nightly
Build from source
Requirements:
- git
- Rust toolchain: https://www.rust-lang.org/tools/install
git clone git@github.com:TexteaInc/LlaMasterKey.git
# you can switch to a different branch:
# git switch dev
cargo build --release
# available at ./target/release/lmk
Usage
On the server end, set the actual API keys as environment variables per their respective APIs, then start your LlaMaKey server, for example:
export OPENAI_API_KEY=sk-xxx # an actual openai key
lmk # start the server
The server will read the keys of supported LLM/GenAI APIs from the OS environment variables and start a server at http://localhost:8000 (8000 is the default port of FastAPI). It will generate the shell commands to set the needed environment variables on your client end, like this:
export OPENAI_BASE_URL="http://127.0.0.1:8000/openai" # direct OpenAI calls to the LlaMaKey server
export OPENAI_API_KEY="LlaMaKey" # a placeholder master key
For your convenience, the commands are also dumped to the file ./llamakey_local.env.
On the client end, activate the environment variables generated above before running your code. You can copy and paste the commands above, or simply source the llamakey_local.env file generated in the previous step, for example:
# Step 1: activate the environment variables that direct API calls to the LlaMaKey server
source llamakey_local.env # this is only one of many ways to do it.
# Step 2: call OpenAI as usual using its official Python SDK
python3 -c '
from openai import OpenAI

client = OpenAI()
print(
    client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": "What is FastAPI?"}],
    )
)'
License
LlaMaKey is released under the MIT license.
Contact
For usage, bugs, or feature requests, please open an issue on GitHub. For private inquiries, please email hello@funix.io.
Project details
Download files
Built Distributions
Hashes for LlaMasterKey-0.1.1-py3-none-win_amd64.whl
Algorithm | Hash digest
---|---
SHA256 | 75b849cd22b1bba99166fcc8e3d9d70e9e62fbbe1d8ad259429e2b2b7e276914
MD5 | db018a9df65c8bd70278d45c84eb6b87
BLAKE2b-256 | a16fa61a52e2eaa57c73d502a7681544934799c6a642712ba9ae24fac094b841
Hashes for LlaMasterKey-0.1.1-py3-none-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | dc84d8af0f793058ebd90a24463016e2c2f9fdb63ddf6487b80c25fb19d00fae
MD5 | 6b12835b986827a60b18d494cade6682
BLAKE2b-256 | 24f743d74f9b8d4869180089ad412e768a95dcb44d3d9c5ce9fae289e8d5c5df
Hashes for LlaMasterKey-0.1.1-py3-none-macosx_10_12_x86_64.macosx_11_0_arm64.macosx_10_12_universal2.whl
Algorithm | Hash digest
---|---
SHA256 | 4ecac5f15ed536ba8f1e9842e365fa5e215b2eb2dac36f4e6e336171c1983335
MD5 | 77590c88c462ce8ec5b7479a286b0fbe
BLAKE2b-256 | 0ecd27fb828bcf385ddeb6a166f31f131196fae9dc624b040cdd624919f8ef20