infinity-llm: Use any LLM API
infinity-llm is a collection of Python tools that make LLM APIs plug-and-play for fast, easy experimentation. I use it in my own projects today, but it's still very much a work in progress. I'm trying to strike a balance between simplicity and modularity, so if you have ideas, feel free to message me or submit a PR!
Key Features
- Chat Completion: Largely a wrapper around jxnl/instructor; supports sync and async chat completions, structured or unstructured, with or without streaming.
- Embeddings/Rerankers: Easily use a slew of embedding and reranking models.
- Asynchronous Workloads: Run all chat completion and embedding workloads massively in parallel without worrying about rate limits. Nice for ETL pipelines.
- OpenAI Batch Jobs: Run large-scale batch jobs with OpenAI's Batch API.
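This page doesn't show infinity-llm's own parallel-execution helpers, so here is a generic sketch of the underlying pattern: fan out many awaitable calls with `asyncio.gather` while capping concurrency with a semaphore to stay under provider rate limits. The coroutines below are stand-ins; in practice each would be a chat-completion or embedding call made with an async client.

```python
import asyncio

async def run_all(tasks, max_concurrency=8):
    # Cap in-flight requests so a burst of completions or
    # embeddings doesn't trip provider rate limits.
    sem = asyncio.Semaphore(max_concurrency)

    async def _guarded(coro):
        async with sem:
            return await coro

    # gather preserves input order in its results
    return await asyncio.gather(*(_guarded(t) for t in tasks))

# Demo with stand-in coroutines instead of real API calls:
async def fake_completion(i):
    await asyncio.sleep(0)  # pretend this awaits a provider API
    return f"response-{i}"

results = asyncio.run(run_all([fake_completion(i) for i in range(5)]))
print(results)
# > ['response-0', 'response-1', 'response-2', 'response-3', 'response-4']
```

This is the standard shape for rate-limit-friendly ETL fan-out; a production version would also add retries with backoff.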
Chat Completion
All types of chat completions are made easy!
- Make a client:

```python
from infinity_llm import from_any, Provider

client = from_any(
    provider=Provider.OPENAI,
    model_name="gpt-4o",
    async_client=False,
)
```
- Choose between a/sync, un/structured responses, with or without streaming:

```python
# Synchronous, unstructured response without streaming
# (aka a standard chat completion)
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You are a friend."},
        {"role": "user", "content": "Tell me about your day."},
    ],
    response_model=None,  # No response model means an unstructured response
)
```
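Since the chat client wraps jxnl/instructor, a structured response should work the same way instructor's API does: pass a Pydantic model as `response_model` and get back a validated instance. A sketch, where `DayRecap` and `get_day_recap` are made-up example names:

```python
from pydantic import BaseModel

class DayRecap(BaseModel):
    mood: str
    highlights: list[str]

def get_day_recap(client) -> DayRecap:
    # With response_model set, the wrapped instructor client parses
    # and validates the model's reply into a DayRecap instance.
    return client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": "You are a friend."},
            {"role": "user", "content": "Tell me about your day."},
        ],
        response_model=DayRecap,
    )

# The result is a plain Pydantic object, e.g.:
recap = DayRecap(mood="cheerful", highlights=["coffee", "a long walk"])
print(recap.mood)
# > cheerful
```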
Embeddings/Rerankers
```python
from infinity_llm import Provider, embed_from_any

# Create a Cohere embedding client
client = embed_from_any(Provider.COHERE)

# Example text to embed
text = "This is an example sentence to embed using Cohere."

# Get the embedding
embeddings, total_tokens = client.create(
    input=text, model="embed-english-v3.0", input_type="clustering"
)

print(f"Number of embeddings: {len(embeddings)}")
# > Number of embeddings: 1
print(f"Embedding dimension: {len(embeddings[0])}")
# > Embedding dimension: 1024
print(f"Usage: {total_tokens}")
# > Usage: 13
```
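Once you have embedding vectors, a common next step is comparing them with cosine similarity. This helper is not part of infinity-llm; it's a generic, dependency-free sketch that works on any equal-length vectors, such as those returned above:

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity = dot(a, b) / (||a|| * ||b||)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # identical directions
# > 1.0
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # orthogonal vectors
# > 0.0
```

For real workloads you'd typically vectorize this with numpy, but the math is the same.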
Project details
Download files
Source Distribution
infinity_llm-0.1.0.tar.gz (20.2 kB)
Built Distribution
infinity_llm-0.1.0-py3-none-any.whl (28.6 kB)
File details
Details for the file infinity_llm-0.1.0.tar.gz.
File metadata
- Download URL: infinity_llm-0.1.0.tar.gz
- Upload date:
- Size: 20.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.11.4
File hashes
Algorithm | Hash digest
---|---
SHA256 | 560f920298c775b63bb422706f5f4d970185ebb874878688927d4af07fed8bc0
MD5 | 12d704ef68cd084b8485ad85d5619cf0
BLAKE2b-256 | f4d18ba3ca076e512324e662de6927938740f2a85c9f41d2f2b90ae2f66fa88f
File details
Details for the file infinity_llm-0.1.0-py3-none-any.whl.
File metadata
- Download URL: infinity_llm-0.1.0-py3-none-any.whl
- Upload date:
- Size: 28.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.11.4
File hashes
Algorithm | Hash digest
---|---
SHA256 | c75616ed42b244098e1f59c3bbb171bf221465c5e8de1efb97a6e3eaeebbeabc
MD5 | e4945ab7b7a7b3dc489c1cc61c0af9a6
BLAKE2b-256 | 9a7e8c3020ef32ff9f2a56dc3d9ce3069b700ebfee7b6852b83004524f967005