Parallel inference calls to LLM APIs using Polars dataframes
Polar Llama
Overview
Polar Llama is a Python library for making parallel inference calls to LLM APIs, such as the ChatGPT API, from Polars dataframes. It sends multiple API requests concurrently, which is considerably faster than handling them one at a time.
Key Features
- Parallel Inference: Send multiple inference requests to an LLM API in parallel, without waiting for each individual request to complete.
- Integration with Polars: Uses Polars dataframes to organize and handle requests, leveraging their efficient data processing capabilities.
- Easy to Use: Sends queries and retrieves responses through a small set of functions that plug into Polars expressions.
- Multi-Message Support: Create and process conversations with multiple messages in context, supporting complex multi-turn interactions.
- Multiple Provider Support: Works with OpenAI, Anthropic, Gemini, Groq, and AWS Bedrock models, giving you flexibility in your AI infrastructure.
Installation
To install Polar Llama, you can use pip:
pip install polar-llama
Alternatively, for development purposes, you can install from source:
maturin develop
Example Usage
Here's how you can use Polar Llama to send multiple inference requests in parallel:
import polars as pl
from polar_llama import string_to_message, inference_async, Provider
import dotenv

dotenv.load_dotenv()

# Example questions
questions = [
    'What is the capital of France?',
    'What is the difference between polars and pandas?'
]

# Creating a dataframe with questions
df = pl.DataFrame({'Questions': questions})

# Adding prompts to the dataframe
df = df.with_columns(
    prompt=string_to_message("Questions", message_type='user')
)

# Sending parallel inference requests
df = df.with_columns(
    answer=inference_async('prompt', provider=Provider.OPENAI, model='gpt-4o-mini')
)
Multi-Message Conversations
Polar Llama now supports multi-message conversations, allowing you to maintain context across multiple turns:
import polars as pl
from polar_llama import string_to_message, combine_messages, inference_messages
import dotenv

dotenv.load_dotenv()

# Create a dataframe with system prompts and user questions
df = pl.DataFrame({
    "system_prompt": [
        "You are a helpful assistant.",
        "You are a math expert."
    ],
    "user_question": [
        "What's the weather like today?",
        "Solve x^2 + 5x + 6 = 0"
    ]
})

# Convert to structured messages.
# Polars expressions have no `invoke` method, so the imported helpers are
# called directly, following the calling convention of the first example.
df = df.with_columns(
    system_message=string_to_message("system_prompt", message_type="system"),
    user_message=string_to_message("user_question", message_type="user")
)

# Combine into conversations
df = df.with_columns(
    conversation=combine_messages(pl.col("system_message"), pl.col("user_message"))
)

# Send to model and get responses
df = df.with_columns(
    response=inference_messages(pl.col("conversation"), provider="openai", model="gpt-4")
)
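To make the idea of a multi-message conversation concrete, here is a small sketch of the role/content message list used by OpenAI-style chat APIs. The helper names `make_message` and `combine` are hypothetical, and whether Polar Llama stores messages in exactly this JSON shape internally is an assumption; the sketch only illustrates the concept.

```python
import json

ALLOWED_ROLES = {"system", "user", "assistant"}

def make_message(role: str, content: str) -> dict:
    """Build one chat message (hypothetical helper, not Polar Llama's API)."""
    if role not in ALLOWED_ROLES:
        raise ValueError(f"unknown role: {role!r}")
    return {"role": role, "content": content}

def combine(*messages: dict) -> str:
    """Serialize messages, in order, as a JSON array representing a conversation."""
    return json.dumps(list(messages))

conversation = combine(
    make_message("system", "You are a math expert."),
    make_message("user", "Solve x^2 + 5x + 6 = 0"),
)
print(conversation)
```

The order of the messages matters: the system message sets the context and each later message is read with the earlier ones in view.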
AWS Bedrock Support
Polar Llama now supports AWS Bedrock models. To use Bedrock, ensure you have AWS credentials configured (via AWS CLI, environment variables, or IAM roles):
import polars as pl
from polar_llama import string_to_message, inference_async
import dotenv

dotenv.load_dotenv()

# Example questions
questions = [
    'What is the capital of France?',
    'Explain quantum computing in simple terms.'
]

# Creating a dataframe with questions
df = pl.DataFrame({'Questions': questions})

# Adding prompts to the dataframe
df = df.with_columns(
    prompt=string_to_message("Questions", message_type='user')
)

# Using AWS Bedrock with Claude model
df = df.with_columns(
    answer=inference_async('prompt', provider='bedrock', model='anthropic.claude-3-haiku-20240307-v1:0')
)
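If credentials are supplied through environment variables, a quick check before sending requests can save a failed batch. The sketch below looks for the standard AWS variables `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY`; the helper `missing_aws_env_vars` is hypothetical and not part of Polar Llama. It covers only the environment-variable route, so an empty result is not required when credentials come from the AWS CLI configuration or an IAM role.

```python
import os

# Standard AWS credential variables for the environment-variable route.
REQUIRED_AWS_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")

def missing_aws_env_vars(env=None):
    """Return the required AWS variables that are unset or empty.

    Hypothetical helper; `env` defaults to the process environment.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_AWS_VARS if not env.get(name)]

missing = missing_aws_env_vars()
if missing:
    print("Not set in the environment:", ", ".join(missing))
else:
    print("AWS credential variables found in the environment.")
```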
Benefits
- Speed: Processes multiple queries in parallel, drastically reducing the time required for bulk query handling.
- Scalability: Scales efficiently with the increase in number of queries, ideal for high-demand applications.
- Ease of Integration: Integrates seamlessly into existing Python projects that utilize Polars, making it easy to add parallel processing capabilities.
- Context Preservation: Maintain conversation context with multi-message support for more natural interactions.
- Provider Flexibility: Choose from multiple LLM providers based on your needs and access.
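The speed benefit comes from overlapping the time spent waiting on the network. The sketch below illustrates this with Python's `asyncio` and a simulated request that sleeps instead of calling an API. It is not how Polar Llama is implemented; `fake_llm_call` and the 0.2 second latency are stand-ins chosen for illustration.

```python
import asyncio
import time

async def fake_llm_call(prompt: str, latency: float = 0.2) -> str:
    """Stand-in for a network request: sleeps instead of calling an API."""
    await asyncio.sleep(latency)
    return f"answer to: {prompt}"

async def run_serial(prompts):
    # Each request starts only after the previous one has finished.
    return [await fake_llm_call(p) for p in prompts]

async def run_parallel(prompts):
    # All requests are started together and awaited as a group.
    return await asyncio.gather(*(fake_llm_call(p) for p in prompts))

def timed(coro_fn, prompts):
    """Run a coroutine function and return (results, elapsed seconds)."""
    start = time.perf_counter()
    results = asyncio.run(coro_fn(prompts))
    return list(results), time.perf_counter() - start

prompts = [f"question {i}" for i in range(5)]
serial_results, serial_seconds = timed(run_serial, prompts)
parallel_results, parallel_seconds = timed(run_parallel, prompts)
print(f"serial: {serial_seconds:.2f}s, parallel: {parallel_seconds:.2f}s")
```

With five simulated requests, the serial run waits for each latency in turn, while the parallel run waits roughly one latency in total. Real API calls add rate limits and variable response times, so actual gains will differ.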
Contributing
We welcome contributions to Polar Llama! If you're interested in improving the library or adding new features, please feel free to fork the repository and submit a pull request.
License
Polar Llama is released under the MIT license. For more details, see the LICENSE file in the repository.
Roadmap
- Multi-Message Support (implemented): Multi-message conversations that maintain context.
- Multiple Provider Support (implemented): Support for different LLM providers (OpenAI, Anthropic, Gemini, Groq, AWS Bedrock).
- Function Calling (planned): Support for function calls and structured data outputs in inference requests.
- Streaming Responses (planned): Support for streaming responses from LLM providers.
Download files
File details
Details for the file polar_llama-0.1.6.tar.gz.
File metadata
- Download URL: polar_llama-0.1.6.tar.gz
- Upload date:
- Size: 181.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: maturin/1.8.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 296968031123e166e075be0f55e41219f5cbf7944290808e057ba1e58056d1ec |
| MD5 | 5dd2a37d1e76582c5bafc1e0a0acf074 |
| BLAKE2b-256 | 3663704a5d4eba44bcd72bce56a982fb1e5327ac5e256c531aaa6c7e5d1edced |
File details
Details for the file polar_llama-0.1.6-cp38-abi3-win_amd64.whl.
File metadata
- Download URL: polar_llama-0.1.6-cp38-abi3-win_amd64.whl
- Upload date:
- Size: 8.3 MB
- Tags: CPython 3.8+, Windows x86-64
- Uploaded using Trusted Publishing? Yes
- Uploaded via: maturin/1.8.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | dd583b8e12aa4f156931ee6f1546c676a0c78ebf6fa166f4813a5fead4588479 |
| MD5 | 7d92182ca4785c74593d28ecc5bb7238 |
| BLAKE2b-256 | ffae602c5b9652963ddd7a98c48bdc29259b5d035b8872c595a0f6e22116c5ce |
File details
Details for the file polar_llama-0.1.6-cp38-abi3-manylinux_2_39_x86_64.whl.
File metadata
- Download URL: polar_llama-0.1.6-cp38-abi3-manylinux_2_39_x86_64.whl
- Upload date:
- Size: 10.5 MB
- Tags: CPython 3.8+, manylinux: glibc 2.39+ x86-64
- Uploaded using Trusted Publishing? Yes
- Uploaded via: maturin/1.8.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 9c6be52ffff7d9d3d8c30b7b666c82fa53bdbf02681eca6d4f535c72bf860476 |
| MD5 | c042ab4036025090e2281ec0d5311827 |
| BLAKE2b-256 | 10a1bbedc2caba5718dfbce5a536c59bd2130cc96f646558cf0e96f442d2b7a9 |
File details
Details for the file polar_llama-0.1.6-cp38-abi3-macosx_11_0_arm64.whl.
File metadata
- Download URL: polar_llama-0.1.6-cp38-abi3-macosx_11_0_arm64.whl
- Upload date:
- Size: 8.8 MB
- Tags: CPython 3.8+, macOS 11.0+ ARM64
- Uploaded using Trusted Publishing? Yes
- Uploaded via: maturin/1.8.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 1ea72533dfbd4416bc73836c2b1da6a5123693345e558ee09f314ed902a92c5e |
| MD5 | b0777e018f0bd135049dbeaf70cb256a |
| BLAKE2b-256 | 113cbc980bda30b3b291de469e7f899021709b48c9b8cc1e337c77068911c6ea |
File details
Details for the file polar_llama-0.1.6-cp38-abi3-macosx_10_12_x86_64.whl.
File metadata
- Download URL: polar_llama-0.1.6-cp38-abi3-macosx_10_12_x86_64.whl
- Upload date:
- Size: 9.3 MB
- Tags: CPython 3.8+, macOS 10.12+ x86-64
- Uploaded using Trusted Publishing? Yes
- Uploaded via: maturin/1.8.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 9e1433f76afbdd39c9f3c5049e9d90281a42b422c34b22b9da7bb01311c14f70 |
| MD5 | fda788303d628fc1d3e1b88fa3394c06 |
| BLAKE2b-256 | 21d3267466b71495a0d41727f180633fc063e532d225c0382f3314938062be9e |