A chat completion client extension using Llama CPP, integrating with AutoGen for AI-powered chat interactions.
Project description
AutoGen Llama-CPP Chat Completion Extension
This extension provides a Chat Completion Client using the Llama-CPP model. It integrates with the AutoGen ecosystem, enabling AI-powered chat completion with access to external tools.
Installation
To install the autogen-llama-cpp-chat-completion extension, run the following command:
pip install autogen-llama-cpp-chat-completion
Dependencies
- autogen-core>=0.4,<0.5
- pydantic
- llama-cpp
Usage
Once installed, you can integrate this extension into your AutoGen system for chat-based completions using the Llama-CPP model.
Example Usage
Here’s an example of how you can use the extension to create a chat session with Llama-CPP:
import asyncio

from autogen_llama_cpp_chat_completion.llama_cpp_extension import LlamaCppChatCompletionClient
from autogen_core.models import SystemMessage, UserMessage


async def main() -> None:
    # Initialize the LlamaCpp client
    client = LlamaCppChatCompletionClient(
        repo_id="your_repo_id",
        filename="path_to_model_file",
        n_gpu_layers=-1,
        seed=1337,
        n_ctx=1000,
        verbose=True,
    )

    # Create chat messages (UserMessage requires a source field in autogen-core)
    messages = [
        SystemMessage(content="You are a helpful assistant."),
        UserMessage(content="What is the capital of France?", source="user"),
    ]

    # Get a response from the model; create() is a coroutine
    result = await client.create(messages)

    # Print the result
    print(result.content)


asyncio.run(main())
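The object returned by create is an autogen-core CreateResult; besides content it carries a finish reason and token-usage accounting, assuming this client populates those fields:

# Inspect metadata on the CreateResult (fields defined by autogen-core)
print(result.finish_reason)  # e.g. "stop"
print(result.usage)          # prompt and completion token counts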
Phi-4 Model Example
You can also use the phi-4 model for chat completions. Here's an example of how to integrate it with the extension:
from autogen_llama_cpp_chat_completion.llama_cpp_extension import LlamaCppChatCompletionClient
from autogen_core.models import SystemMessage, UserMessage

# Initialize the Phi-4 LlamaCpp client
model_client = LlamaCppChatCompletionClient(
    repo_id="unsloth/phi-4-GGUF",
    filename="phi-4-Q2_K_L.gguf",
    n_gpu_layers=-1,
    seed=1337,
    n_ctx=16384,
    verbose=False,
)

# Create chat messages
messages = [
    SystemMessage(content="You are an assistant with the Phi-4 model."),
    UserMessage(content="What is the latest breakthrough in AI research?", source="user"),
]

# Get a response from the model (run inside an async function, as in the first example)
result = await model_client.create(messages)

# Print the result
print(result.content)
This example uses the Phi-4 model from the "unsloth/phi-4-GGUF" repository with a larger context window (n_ctx=16384).
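Because n_ctx bounds the prompt and the completion together, it can be useful to check how much of the window a conversation occupies. The autogen-core ChatCompletionClient protocol defines count_tokens and remaining_tokens for this; a minimal sketch, assuming this client implements them:

# Check context-window headroom before sending
used = model_client.count_tokens(messages)
left = model_client.remaining_tokens(messages)
print(f"prompt tokens: {used}, tokens left in window: {left}")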
Streaming Mode
You can also use streaming mode to generate responses incrementally:
response_generator = client.create_stream(messages)

# Iterate through the response stream
async for token in response_generator:
    print(token)
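In the autogen-core streaming protocol, create_stream yields string chunks and finishes with a final CreateResult. Assuming this client follows that convention, a minimal sketch that separates the two:

from autogen_core.models import CreateResult

chunks = []
async for item in client.create_stream(messages):
    if isinstance(item, str):
        chunks.append(item)  # incremental token text
        print(item, end="", flush=True)
    elif isinstance(item, CreateResult):
        print("\nfinish reason:", item.finish_reason)
print("".join(chunks))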
Configuration
When initializing the LlamaCppChatCompletionClient, you can provide the following configuration parameters:
- repo_id: The repository ID for the model you want to use.
- filename: The path to the model file.
- n_gpu_layers: The number of GPU layers to offload (default is -1).
- seed: The random seed to use for initialization (default is 1337).
- n_ctx: The context window size (default is 1000).
- verbose: Whether to print debug information (default is True).
Tools Integration
You can dynamically register tools that the model can use during interaction. If a message invokes a tool, it will be detected, and the corresponding tool will be executed.
Tools should be passed as part of the tools argument when calling the create or create_stream methods.
Example Tool Usage
If you have a tool, such as a request validation tool, you can register it and the model will use it when needed. For example, wrapping a plain function with autogen-core's FunctionTool:
from autogen_core.tools import FunctionTool

def validate_request(data: str) -> bool:
    """Validates request data."""
    return bool(data)

tools = [FunctionTool(validate_request, description="Validates request data")]
result = await client.create(messages, tools=tools)
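The same tools list can also be passed to create_stream to combine tool use with streaming:

# Stream a response with the tool available to the model
async for chunk in client.create_stream(messages, tools=tools):
    print(chunk)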
Running Tests
To ensure that everything is working correctly, you can run tests with pytest:
pytest
Contributing
If you’d like to contribute to this extension, feel free to open an issue or submit a pull request.
License
This extension is open source and available under the MIT License. See the LICENSE file for more information.