
AutoGen Llama-CPP Chat Completion Extension

This extension provides a chat completion client backed by llama.cpp models. It integrates with the AutoGen ecosystem, enabling AI-powered chat completion with access to external tools.

Installation

To install the autogen-llama-cpp-chat-completion extension, run the following command:

pip install autogen-llama-cpp-chat-completion

Dependencies

  • autogen-core>=0.4,<0.5
  • pydantic
  • llama-cpp-python

Usage

Once installed, you can integrate this extension into your AutoGen system for chat-based completions using a llama.cpp model.

Example Usage

Here’s an example of how you can use the extension to create a chat session with Llama-CPP:

from autogen_llama_cpp_chat_completion.llama_cpp_extension import LlamaCppChatCompletionClient
from autogen_core.models import SystemMessage, UserMessage

# Initialize the LlamaCpp client
client = LlamaCppChatCompletionClient(
    repo_id="your_repo_id", 
    filename="path_to_model_file", 
    n_gpu_layers=-1, 
    seed=1337, 
    n_ctx=1000, 
    verbose=True
)

# Create chat messages
messages = [
    SystemMessage(content="You are a helpful assistant."),
    UserMessage(content="What is the capital of France?")
]

# Get a response from the model
result = await client.create(messages)

# Print the result
print(result.content)
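
Note that these snippets use await at the top level, which works in an interactive session such as Jupyter. In a plain Python script, wrap the calls in a coroutine and run it with asyncio.run, for example (a minimal sketch reusing the client and messages defined above):

import asyncio

async def main() -> None:
    # `client` and `messages` are the objects built in the example above.
    result = await client.create(messages)
    print(result.content)

asyncio.run(main())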

Phi-4 Model Example

You can also use the Phi-4 model for chat completions. Here's an example of how to integrate it with the extension:

from autogen_llama_cpp_chat_completion.llama_cpp_extension import LlamaCppChatCompletionClient
from autogen_core.models import SystemMessage, UserMessage

# Initialize the Phi-4 LlamaCpp client
model_client = LlamaCppChatCompletionClient(
    repo_id="unsloth/phi-4-GGUF",
    filename="phi-4-Q2_K_L.gguf",
    n_gpu_layers=-1,
    seed=1337,
    n_ctx=16384,
    verbose=False,
)

# Create chat messages
messages = [
    SystemMessage(content="You are an assistant with the Phi-4 model."),
    UserMessage(content="What is the latest breakthrough in AI research?")
]

# Get a response from the model
result = await model_client.create(messages)

# Print the result
print(result.content)

This example uses the Phi-4 model from the "unsloth/phi-4-GGUF" repository with a larger context window (n_ctx=16384).

Streaming Mode

You can also use streaming mode to generate responses incrementally:

response_generator = client.create_stream(messages)

# Iterate through the response stream, printing tokens as they arrive
async for token in response_generator:
    print(token, end="", flush=True)
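
To keep the full reply, accumulate the streamed chunks as they arrive (a sketch that assumes, as in the example above, that create_stream yields text chunks):

# Collect the streamed chunks into a single reply string.
# Reuses `client` and `messages` from the earlier examples.
chunks = []
async for token in client.create_stream(messages):
    chunks.append(token)

reply = "".join(chunks)
print(reply)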

Configuration

When initializing the LlamaCppChatCompletionClient, you can provide the following configuration parameters (a CPU-only example follows the list):

  • repo_id: The Hugging Face repository ID of the model to use (e.g. "unsloth/phi-4-GGUF").
  • filename: The name of the GGUF model file within that repository.
  • n_gpu_layers: The number of model layers to offload to the GPU; -1 offloads all layers (default: -1).
  • seed: The random seed used for generation (default: 1337).
  • n_ctx: The context window size in tokens (default: 1000).
  • verbose: Whether to print llama.cpp debug output (default: True).
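
For example, a CPU-only setup might look like this (a sketch that uses only the documented parameters; the repository and file names are placeholders):

# CPU-only configuration (sketch): n_gpu_layers=0 keeps every layer on the CPU.
client = LlamaCppChatCompletionClient(
    repo_id="your_repo_id",   # placeholder
    filename="model.gguf",    # placeholder
    n_gpu_layers=0,
    seed=1337,
    n_ctx=1000,
    verbose=True,
)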

Tools Integration

You can dynamically register tools for the model to use during an interaction. If the model's response invokes a tool, the call is detected and the corresponding tool is executed.

Tools should be passed as part of the tools argument when calling the create or create_stream methods.

Example Tool Usage

If you have a tool, such as a request-validation tool, you can register it and the model will call it when needed. With autogen-core, a plain Python function can be wrapped as a tool using FunctionTool:

from autogen_core.tools import FunctionTool

def validate_request(data: str) -> bool:
    """Validates request data (toy example)."""
    return len(data) > 0

tools = [FunctionTool(validate_request, description="Validates request data")]

result = await client.create(messages, tools=tools)
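
In general, a model decides whether to invoke a tool based on the tool's name and description, so keep both accurate and specific.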

Running Tests

To ensure that everything is working correctly, you can run tests with pytest:

pytest
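
If you are adding your own tests, a minimal smoke test might look like this (a sketch that only checks the package imports):

# tests/test_import.py: a minimal smoke test (sketch)
def test_import():
    from autogen_llama_cpp_chat_completion.llama_cpp_extension import (
        LlamaCppChatCompletionClient,
    )

    assert LlamaCppChatCompletionClient is not None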

Contributing

If you’d like to contribute to this extension, feel free to open an issue or submit a pull request.

License

This extension is open source and available under the MIT License. See the LICENSE file for more information.
