
AutoGen Llama-CPP Chat Completion Extension

This extension provides a chat completion client backed by Llama-CPP (llama.cpp). It integrates with the AutoGen ecosystem, enabling AI-powered chat completions with access to external tools.

Installation

To install the autogen-llama-cpp-chat-completion extension, run the following command:

pip install autogen-llama-cpp-chat-completion
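The llama-cpp-python dependency compiles llama.cpp at install time. If you plan to offload layers to a GPU (see n_gpu_layers below), you may need to reinstall that dependency with the appropriate build flags; for example, for CUDA:

CMAKE_ARGS="-DGGML_CUDA=on" pip install --force-reinstall --no-cache-dir llama-cpp-python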

Dependencies

  • autogen-core>=0.4,<0.5
  • pydantic
  • llama-cpp-python

Usage

Once installed, you can integrate this extension into your AutoGen system for chat-based completions using the Llama-CPP model.

Example Usage

Here’s an example of how you can use the extension to create a chat session with Llama-CPP:

import asyncio

from autogen_llama_cpp_chat_completion.llama_cpp_extension import LlamaCppChatCompletionClient
from autogen_core.models import SystemMessage, UserMessage

async def main() -> None:
    # Initialize the LlamaCpp client
    client = LlamaCppChatCompletionClient(
        repo_id="your_repo_id",
        filename="path_to_model_file",
        n_gpu_layers=-1,
        seed=1337,
        n_ctx=1000,
        verbose=True,
    )

    # Create chat messages (UserMessage requires a source identifying the sender)
    messages = [
        SystemMessage(content="You are a helpful assistant."),
        UserMessage(content="What is the capital of France?", source="user"),
    ]

    # Get a response from the model
    result = await client.create(messages)

    # Print the result
    print(result.content)

asyncio.run(main())
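Assuming the client wraps llama-cpp-python's Llama.from_pretrained (which the repo_id/filename pair suggests), the model file is downloaded from the Hugging Face Hub on the first run and cached locally for subsequent runs.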

Phi-4 Model Example

You can also use the Phi-4 model for chat completions. Here's an example of how to integrate it with the extension:

from autogen_llama_cpp_chat_completion.llama_cpp_extension import LlamaCppChatCompletionClient
from autogen_core.models import SystemMessage, UserMessage

# As in the previous example, run this inside an async function.

# Initialize the Phi-4 LlamaCpp client
model_client = LlamaCppChatCompletionClient(
    repo_id="unsloth/phi-4-GGUF",
    filename="phi-4-Q2_K_L.gguf",
    n_gpu_layers=-1,
    seed=1337,
    n_ctx=16384,
    verbose=False,
)

# Create chat messages
messages = [
    SystemMessage(content="You are an assistant with the Phi-4 model."),
    UserMessage(content="What is the latest breakthrough in AI research?", source="user"),
]

# Get a response from the model
result = await model_client.create(messages)

# Print the result
print(result.content)

This example demonstrates the Phi-4 model with a larger context window (n_ctx=16384), loading the model file from the "unsloth/phi-4-GGUF" repository.

Streaming Mode

You can also use streaming mode to generate responses incrementally:

# As above, run this inside an async function.
response_generator = client.create_stream(messages)

# Iterate through the response stream. String items are incremental tokens;
# per the autogen-core client protocol, the final item may be a CreateResult
# carrying the complete response.
async for token in response_generator:
    if isinstance(token, str):
        print(token, end="", flush=True)

Configuration

When initializing the LlamaCppChatCompletionClient, you can provide the following configuration parameters (a minimal example follows the list):

  • repo_id: The Hugging Face repository ID to download the model from (e.g. "unsloth/phi-4-GGUF").
  • filename: The name of the GGUF model file within that repository.
  • n_gpu_layers: The number of model layers to offload to the GPU; -1 offloads all layers (default is -1).
  • seed: The random seed used for model initialization (default is 1337).
  • n_ctx: The context window size in tokens (default is 1000).
  • verbose: Whether to print debug information (default is True).
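As a minimal sketch, a CPU-only configuration might look like this (the seed and context size here are illustrative; repo_id and filename must name a real GGUF model):

client = LlamaCppChatCompletionClient(
    repo_id="unsloth/phi-4-GGUF",  # Hugging Face repo hosting the GGUF file
    filename="phi-4-Q2_K_L.gguf",  # model file within that repo
    n_gpu_layers=0,                # keep every layer on the CPU
    seed=42,                       # illustrative seed
    n_ctx=4096,                    # 4096-token context window
    verbose=False,
)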

Tools Integration

You can register tools for the model to use during a chat session. If the model's response invokes a tool, the call is detected and the corresponding tool is executed.

Tools should be passed as part of the tools argument when calling the create or create_stream methods.

Example Tool Usage

If you have a tool, such as a request validation tool, you can register it, and the model will invoke it when needed. One way to define such a tool is with autogen-core's FunctionTool wrapper, which turns a plain function into a tool:

from autogen_core.tools import FunctionTool

def validate_request(request: str) -> bool:
    """Validates request data."""
    return True

tools = [FunctionTool(validate_request, description="Validates request data")]

result = await client.create(messages, tools=tools)
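FunctionTool builds the tool's parameter schema from the wrapped function's type annotations, so precise signatures and docstrings help the model choose and call tools correctly.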

Running Tests

To ensure that everything is working correctly, you can run tests with pytest:

pytest

Contributing

If you’d like to contribute to this extension, feel free to open an issue or submit a pull request.

License

This extension is open source and available under the MIT License. See the LICENSE file for more information.
