Package with templates using the LLama-CPP library LLM text completion

These details have not been verified by PyPI

Project links

Project description

sinapsis-llama-cpp

Package with support for the llama-cpp library to handle text processing

🐍 Installation • 🚀 Features • 📚 Usage example • 🌐 Webapps 📙 Documentation • 🔍 License

The sinapsis-llama-cpp module provides a suite of templates to run LLMs with llama-cpp.

[!IMPORTANT] We now include support for Llama4 models!

To use them, install the dependency (if you have not installed sinapsis-llama-cpp[all])

  uv pip install sinapsis-llama-cpp[llama-four] --extra-index-url https://pypi.sinapsis.tech

You need a HuggingFace token. See the official instructions and set it using

  export HF_TOKEN=<token-provided-by-hf>

and test it through the cli or the webapp by changing the AGENT_CONFIG_PATH

[!NOTE] Llama 4 requires large GPUs to run the models. Nonetheless, running on smaller consumer-grade GPUs is possible, although a single inference may take hours

🐍 Installation

Install using your package manager of choice. We encourage the use of uv

Example with uv:

  uv pip install sinapsis-llama-cpp --extra-index-url https://pypi.sinapsis.tech

or with raw pip:

  pip install sinapsis-llama-cpp --extra-index-url https://pypi.sinapsis.tech

[!IMPORTANT] Templates may require extra dependencies. For development, we recommend installing the package with all the optional dependencies:

with uv:

  uv pip install sinapsis-llama-cpp[all] --extra-index-url https://pypi.sinapsis.tech

or with raw pip:

  pip install sinapsis-llama-cpp[all] --extra-index-url https://pypi.sinapsis.tech

🚀 Features

* LLaMATextCompletion: Configures and initializes a chat completion model, supporting LLaMA, Mistral, and other compatible models.

🌍 General Attributes

These attributes apply to `LLaMATextCompletion`` :

llm_model_name(Required): Name of the LLM to use.
llm_model_file(Required): File path to the LLM.
n_ctx(Required): Maximum context size.
role: Role in the conversation (system, user, or assistant, default: assistant)
system_prompt (Optional): Defines the personality of the LLM (e.g., you are a python expert)
prompt: Custom instructions to guide the LLM response (default: empty).
chat_format: Chat message format (llama-2, chatml, etc., default: chatml).
context_max_len: Maximum conversation context length (default: 6).
pattern: Regex pattern to match delimiters (default: handles <|...|> and </...>).
keep_before: Determines which part of the matched text to return (default: True)
max_tokens: Maximum number of tokens to generate (default: 256).
temperature: Sampling temperature, controlling randomness (default: 0.5).
n_threads: Number of CPU threads to use (default: 4).
n_gpu_layers: Number of LLM layers offloaded to GPU (-1 for all layers, default: 0).

> [!IMPORTANT] > We now include support for Llama4 models!

To use them, install the dependency (if you have not installed sinapsis-llama-cpp[all])

  uv pip install sinapsis-llama-cpp[llama-four] --extra-index-url https://pypi.sinapsis.tech

and test it through the cli or the webapp by changing the AGENT_CONFIG_PATH

[!TIP] Use CLI command sinapsis info --all-template-names to show a list with all the available Template names installed with Sinapsis Data Tools.

[!TIP] Use CLI command sinapsis info --example-template-config TEMPLATE_NAME to produce an example Agent config for the Template specified in TEMPLATE_NAME.

For example, for LlaMATextCompletion use sinapsis info --example-template-config LlaMATextCompletion to produce the following example config:

agent:
  name: my_first_chatbot
  description: Agent with a template to pass a text through a LLM and return a response
templates:
- template_name: InputTemplate
  class_name: InputTemplate
  attributes: {}
- template_name: LLaMATextCompletion
  class_name: LLaMATextCompletion
  template_input: InputTemplate
  attributes:
    llm_model_name: 'bartowski/DeepSeek-R1-Distill-Qwen-7B-GGUF'
    llm_model_file: 'DeepSeek-R1-Distill-Qwen-7B-Q5_K_S.gguf'
    n_ctx: 9000
    max_tokens: 10000
    role: assistant
    system_prompt: 'You are an AI expert'
    chat_format: chatml
    context_max_len: 6
    pattern: null
    keep_before: true
    temperature: 0.5
    n_threads: 4
    n_gpu_layers: 8

📚 Usage example

The following agent passes a text message through a TextPacket and retrieves a response from a LLM

Config

agent:
  name: chat_completion
  description: Agent with a chatbot that makes a call to the LLM model using a context uploaded from a file

templates:
- template_name: InputTemplate
  class_name: InputTemplate
  attributes: { }

- template_name: TextInput
  class_name: TextInput
  template_input: InputTemplate
  attributes:
    text: what is AI?
- template_name: LLaMATextCompletion
  class_name: LLaMATextCompletion
  template_input: TextInput
  attributes:
    llm_model_name: bartowski/DeepSeek-R1-Distill-Qwen-7B-GGUF
    llm_model_file: DeepSeek-R1-Distill-Qwen-7B-Q5_K_S.gguf
    n_ctx: 9000
    max_tokens: 10000
    temperature: 0.7
    n_threads: 8
    n_gpu_layers: 29
    chat_format: chatml
    system_prompt : "You are a python and AI agents expert and you provided reasoning behind every answer you give."
    keep_before: True

🌐 Webapps

This module includes a webapp to interact with the model

[!IMPORTANT] To run the app you first need to clone this repository:

git clone git@github.com:Sinapsis-ai/sinapsis-chatbots.git
cd sinapsis-chatbots

[!NOTE] If you'd like to enable external app sharing in Gradio, export GRADIO_SHARE_APP=True

[!IMPORTANT] You can change the model name and the number of gpu_layers used by the model in case you have an Out of Memory (OOM) error

🐳 Docker

IMPORTANT This docker image depends on the sinapsis-nvidia:base image. Please refer to the official sinapsis instructions to Build with Docker.

Build the sinapsis-chatbots image:

docker compose -f docker/compose.yaml build

Start the container

docker compose -f docker/compose_apps.yaml up sinapsis-simple-chatbot -d

Check the status:

docker logs -f sinapsis-simple-chatbot

The logs will display the URL to access the webapp, e.g.,:

Running on local URL:  http://127.0.0.1:7860

NOTE: The url may be different, check the logs 4. To stop the app:

docker compose -f docker/compose_apps.yaml down

To use a different chatbot configuration (e.g. OpenAI-based chat), update the AGENT_CONFIG_PATH environmental variable to point to the desired YAML file.

For example, to use OpenAI chat:

environment:
 AGENT_CONFIG_PATH: webapps/configs/openai_simple_chat.yaml
 OPENAI_API_KEY: your_api_key

💻 UV

Export the environment variable to install the python bindings for llama-cpp

export CMAKE_ARGS="-DGGML_CUDA=on"
export FORCE_CMAKE="1"

export CUDACXX:

export CUDACXX=$(command -v nvcc)

Create the virtual environment and sync dependencies:

uv sync --frozen

Install the wheel:

uv pip install sinapsis-chatbots[all] --extra-index-url https://pypi.sinapsis.tech

Run the webapp:

uv run webapps/llama_cpp_simple_chatbot.py

NOTE: To use OpenAI for the simple chatbot, set your API key and specify the correct configuration file

export AGENT_CONFIG_PATH=webapps/configs/openai_simple_chat.yaml
export OPENAI_API_KEY=your_api_key

and run step 5 again

The terminal will display the URL to access the webapp, e.g.:

NOTE: The url can be different, check the output of the terminal

Running on local URL:  http://127.0.0.1:7860

📙 Documentation

Documentation for this and other sinapsis packages is available on the sinapsis website

Tutorials for different projects within sinapsis are available at sinapsis tutorials page

🔍 License

This project is licensed under the AGPLv3 license, which encourages open collaboration and sharing. For more details, please refer to the LICENSE file.

For commercial use, please refer to our official Sinapsis website for information on obtaining a commercial license.

The LLama4TextToText template is licensed under the official Llama4 license

Project details

These details have not been verified by PyPI

Project links

Release history Release notifications | RSS feed

0.5.0

Mar 25, 2026

0.4.4

Mar 3, 2026

0.4.3

Feb 27, 2026

0.4.2

Feb 26, 2026

0.4.1

Feb 25, 2026

0.4.0

Feb 19, 2026

0.3.14

Jan 15, 2026

0.3.13

Dec 9, 2025

0.3.12

Nov 10, 2025

0.3.11

Nov 3, 2025

0.3.10

Sep 8, 2025

0.3.9

Aug 29, 2025

0.3.8

Aug 19, 2025

0.3.7

Aug 5, 2025

0.3.6

Jul 28, 2025

0.3.5

Jun 3, 2025

0.3.4

May 2, 2025

0.3.3

Apr 30, 2025

0.3.2

Apr 30, 2025

This version

0.3.1

Apr 29, 2025

0.3.0

Apr 9, 2025

0.2.0

Apr 1, 2025

0.1.0

Mar 26, 2025

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

sinapsis_llama_cpp-0.3.1.tar.gz (30.3 kB view details)

Uploaded Apr 29, 2025 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

sinapsis_llama_cpp-0.3.1-py3-none-any.whl (32.9 kB view details)

Uploaded Apr 29, 2025 Python 3

File details

Details for the file sinapsis_llama_cpp-0.3.1.tar.gz.

File metadata

Download URL: sinapsis_llama_cpp-0.3.1.tar.gz
Upload date: Apr 29, 2025
Size: 30.3 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.5.16

File hashes

Hashes for sinapsis_llama_cpp-0.3.1.tar.gz
Algorithm	Hash digest
SHA256	`aaa008aab7fd943c0ec09af46b5fefbf3bd04bffa39da2be3108da57e9d589cc`
MD5	`8ce8846fa1ee10a0d10e21c7c2ed1c25`
BLAKE2b-256	`a2d72e92c337040efcacda11fa2d72777a9a00d58761a9a60707f1f55131e008`

See more details on using hashes here.

File details

Details for the file sinapsis_llama_cpp-0.3.1-py3-none-any.whl.

File metadata

Download URL: sinapsis_llama_cpp-0.3.1-py3-none-any.whl
Upload date: Apr 29, 2025
Size: 32.9 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.5.16

File hashes

Hashes for sinapsis_llama_cpp-0.3.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`1a3dfb33c4882474b8b0b2b7b0322bd1221893d262515c6186a50417661c4417`
MD5	`9d56ee1e7683efc5f5ed261b610b2e11`
BLAKE2b-256	`c4c9999ff733718e6e0f0a08803ed82d5d8c248aa2ed08bd972480ce2410f6d2`

See more details on using hashes here.

sinapsis-llama-cpp 0.3.1

Navigation

Verified details

Maintainers

Unverified details

Project links

Meta

Project description

sinapsis-llama-cpp

Package with support for the llama-cpp library to handle text processing

🐍 Installation

🚀 Features

📚 Usage example

🌐 Webapps

📙 Documentation

🔍 License

Project details

Verified details

Maintainers

Unverified details

Project links

Meta

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes