Skip to main content

Sinapsis templates for llama.cpp text generation.

Project description



Sinapsis LLaMA CPP

Sinapsis templates for local GGUF-backed text completion, streaming, and MCP with llama-cpp-python.

🐍 Installation🚀 Features📚 Usage example📙 Documentation🔍 License

The sinapsis-llama-cpp package provides Sinapsis templates built on top of llama-cpp-python for running local or Hugging Face-hosted GGUF models through LLMConversationPacket.

🐍 Installation

Install using your preferred package manager. We strongly recommend using uv.

Install the base package:

uv pip install sinapsis-llama-cpp --extra-index-url https://pypi.sinapsis.tech

Or with raw pip:

pip install sinapsis-llama-cpp --extra-index-url https://pypi.sinapsis.tech

[!IMPORTANT] If you also want the upstream llama-cpp-python HTTP server, install the optional server extra:

uv pip install sinapsis-llama-cpp[server] --extra-index-url https://pypi.sinapsis.tech

Or install all optional dependencies:

uv pip install sinapsis-llama-cpp[all] --extra-index-url https://pypi.sinapsis.tech

🚀 Features

Templates Supported

  • LLaMACPPTextCompletion: Standard llama.cpp chat completion using LLMConversationPacket.
  • LLaMACPPStreamingTextCompletion: Async streaming variant that yields partial packets during generation.
  • LLaMACPPTextCompletionWithMCP: llama.cpp chat completion with packet-native MCP tool state.
🧩 Common Attributes
  • init_args (LLaMACPPInitArgs, required): llama.cpp runtime arguments.
    • llm_model_name (str, required): Hugging Face repo id or local directory containing the GGUF file.
    • llm_model_file (str, required): GGUF file name to load.
    • Additional runtime controls include n_ctx, n_threads, n_gpu_layers, flash_attn_type, tensor_split, use_mmap, use_mlock, seed, and chat_format.
  • completion_args (LLaMACPPCompletionArgs, required): Request-time generation parameters such as max_tokens, temperature, top_p, top_k, min_p, penalties, stop sequences, and structured-output settings.
  • reasoning_start_tag / reasoning_end_tag (str | None, optional): Tags used to extract reasoning into LLMConversationPacket.reasoning before the final response is cleaned.

[!TIP] Use CLI command sinapsis info --all-template-names to show a list with all the available Template names installed with Sinapsis LLaMA CPP.

📚 Usage example

The following agent runs one local llama.cpp text-completion step using LLMConversationInput.

Config
agent:
  name: text_completion
  description: Single-shot llama-cpp text completion for Q&A and text generation.

templates:
  - template_name: InputTemplate
    class_name: InputTemplate
    attributes: {}

  - template_name: LLMConversationInput
    class_name: LLMConversationInput
    template_input: InputTemplate
    attributes:
      prompt: Give three short tips for staying organized during a busy week.
      system_prompt: You are a helpful assistant.

  - template_name: LLaMACPPTextCompletion
    class_name: LLaMACPPTextCompletion
    template_input: LLMConversationInput
    attributes:
      init_args:
        llm_model_name: unsloth/Qwen3.5-9B-GGUF
        llm_model_file: Qwen3.5-9B-Q4_K_M.gguf
        n_ctx: 8192
        n_threads: 8
        n_gpu_layers: -1
        flash_attn_type: -1
        seed: 10
      completion_args:
        max_tokens: 4096
        temperature: 0.2
        seed: 10

📙 Documentation

Documentation for this and other sinapsis packages is available on the sinapsis website

Tutorials for different projects within sinapsis are available at sinapsis tutorials page

🔍 License

This project is licensed under the AGPLv3 license, which encourages open collaboration and sharing. For more details, please refer to the LICENSE file.

For commercial use, please refer to our official Sinapsis website for information on obtaining a commercial license.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

sinapsis_llama_cpp-0.5.0.tar.gz (29.4 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

sinapsis_llama_cpp-0.5.0-py3-none-any.whl (32.1 kB view details)

Uploaded Python 3

File details

Details for the file sinapsis_llama_cpp-0.5.0.tar.gz.

File metadata

  • Download URL: sinapsis_llama_cpp-0.5.0.tar.gz
  • Upload date:
  • Size: 29.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.6.17

File hashes

Hashes for sinapsis_llama_cpp-0.5.0.tar.gz
Algorithm Hash digest
SHA256 4c67459c4640977effe87b374690d8ed65b5df7e60c73f29f85541a95b3b7fe7
MD5 264d7d06ebf969a41ba2937952ae2563
BLAKE2b-256 5001c1f1d19f5d96d24e336d113acd62c15460f31240243a9036bac6b8e8c28b

See more details on using hashes here.

File details

Details for the file sinapsis_llama_cpp-0.5.0-py3-none-any.whl.

File metadata

File hashes

Hashes for sinapsis_llama_cpp-0.5.0-py3-none-any.whl
Algorithm Hash digest
SHA256 970214293768769b02b756a1fcafade9b54ee738988f1a39b5f137925153b634
MD5 dec500cbc91e61e9b4934e66590c472d
BLAKE2b-256 1e7dba45c79a62066afca2bc10f21e23d476fadcd3f87a5499934620211e6fdc

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page