
Python client and tools for KoboldCPP API


Basic functions

KoboldCpp API Interface

Provides easy access to the core KoboldCpp API endpoints, including streaming generation, image input, and sampler settings.

Instruct Template Wrapping

Finds the appropriate instruct template for the running model and wraps it around content to create a prompt.

Chunking

Reads most document types and splits them into chunks of any size up to the maximum context length, stopping at natural break points. Returns the chunks as a list.
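To illustrate the idea of stopping at natural break points, here is a minimal sketch (not the library's internal algorithm) that greedily packs paragraphs into chunks under a size budget, so splits always land on paragraph boundaries:

```python
# Illustrative sketch of boundary-aware chunking: paragraphs are packed
# into chunks that stay under a character budget. A real implementation
# would budget by tokens against the model's context length.
def chunk_text(text, max_chars=2000):
    chunks, current = [], ""
    for para in text.split("\n\n"):
        candidate = f"{current}\n\n{para}" if current else para
        if len(candidate) <= max_chars:
            current = candidate
        else:
            if current:
                chunks.append(current)
            current = para  # paragraph starts a fresh chunk
    if current:
        chunks.append(current)
    return chunks

doc = "First paragraph.\n\nSecond paragraph.\n\nThird paragraph."
parts = chunk_text(doc, max_chars=40)
```

With a 40-character budget the first two paragraphs share a chunk and the third starts a new one.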

Guide to Using the KoboldCPP API with Python

Introduction

KoboldCPP is a powerful and portable solution for running Large Language Models (LLMs). Its standout features include:

  • Zero-installation deployment with a single executable
  • Support for any GGUF model compatible with llama.cpp
  • Cross-platform support (Linux, Windows, macOS)
  • Hardware acceleration via CUDA and Vulkan
  • Built-in GUI with extensive features
  • Multimodal capabilities (image generation, speech, etc.)
  • API compatibility with OpenAI and Ollama
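Because of the OpenAI-compatible API, any OpenAI-style client can talk to a running KoboldCpp instance. A sketch of the request body for the chat completions endpoint (assuming the default port 5001; the model name is a placeholder, since the server uses whatever GGUF model it has loaded):

```python
import json

# Build an OpenAI-style chat request body for KoboldCpp's compatible
# endpoint (by default http://localhost:5001/v1/chat/completions).
payload = {
    "model": "koboldcpp",  # placeholder; the loaded model is used
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Write a haiku about programming."},
    ],
    "max_tokens": 50,
    "temperature": 0.7,
}
body = json.dumps(payload)
# Send with e.g.:
# requests.post("http://localhost:5001/v1/chat/completions",
#               data=body, headers={"Content-Type": "application/json"})
```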

Quick Start

Basic Setup

  1. Download the KoboldCPP executable for your platform
  2. Place your GGUF model file in the same directory
  3. Install the Python client:
git clone https://github.com/jabberjabberjabber/koboldapi-python
cd koboldapi-python
pip install .

First Steps

Here's a minimal example to get started:

from koboldapi import KoboldAPI

# Initialize the client
api = KoboldAPI("http://localhost:5001")

# Basic text generation
response = api.generate(
    prompt="Write a haiku about programming:",
    max_length=50,
    temperature=0.7
)
print(response)

Core Concepts

Configuration Management

The KoboldAPIConfig class manages configuration settings for the API client. You can either create a config programmatically or load it from a JSON file:

from koboldapi import KoboldAPIConfig

# Create config programmatically
config = KoboldAPIConfig(
    api_url="http://localhost:5001",
    api_password="",
    templates_directory="./templates",
    translation_language="English",
    temp=0.7,
    top_k=40,
    top_p=0.9,
    rep_pen=1.1
)

# Or load from JSON file
config = KoboldAPIConfig.from_json("config.json")

# Save config to file
config.to_json("new_config.json")

Example config.json:

{
    "api_url": "http://localhost:5001",
    "api_password": "",
    "templates_directory": "./templates",
    "translation_language": "English",
    "temp": 0.7,
    "top_k": 40,
    "top_p": 0.9,
    "rep_pen": 1.1
}

Template Management

KoboldAPI supports various instruction formats through templates. The InstructTemplate class handles this automatically:

from koboldapi.templates import InstructTemplate

template = InstructTemplate("./templates", "http://localhost:5001")

# Wrap a prompt with the appropriate template
wrapped_prompt = template.wrap_prompt(
    instruction="Explain quantum computing",
    content="Focus on qubits and superposition",
    system_instruction="You are a quantum physics expert"
)

Example Applications

Text Processing

The library includes example scripts for various text processing tasks:

from koboldapi import KoboldAPICore
from koboldapi.chunking.processor import ChunkingProcessor

# Initialize core with config
config = {
    "api_url": "http://localhost:5001",
    "templates_directory": "./templates"
}
core = KoboldAPICore(config)

# Process a text file
processor = ChunkingProcessor(core.api_client, max_chunk_length=2048)
chunks, metadata = processor.chunk_file("document.txt")

# Generate summary for each chunk
for chunk, _ in chunks:
    summary = core.api_client.generate(
        prompt=core.template_wrapper.wrap_prompt(
            instruction="Summarize this text",
            content=chunk
        )[0],
        max_length=200
    )
    print(summary)

Image Processing

Process images:

from koboldapi import KoboldAPICore
from pathlib import Path
import base64

# Initialize core
config = {
    "api_url": "http://localhost:5001",
    "templates_directory": "./templates"
}
core = KoboldAPICore(config)

# Process image
image_path = Path("image.png")
with open(image_path, "rb") as f:
    image_data = base64.b64encode(f.read()).decode()

result = core.api_client.generate(
    prompt=core.template_wrapper.wrap_prompt(
        instruction="Extract text from this image",
        system_instruction="You are an OCR system"
    )[0],
    images=[image_data],
    temperature=0.1
)
print(result)

Advanced Features

Custom Template Creation

Create custom instruction templates for different models:

{
    "name": ["vicuna-7b", "vicuna-13b"],
    "system_start": "### System:\n",
    "system_end": "\n\n",
    "user_start": "### Human: ",
    "user_end": "\n\n",
    "assistant_start": "### Assistant: ",
    "assistant_end": "\n\n"
}
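A template like the one above is just a set of delimiters wrapped around each turn. The following hypothetical helper (not the library's `InstructTemplate` internals) shows how those fields combine into a prompt, leaving the assistant turn open for the model to complete:

```python
def apply_template(tpl, user_text, system_text=None):
    # Concatenate the template's delimiters around the system and user
    # turns; the open assistant_start cues the model to respond.
    parts = []
    if system_text:
        parts.append(tpl["system_start"] + system_text + tpl["system_end"])
    parts.append(tpl["user_start"] + user_text + tpl["user_end"])
    parts.append(tpl["assistant_start"])
    return "".join(parts)

vicuna = {
    "system_start": "### System:\n", "system_end": "\n\n",
    "user_start": "### Human: ", "user_end": "\n\n",
    "assistant_start": "### Assistant: ", "assistant_end": "\n\n",
}
prompt = apply_template(vicuna, "Explain qubits", "You are a physicist")
```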

Generation Parameters

Fine-tune generation settings:

response = api.generate(
    prompt="Write a story:",
    max_length=500,
    temperature=0.8,      # Higher = more creative
    top_p=0.9,           # Nucleus sampling threshold
    top_k=40,            # Top-k sampling threshold
    rep_pen=1.1,         # Repetition penalty
    rep_pen_range=256,   # How far back to apply rep penalty
    min_p=0.05          # Minimum probability threshold
)
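As a conceptual aside, min-p filtering keeps only tokens whose probability is at least `min_p` times the top token's probability. A toy sketch of that rule (not KoboldCpp's internal sampler code):

```python
# Conceptual min-p filter: drop tokens whose probability falls below
# min_p times the most likely token's probability.
def min_p_filter(probs, min_p):
    threshold = min_p * max(probs)
    return [p for p in probs if p >= threshold]

# With min_p=0.05 and a top probability of 0.5, the cutoff is 0.025,
# so the 0.02 tail token is removed.
kept = min_p_filter([0.5, 0.3, 0.1, 0.02], min_p=0.05)
```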

Error Handling

Implement robust error handling:

from koboldapi import KoboldAPIError

try:
    response = api.generate(prompt="Test prompt")
except KoboldAPIError as e:
    print(f"API Error: {e}")
except Exception as e:
    print(f"Unexpected error: {e}")
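For transient failures (timeouts, a busy server) a retry with exponential backoff often suffices. A hypothetical helper, not part of koboldapi, shown here with a stub in place of a real API call:

```python
import time

# Retry a flaky callable with exponential backoff, re-raising the
# exception after the final attempt.
def with_retries(fn, attempts=3, base_delay=0.01, retry_on=(Exception,)):
    for attempt in range(attempts):
        try:
            return fn()
        except retry_on:
            if attempt == attempts - 1:
                raise
            time.sleep(base_delay * (2 ** attempt))

# Stub standing in for api.generate(...): fails twice, then succeeds.
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient failure")
    return "ok"

result = with_retries(flaky)
```

In real use you would pass `lambda: api.generate(prompt=...)` and restrict `retry_on` to `KoboldAPIError`.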

Performance Optimization

Context Management

Optimize token usage:

# Get max context length
max_length = api.get_max_context_length()

prompt = "Summarize the following report:"
desired_length = 300

# Count tokens in prompt
token_count = api.count_tokens(prompt)["count"]

# Ensure we stay within limits
available_tokens = max_length - token_count
response_length = min(desired_length, available_tokens)
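The budgeting logic above can be tested offline by substituting a naive whitespace "tokenizer" for `api.count_tokens` (an assumption for illustration; real counts come from the server's tokenizer):

```python
# Offline sketch of response-length budgeting: the response gets
# whatever is left of the context window after the prompt, capped at
# the desired length and floored at zero.
def plan_response_length(prompt, max_context, desired_length):
    prompt_tokens = len(prompt.split())  # stand-in for count_tokens
    available = max_context - prompt_tokens
    return max(0, min(desired_length, available))

length = plan_response_length("Summarize this text for me", 2048, 500)
```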

Batch Processing

Handle multiple inputs efficiently:

async def process_batch(prompts):
    results = []
    for prompt in prompts:
        # Collect the streamed tokens for each prompt separately,
        # then join them into one response string.
        tokens = []
        async for token in api.stream_generate(prompt):
            tokens.append(token)
        results.append("".join(tokens))
    return results

Troubleshooting

Common Issues

  1. Connection Errors
# Test connection
if not api.validate_connection():
    print("Cannot connect to API")
  2. Template Errors
# Check if template exists
if not template.get_template():
    print("No matching template found for model")
  3. Generation Errors
# Monitor generation status
status = api.check_generation()
if status is None:
    print("Generation failed or was interrupted")

Contributing

Contributions to improve these tools are welcome. Please submit issues and pull requests on GitHub.

Development Setup

  1. Clone the repository
  2. Install development dependencies:
pip install -e ".[dev]"
  3. Run tests:
pytest tests/

License

This project is licensed under the GPLv3 license.
