Python client and tools for KoboldCPP API
Basic functions
KoboldCpp API Interface
Provides easy access to the core KoboldCpp API endpoints, including streaming generation, image input, and sampler settings.
Instruct Template Wrapping
Finds the appropriate instruct template for the running model and wraps it around content to create a prompt.
Chunking
Reads most document types and splits them into chunks of any size up to the maximum context length, stopping at natural break points. Returns the chunks as a list.
Guide to Using the KoboldCPP API with Python
Introduction
KoboldCPP is a powerful and portable solution for running Large Language Models (LLMs). Its standout features include:
- Zero-installation deployment with single executable
- Support for any GGUF model compatible with LlamaCPP
- Cross-platform support (Linux, Windows, macOS)
- Hardware acceleration via CUDA and Vulkan
- Built-in GUI with extensive features
- Multimodal capabilities (image generation, speech, etc.)
- API compatibility with OpenAI and Ollama
Quick Start
Basic Setup
- Download the KoboldCPP executable for your platform
- Place your GGUF model file in the same directory
- Install the Python client directly from GitHub:

```bash
pip install git+https://github.com/jabberjabberjabber/koboldapi-python.git
```

Or clone the repository and install locally:

```bash
git clone https://github.com/jabberjabberjabber/koboldapi-python
cd koboldapi-python
pip install .
```
First Steps
Here's a minimal example to get started:
```python
from koboldapi import KoboldAPI

# Initialize the client
api = KoboldAPI("http://localhost:5001")

# Basic text generation
response = api.generate(
    prompt="Write a haiku about programming:",
    max_length=50,
    temperature=0.7
)
print(response)
```
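The API also supports streaming generation, as noted above. A minimal sketch of collecting streamed tokens into a string follows; a stand-in async generator is used so the snippet runs without a server, but against a live server you would pass `api.stream_generate(prompt)` instead:

```python
import asyncio

async def collect_stream(token_stream) -> str:
    """Accumulate tokens from an async token stream into one string."""
    parts = []
    async for token in token_stream:
        parts.append(token)
    return "".join(parts)

# Stand-in stream for illustration (hypothetical tokens):
async def fake_stream():
    for token in ["Hello", ", ", "world"]:
        yield token

text = asyncio.run(collect_stream(fake_stream()))
print(text)  # Hello, world
```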
Core Concepts
Configuration Management
The KoboldAPIConfig class manages configuration settings for the API client. You can either create a config programmatically or load it from a JSON file:
```python
from koboldapi import KoboldAPIConfig

# Create config programmatically
config = KoboldAPIConfig(
    api_url="http://localhost:5001",
    api_password="",
    templates_directory="./templates",
    translation_language="English",
    temp=0.7,
    top_k=40,
    top_p=0.9,
    rep_pen=1.1
)

# Or load from a JSON file
config = KoboldAPIConfig.from_json("config.json")

# Save config to a file
config.to_json("new_config.json")
```
Example `config.json`:

```json
{
    "api_url": "http://localhost:5001",
    "api_password": "",
    "templates_directory": "./templates",
    "translation_language": "English",
    "temp": 0.7,
    "top_k": 40,
    "top_p": 0.9,
    "rep_pen": 1.1
}
```
Template Management
KoboldAPI supports various instruction formats through templates. The InstructTemplate class handles this automatically:
```python
from koboldapi.templates import InstructTemplate

template = InstructTemplate("./templates", "http://localhost:5001")

# Wrap a prompt with the appropriate template
wrapped_prompt = template.wrap_prompt(
    instruction="Explain quantum computing",
    content="Focus on qubits and superposition",
    system_instruction="You are a quantum physics expert"
)
```
Example Applications
Text Processing
The library includes example scripts for various text processing tasks:
```python
from koboldapi import KoboldAPICore
from koboldapi.chunking.processor import ChunkingProcessor

# Initialize core with config
config = {
    "api_url": "http://localhost:5001",
    "templates_directory": "./templates"
}
core = KoboldAPICore(config)

# Process a text file
processor = ChunkingProcessor(core.api_client, max_chunk_length=2048)
chunks, metadata = processor.chunk_file("document.txt")

# Generate a summary for each chunk
for chunk, _ in chunks:
    summary = core.api_client.generate(
        prompt=core.template_wrapper.wrap_prompt(
            instruction="Summarize this text",
            content=chunk
        )[0],
        max_length=200
    )
    print(summary)
```
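The chunker's "stop at natural break points" behavior can be illustrated with a simplified, character-based sketch. This is not the library's implementation (the real ChunkingProcessor counts tokens via the API); the stand-in below cuts each chunk at the last separator found before the size limit:

```python
def chunk_text(text, max_len, separators=("\n\n", "\n", ". ")):
    """Greedy chunking: cut each chunk at the last separator before max_len."""
    chunks = []
    while len(text) > max_len:
        window = text[:max_len]
        # Last natural break point inside the window, or a hard cut at max_len
        cut = max((window.rfind(sep) + len(sep)
                   for sep in separators if window.rfind(sep) != -1),
                  default=max_len)
        chunks.append(text[:cut])
        text = text[cut:]
    if text:
        chunks.append(text)
    return chunks

chunks = chunk_text("aaa. bbb. ccc", 8)
print(chunks)  # ['aaa. ', 'bbb. ccc']
```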
Image Processing
Process images:
```python
import base64
from pathlib import Path

from koboldapi import KoboldAPICore

# Initialize core
config = {
    "api_url": "http://localhost:5001",
    "templates_directory": "./templates"
}
core = KoboldAPICore(config)

# Read and base64-encode the image
image_path = Path("image.png")
with open(image_path, "rb") as f:
    image_data = base64.b64encode(f.read()).decode()

result = core.api_client.generate(
    prompt=core.template_wrapper.wrap_prompt(
        instruction="Extract text from this image",
        system_instruction="You are an OCR system"
    )[0],
    images=[image_data],
    temperature=0.1
)
print(result)
```
Advanced Features
Custom Template Creation
Create custom instruction templates for different models:
```json
{
    "name": ["vicuna-7b", "vicuna-13b"],
    "system_start": "### System:\n",
    "system_end": "\n\n",
    "user_start": "### Human: ",
    "user_end": "\n\n",
    "assistant_start": "### Assistant: ",
    "assistant_end": "\n\n"
}
```
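Given a template like the one above, wrapping a prompt is essentially string concatenation of the start/end markers. A sketch of that composition (illustrative only, not the library's internal implementation):

```python
# Sketch: how the start/end markers compose into a prompt string.
template = {
    "system_start": "### System:\n", "system_end": "\n\n",
    "user_start": "### Human: ", "user_end": "\n\n",
    "assistant_start": "### Assistant: ",
}

def wrap(tpl, instruction, system_instruction=""):
    parts = []
    if system_instruction:
        parts += [tpl["system_start"], system_instruction, tpl["system_end"]]
    # End with assistant_start so the model continues as the assistant
    parts += [tpl["user_start"], instruction, tpl["user_end"],
              tpl["assistant_start"]]
    return "".join(parts)

wrapped = wrap(template, "Hi", "Be brief")
print(wrapped)
```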
Generation Parameters
Fine-tune generation settings:
```python
response = api.generate(
    prompt="Write a story:",
    max_length=500,
    temperature=0.8,     # Higher = more creative
    top_p=0.9,           # Nucleus sampling threshold
    top_k=40,            # Top-k sampling cutoff
    rep_pen=1.1,         # Repetition penalty
    rep_pen_range=256,   # How far back to apply the penalty
    min_p=0.05           # Minimum probability threshold
)
```
Error Handling
Implement robust error handling:
```python
from koboldapi import KoboldAPIError

try:
    response = api.generate(prompt="Test prompt")
except KoboldAPIError as e:
    print(f"API Error: {e}")
except Exception as e:
    print(f"Unexpected error: {e}")
```
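For transient failures such as timeouts or dropped connections, a small retry wrapper can help. A sketch with exponential backoff; the `flaky` function below is a stand-in for an API call so the snippet runs without a server, and in practice you would pass `KoboldAPIError` (or a connection error type) in `retry_on`:

```python
import time

def with_retries(fn, retries=3, delay=0.5, retry_on=(Exception,)):
    """Call fn(), retrying up to `retries` times on the given exceptions."""
    for attempt in range(retries):
        try:
            return fn()
        except retry_on:
            if attempt == retries - 1:
                raise  # Out of attempts: re-raise the last error
            time.sleep(delay * (2 ** attempt))  # Exponential backoff

# Stand-in for an API call that fails twice, then succeeds:
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("transient")
    return "ok"

result = with_retries(flaky, delay=0)
print(result)  # ok
```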
Performance Optimization
Context Management
Optimize token usage:
```python
# Get the model's maximum context length
max_length = api.get_max_context_length()

# Count tokens in the prompt
token_count = api.count_tokens(prompt)["count"]

# Ensure the response fits within the remaining context
desired_length = 512  # tokens requested for the response
available_tokens = max_length - token_count
response_length = min(desired_length, available_tokens)
```
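The arithmetic above can be wrapped in a small helper that also guards against prompts that leave no room for a response. A sketch; in practice the first two arguments would come from `api.get_max_context_length()` and `api.count_tokens()` as shown above:

```python
def fit_response_length(max_context: int, prompt_tokens: int,
                        desired_length: int, min_response: int = 16) -> int:
    """Clamp a requested response length to the tokens left in context."""
    available = max_context - prompt_tokens
    if available < min_response:
        raise ValueError(
            f"Prompt uses {prompt_tokens} of {max_context} tokens; "
            f"only {available} left for the response"
        )
    return min(desired_length, available)

print(fit_response_length(4096, 3900, 500))  # 196
```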
Batch Processing
Process multiple prompts, collecting the streamed tokens from each in turn:

```python
async def process_batch(prompts):
    results = []
    for prompt in prompts:
        async for token in api.stream_generate(prompt):
            results.append(token)
    return results
```
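The loop above handles prompts one at a time. If the server can service concurrent requests, the batch can instead be dispatched in parallel with `asyncio.gather`. A sketch with a bounded concurrency limit, where the worker coroutine is injected; a stand-in worker is used here so the snippet runs without a server, and in practice you would wrap an `api.generate` call:

```python
import asyncio

async def process_batch_concurrent(prompts, worker, max_concurrent=4):
    """Run worker(prompt) for each prompt, at most max_concurrent at once."""
    sem = asyncio.Semaphore(max_concurrent)

    async def bounded(prompt):
        async with sem:
            return await worker(prompt)

    # gather preserves input order in its results
    return await asyncio.gather(*(bounded(p) for p in prompts))

# Stand-in worker for illustration:
async def echo_upper(prompt):
    await asyncio.sleep(0)
    return prompt.upper()

results = asyncio.run(process_batch_concurrent(["a", "b", "c"], echo_upper))
print(results)  # ['A', 'B', 'C']
```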
Troubleshooting
Common Issues
- Connection Errors

```python
# Test connection
if not api.validate_connection():
    print("Cannot connect to API")
```

- Template Errors

```python
# Check whether a template exists for the model
if not template.get_template():
    print("No matching template found for model")
```

- Generation Errors

```python
# Monitor generation status
status = api.check_generation()
if status is None:
    print("Generation failed or was interrupted")
```
Contributing
Contributions to improve these tools are welcome. Please submit issues and pull requests on GitHub.
Development Setup
- Clone the repository
- Install development dependencies:

```bash
pip install -e ".[dev]"
```

- Run tests:

```bash
pytest tests/
```
License
This project is licensed under the GPLv3 license.