Batch CLI tool for running batch inference with local LLMs
Project description
LLM Batch
A command-line tool for running and managing batch inference jobs with LLM providers (OpenAI and Anthropic).
Installation
Using uv (recommended)
uv is a fast Python package installer and resolver.
# Install uv if you don't have it
curl -LsSf https://astral.sh/uv/install.sh | sh
# Install llm-batch
uv pip install llm-batch
Using pip
pip install llm-batch
Usage
The llm-batch tool offers several commands to create, run, and parse batches of LLM inference requests.
Available Commands
create: Create a batch of requests from a CSV or JSON filerun: Run a batch of requests through Ollama (requires Ollama installation with the model specified in your config)run-anthropic: Run a batch of Anthropic requests directly. Requires anANTHROPIC_API_KEYenvironmental variable.parse: Parse and convert batch results to CSV
Creating a Batch
llm-batch create INPUT_PATH CONFIG_FILE OUTPUT_PATH
INPUT_PATH: Path to a CSV or JSON file containing questionsCONFIG_FILE: Path to the YAML config file (see configuration section)OUTPUT_PATH: Path to save the output JSONL file
The input CSV file should have at least two columns:
question_id: A unique identifier for the questionquestion: The text of the questionimage_path(optional): Path to an image file for multimodal models
Running a Batch
llm-batch run FILE_PATH [--interval INTEGER] [--output-dir DIRECTORY] [--verbose]
FILE_PATH: Path to the JSONL file containing the batch--interval: Number of responses to save at once (default: 100)--output-dir: Directory to save the output (default: current directory)--verbose: Enable verbose logging
This command processes the batch requests through Ollama and saves the responses to a JSONL file.
Important: This command requires:
- Ollama to be installed on your system
- The model specified in your config to be downloaded in Ollama
ollama pull <model_name>
Running Anthropic Batches Directly
llm-batch run-anthropic FILE_PATH
FILE_PATH: Path to the JSONL file containing Anthropic requests
This command uses Anthropic's native batch API.
Parsing Results
llm-batch parse INPUT_PATH [OUTPUT_DIR]
INPUT_PATH: Path to the JSONL file containing batch responsesOUTPUT_DIR: Directory to save the parsed CSV file (default: current directory)
This command parses the batch responses and converts them to a CSV file.
Configuration
The tool uses a YAML configuration file to specify the parameters for creating batch requests. Here's an explanation of the configuration options:
# Format of the batch requests
format: openai # Options: "openai" or "anthropic"
# Number of responses to generate per question
# n_answers: 5 # Uncomment to set a different value (default: 1)
# Generation parameters
params:
model: gemma2:2b # Model to use for inference
temperature: 0.7 # Controls randomness
max_tokens: 8196 # Maximum tokens to generate
# Optional parameters
# top_p: 0.9 # Nucleus sampling parameter
# frequency_penalty: 0.0 # Penalize repeated tokens
# presence_penalty: 0.0 # Penalize tokens already present
# System message to include in the prompt (optional)
# system_message: |
# You are a helpful AI assistant.
# JSON Schema for structured output (optional)
# json_schema:
# name: response_model # Name of the schema
# schema: # JSON schema definition
# type: object
# properties:
# thinking:
# type: string
# answer:
# type: string
# required:
# - thinking
# - answer
# additionalProperties: false
# strict: true # Enforce strict schema
Important Configuration Notes
- format: Must be either "openai" or "anthropic" based on which provider you're using
- params: Contains model parameters like model, temperature, and token limits
- json_schema: Optional JSON schema for structured responses (useful for parsing)
Environment Variables
The tool requires API keys for the LLM providers you're using:
# For OpenAI
export OPENAI_API_KEY=your_openai_api_key
# For Anthropic
export ANTHROPIC_API_KEY=your_anthropic_api_key
You can also use a .env file in your project directory.
Example Workflow
- Prepare a CSV file with questions (
questions.csv) - Create a config file (
config.yaml) - Create a batch file:
llm-batch create questions.csv config.yaml batches/my_batch.jsonl
- Run the batch:
# Make sure Ollama is installed and the model is downloaded ollama pull <model_name> # Run the batch llm-batch run batches/my_batch.jsonl --output-dir results
- Parse the results:
llm-batch parse results/batch_*.jsonl results
License
See the LICENSE file for details.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file llm_batch-0.1.7.tar.gz.
File metadata
- Download URL: llm_batch-0.1.7.tar.gz
- Upload date:
- Size: 2.3 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.2
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
17f00300db88f6da925b3418a51a6b8b8642defa1db5d95bdfcd8455ad9b92a8
|
|
| MD5 |
bec6e010fc8481ae90604edadc4a9e0e
|
|
| BLAKE2b-256 |
9543e09e4b81c089a5d9ed842cb6eb761877bebd68185de8f63d7ef1fb04b430
|
File details
Details for the file llm_batch-0.1.7-py3-none-any.whl.
File metadata
- Download URL: llm_batch-0.1.7-py3-none-any.whl
- Upload date:
- Size: 12.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.2
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
39701fa094b99e1d704f8c1f485d1a755e7f449557a2de5377523957a569d13e
|
|
| MD5 |
9dfe25e4162dd9fbdbbaf4b36a9fe57d
|
|
| BLAKE2b-256 |
9f52f7ba95d6d302f0b4b1dcdcbe9da7fda4a01fb7835553fc23f934d02a333c
|