Skip to main content

Batch CLI tool for running batch inference with local LLMs

Project description

📦 LLM Batch

A command-line tool for running and managing batch inference jobs with LLM providers (OpenAI and Anthropic).

Installation

Using uv (recommended)

uv is a fast Python package installer and resolver.

uv pip install llm-batch

Using pip

pip install llm-batch

Usage

The llm-batch tool offers several commands to create, run, and parse batches of LLM inference requests.

Available Commands

  • create: Create a batch of requests from a CSV or JSON file
  • run: Run a batch of requests through Ollama (requires Ollama installation with the model specified in your config)
  • run-anthropic: Run a batch of Anthropic requests directly. Requires an ANTHROPIC_API_KEY environmental variable.
  • parse: Parse and convert batch results to CSV

Creating a Batch

llm-batch create INPUT_PATH CONFIG_FILE OUTPUT_PATH
  • INPUT_PATH: Path to a CSV or JSON file containing questions
  • CONFIG_FILE: Path to the YAML config file (see configuration section)
  • OUTPUT_PATH: Path to save the output JSONL file

The input CSV file should have at least two columns:

  • question_id: A unique identifier for the question
  • question: The text of the question
  • image_path (optional): Path to an image file for multimodal models

Running a Batch

llm-batch run FILE_PATH [--interval INTEGER] [--output-dir DIRECTORY] [--verbose]
  • FILE_PATH: Path to the JSONL file containing the batch
  • --interval: Number of responses to save at once (default: 100)
  • --output-dir: Directory to save the output (default: current directory)
  • --verbose: Enable verbose logging

This command processes the batch requests through Ollama and saves the responses to a JSONL file.

Important: This command requires:

  1. Ollama to be installed on your system
  2. The model specified in your config to be downloaded in Ollama ollama pull <model_name>

Running Anthropic Batches Directly

llm-batch run-anthropic FILE_PATH
  • FILE_PATH: Path to the JSONL file containing Anthropic requests

This command uses Anthropic's native batch API.

Parsing Results

llm-batch parse INPUT_PATH [OUTPUT_DIR]
  • INPUT_PATH: Path to the JSONL file containing batch responses
  • OUTPUT_DIR: Directory to save the parsed CSV file (default: current directory)

This command parses the batch responses and converts them to a CSV file.

Configuration

The tool uses a YAML configuration file to specify the parameters for creating batch requests. Here's an explanation of the configuration options:

# Format of the batch requests
format: openai  # Options: "openai" or "anthropic"

# Number of responses to generate per question
# n_answers: 5    # Uncomment to set a different value (default: 1)

# Generation parameters
params:
  model: gemma2:2b         # Model to use for inference
  temperature: 0.7         # Controls randomness
  max_tokens: 8196         # Maximum tokens to generate

  # Optional parameters
  # top_p: 0.9              # Nucleus sampling parameter
  # frequency_penalty: 0.0  # Penalize repeated tokens
  # presence_penalty: 0.0   # Penalize tokens already present

# System message to include in the prompt (optional)
# system_message: |
#   You are a helpful AI assistant.

# JSON Schema for structured output (optional)
# json_schema:
#   name: response_model        # Name of the schema
#   schema:                     # JSON schema definition
#     type: object
#     properties:
#       thinking:
#         type: string
#       answer:
#         type: string
#     required:
#       - thinking
#       - answer
#     additionalProperties: false
#   strict: true               # Enforce strict schema 

Important Configuration Notes

  • format: Must be either "openai" or "anthropic" based on which provider you're using
  • params: Contains model parameters like model, temperature, and token limits
  • json_schema: Optional JSON schema for structured responses (useful for parsing)

Environment Variables

The tool requires API keys for the LLM providers you're using:

# For OpenAI
export OPENAI_API_KEY=your_openai_api_key

# For Anthropic
export ANTHROPIC_API_KEY=your_anthropic_api_key

You can also use a .env file in your project directory.

Example Workflow

  1. Prepare a CSV file with questions (questions.csv)
  2. Create a config file (config.yaml)
  3. Create a batch file:
    llm-batch create questions.csv config.yaml batches/my_batch.jsonl
    
  4. Run the batch:
    # Make sure Ollama is installed and the model is downloaded
    ollama pull <model_name>
    
    # Run the batch
    llm-batch run batches/my_batch.jsonl --output-dir results
    
  5. Parse the results:
    llm-batch parse results/batch_*.jsonl results
    

License

See the LICENSE file for details.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

llm_batch-0.1.8.tar.gz (2.3 MB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

llm_batch-0.1.8-py3-none-any.whl (12.8 kB view details)

Uploaded Python 3

File details

Details for the file llm_batch-0.1.8.tar.gz.

File metadata

  • Download URL: llm_batch-0.1.8.tar.gz
  • Upload date:
  • Size: 2.3 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.6.17

File hashes

Hashes for llm_batch-0.1.8.tar.gz
Algorithm Hash digest
SHA256 23118f8e426c4cd9d87ff22d7ddaad2ae060c94b85eee46577ae43f4b936becf
MD5 27e65ec134ffed8d9d0e21e14d0f6f31
BLAKE2b-256 d1002459f2342fbfb0e44d3e47670198e02f3015b666ffd12410fd46e65d3945

See more details on using hashes here.

File details

Details for the file llm_batch-0.1.8-py3-none-any.whl.

File metadata

  • Download URL: llm_batch-0.1.8-py3-none-any.whl
  • Upload date:
  • Size: 12.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.6.17

File hashes

Hashes for llm_batch-0.1.8-py3-none-any.whl
Algorithm Hash digest
SHA256 f4404ff3582175e0eb2e805904941dff51758ade90d7b94d5f249f1531b8ea2a
MD5 808071ec58fae4f18eb9da2050d1d7a4
BLAKE2b-256 3aa77f6e2fe8734b8689f0f5a3d236b2a5298bf60199f0b8a903d0e1e6fa0c9e

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page