ParaLLM

ParaLLM is a command-line tool and Python package for querying language models efficiently. It supports batch processing of multiple prompts across multiple models, and structured JSON output via schemas or Pydantic models.

Features

  • Multi-Model Querying: Query multiple LLMs simultaneously, comparing their outputs
  • CSV Input/Output: Use CSV files for batch processing of prompts
  • Structured JSON Output: Get responses formatted to JSON schemas or Pydantic models
  • High Performance: Leverages Bodo for parallel execution of model queries. RAG functionality uses regular pandas for simplicity and reliability.
  • Multiple Providers: Support for OpenAI, AWS Bedrock, and Google Gemini

Installation

pip install parallm

Note: ParaLLM requires Python 3.9+ due to Bodo’s minimum version.

Or install from source:

git clone https://github.com/strangeloopcanon/parallm.git
cd parallm
pip install -e .

You'll need to set up your API keys. For AWS Bedrock, ensure you have AWS credentials configured. For Gemini, set the GEMINI_API_KEY environment variable. The llm package is installed automatically.
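
Since python-dotenv is installed as a dependency, keys can live in a local .env file. A minimal sketch; GEMINI_API_KEY is the variable named above, while OPENAI_API_KEY follows the usual OpenAI convention:

# Load API keys from a .env file in the working directory
from dotenv import load_dotenv
import os

load_dotenv()
print("Gemini key set:", bool(os.getenv("GEMINI_API_KEY")))
print("OpenAI key set:", bool(os.getenv("OPENAI_API_KEY")))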

Command-Line Usage

Batch Processing (CSV Files)

Process multiple prompts from a CSV file with one or more models:

# Default mode (OpenAI/llm)
parallm default data/prompts.csv --models gpt-4 claude-3-sonnet-20240229

# AWS Bedrock mode
parallm aws data/prompts.csv --models anthropic.claude-3-sonnet-20240229 amazon.titan-text-express-v1

# Gemini mode
parallm gemini data/prompts.csv --models gemini-2.0-flash

Single Prompt Processing

Process a single prompt with optional repeat functionality:

# Default mode (OpenAI/llm)
parallm default "What is the capital of France?" --models gpt-4 --repeat 5

# AWS Bedrock mode
parallm aws "What is the capital of France?" --models amazon.titan-text-express-v1 --repeat 5

# Gemini mode
parallm gemini "What is the capital of France?" --models gemini-2.0-flash --repeat 5

Structured Output

Get responses formatted according to a JSON schema or Pydantic model:

# Using a JSON schema
parallm default data/prompts.csv --models gpt-4o --schema '{
  "type": "object",
  "properties": {
    "answer": {"type": "string"},
    "confidence": {"type": "number", "minimum": 0, "maximum": 1}
  },
  "required": ["answer", "confidence"]
}'

# Using a schema from file
parallm default data/prompts.csv --models gpt-4o --schema schema.json

# Using a Pydantic model
parallm default data/prompts.csv --models gpt-4o --pydantic models.py:ResponseModel
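
For reference, a minimal sketch of what the models.py referenced above might contain; the file and class names are simply the ones used in the example, and any Pydantic model works:

# models.py (illustrative Pydantic model for the --pydantic example)
from pydantic import BaseModel

class ResponseModel(BaseModel):
    answer: str
    confidence: float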

Python API Usage

Batch Processing

from parallm import query_model_all, bedrock_query_model_all, gemini_query_model_all

# Default mode (OpenAI/llm)
df = query_model_all("data/prompts.csv", ["gpt-4", "claude-3-sonnet-20240229"])
print(df)

# AWS Bedrock
df = bedrock_query_model_all("data/prompts.csv", ["anthropic.claude-3-sonnet-20240229"])
print(df)

# Gemini
df = gemini_query_model_all("data/prompts.csv", ["gemini-2.0-flash"])
print(df)
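
As the examples suggest, the batch helpers return pandas DataFrames, so standard DataFrame operations apply; for example, persisting a run:

# Save the batch results for later analysis
df.to_csv("responses.csv", index=False)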

Single Prompt Processing

from parallm import query_model_repeat, bedrock_query_model_repeat, gemini_query_model_repeat

# Default mode (OpenAI/llm)
df = query_model_repeat("What is the capital of France?", "gpt-4o", repeat=5)
print(df)

# AWS Bedrock
df = bedrock_query_model_repeat("What is the capital of France?", "amazon.titan-text-express-v1", repeat=5)
print(df)

# Gemini
df = gemini_query_model_repeat("What is the capital of France?", "gemini-2.0-flash", repeat=5)
print(df)

Structured Output

from parallm import query_model_json
from pydantic import BaseModel

# Using a Pydantic model
class Response(BaseModel):
    answer: str
    confidence: float

result = query_model_json("What is the capital of France?", "gpt-4o", schema=Response)
print(result)

Retrieval-Augmented Generation (RAG)

ParaLLM includes a modular RAG pipeline for querying language models with context retrieved from your own documents.

Overview

The RAG system processes your documents through a configurable pipeline:

  1. Ingestion: Loads documents from a specified directory. Supports .txt, .pdf, .docx, and .html/.htm files.
  2. Chunking: Splits documents into smaller chunks using one of two strategies (sketched below, after this list):
    • fixed_size: Overlapping chunks of a defined character size.
    • semantic: Groups sentences together (using NLTK).
  3. Embedding: Generates vector embeddings for each chunk using a specified Sentence Transformer model (e.g., all-MiniLM-L6-v2).
  4. Indexing: Stores the chunks, embeddings, and metadata in:
    • A vector store (currently ChromaDB) for semantic search.
    • A keyword index (using BM25) for lexical search.
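
As a rough illustration of the fixed_size strategy, here is the standard overlapping-window technique; this is a sketch, not ParaLLM's actual implementation, and the parameter names are illustrative:

# Fixed-size chunking with overlap (illustrative defaults, not ParaLLM config keys)
def fixed_size_chunks(text: str, chunk_size: int = 500, overlap: int = 50):
    step = chunk_size - overlap  # advance by less than chunk_size so chunks overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]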

When querying, the system retrieves relevant chunks using vector search, keyword search, or a hybrid combination, augments the prompt with this context, and then sends it to the specified language model.
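
Conceptually, the hybrid mode blends the two rankings. A toy sketch, where the normalization and the alpha weight are illustrative choices rather than ParaLLM's internals:

# Blend normalized vector-similarity and BM25 scores per chunk id
def hybrid_scores(vector_scores: dict, bm25_scores: dict, alpha: float = 0.5):
    def normalize(scores):
        hi = max(scores.values(), default=1.0) or 1.0
        return {k: v / hi for k, v in scores.items()}
    v, b = normalize(vector_scores), normalize(bm25_scores)
    return {
        cid: alpha * v.get(cid, 0.0) + (1 - alpha) * b.get(cid, 0.0)
        for cid in set(v) | set(b)
    }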

Configuration (rag_config.yaml)

The entire RAG pipeline is configured using a YAML file (e.g., rag_config.yaml). This file defines the sequence of steps, parameters for each step (like source paths, chunking strategy, embedding model, index paths), and the retrieval strategy.

See examples/rag_config.yaml for a detailed example.
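
For orientation, an abbreviated sketch of the shape such a config takes; every key name below is hypothetical, so treat examples/rag_config.yaml as the authoritative schema:

# rag_config.yaml (hypothetical keys; see examples/rag_config.yaml for the real ones)
ingestion:
  source_dir: ./docs
chunking:
  strategy: fixed_size      # or: semantic
  chunk_size: 500
  overlap: 50
embedding:
  model: all-MiniLM-L6-v2
indexing:
  vector_store: chromadb
  keyword_index: bm25
retrieval:
  mode: hybrid              # vector | keyword | hybrid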

RAG CLI Usage

Use the rag subcommand for building indexes and querying.

1. Build the RAG Index:

This command runs the ingestion, chunking, embedding, and indexing pipeline defined in your config file.

python -m parallm rag build --config path/to/your_rag_config.yaml

  • Replace path/to/your_rag_config.yaml with the actual path to your configuration file.
  • Run this once initially, and again whenever your source documents or pipeline configuration change.
  • Indexes (the ChromaDB store and the BM25 pickle file) are created or updated at the paths specified in the config.

2. Query the RAG System:

This command uses a previously built index to retrieve context, augment a prompt, and query an LLM.

python -m parallm rag query --config path/to/your_rag_config.yaml --query "Your question here?" --llm-model gpt-4o-mini

  • --config: The RAG configuration file (used to load the retriever and embedding models).
  • --query / -q: The question you want to ask.
  • --llm-model: (Optional) The language model used to generate the final answer; defaults to the model specified in the script, e.g., gpt-4o-mini.

RAG Dependencies

Using the RAG features requires additional dependencies:

PyYAML          # For parsing rag_config.yaml
sentence-transformers # For embedding generation
chromadb        # Vector store
rank_bm25       # Keyword indexing
pypdf           # PDF ingestion
python-docx     # DOCX ingestion
beautifulsoup4  # HTML ingestion
lxml            # HTML parsing backend for beautifulsoup4
nltk            # Semantic chunking (sentence tokenization)
reportlab       # Required by test suite to generate test PDFs

Ensure NLTK's punkt tokenizer data is downloaded: python -m nltk.downloader punkt
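
Equivalently, from Python, download it only if it is missing:

import nltk

try:
    nltk.data.find("tokenizers/punkt")
except LookupError:
    nltk.download("punkt")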

Testing

The project includes comprehensive test coverage for all RAG functionality. Run tests with:

# Run all tests
python -m pytest

# Run only RAG tests
python -m pytest tests/test_rag/

# Run with verbose output
python -m pytest tests/test_rag/ -v

All 35 RAG tests currently pass, covering:

  • Document ingestion (TXT, PDF, DOCX, HTML)
  • Text chunking (fixed-size and semantic strategies)
  • Embedding generation
  • BM25 indexing and retrieval
  • Configuration loading and validation

CSV Format

Your prompts.csv file should have a header row with "prompt" as the column name:

prompt
What is machine learning?
Explain quantum computing
How does blockchain work?
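
The file can also be generated programmatically, since pandas is already a dependency:

import pandas as pd

prompts = [
    "What is machine learning?",
    "Explain quantum computing",
    "How does blockchain work?",
]
pd.DataFrame({"prompt": prompts}).to_csv("data/prompts.csv", index=False)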

Dependencies

  • bodo: Provides parallel DataFrame processing for model query operations. RAG components use regular pandas for maximum compatibility.
  • pandas: Data processing and CSV handling
  • llm: Simon Willison's LLM interface library
  • python-dotenv: Environment variable management
  • pydantic: Data validation for structured output
  • boto3: AWS SDK for Python (required for AWS Bedrock)
  • Google GenAI: Gemini API client (the google-genai package or equivalent)

Author

Rohit Krishnan
