An experimental framework for LLM discussions with judge oversight

JuryLLM

Overview

JuryLLM is an experimental framework that orchestrates multiple language models to work collaboratively, much like a jury, on complex problems. Using ensemble decision-making, the project aims to show how smaller open-source LLMs can work together to produce more robust solutions.

A major breakthrough in human intelligence came when we learned to communicate more effectively. Other highly intelligent species went extinct; our ability to communicate and collaborate set us apart. Our progress has always been rooted in effective communication, teamwork, and a collective focus on shared goals. The open-source movement embodies the same spirit of collaboration, showing how working together can drive innovation and success.

Key Features

Model Ensemble: Integrates multiple language models to work as a collaborative unit
Jury-like Decision Making: Implements a structured approach for models to deliberate and reach consensus
Open Source Focus: Primarily works with accessible, open-source language models
Collaborative Intelligence: Harnesses diverse model perspectives for enhanced problem-solving

Purpose

The primary goals of JuryLLM are:

Explore the potential of collaborative AI decision-making
Demonstrate how smaller models can achieve better results through teamwork than they do individually
Provide an experimental platform for testing ensemble-based approaches
Create more reliable and well-rounded AI solutions

Technical Architecture

The system is designed as a modular framework where:

Multiple language models act as jury members
Each model contributes its unique perspective
A coordinated decision-making process synthesizes various inputs
The final output represents a collective intelligence solution
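
The flow above can be pictured with a toy sketch. This is not JuryLLM's actual code: the `JuryMember` type, the stub models, and the `judge` function are hypothetical stand-ins. Plain functions take the place of real language models, and a simple majority vote takes the place of an LLM judge.

```python
from collections import Counter
from typing import Callable, Dict

# Hypothetical stand-in: a "jury member" is any function mapping a question to an answer.
JuryMember = Callable[[str], str]


def collect_opinions(members: Dict[str, JuryMember], question: str) -> Dict[str, str]:
    """Ask every jury member the same question and record each answer by name."""
    return {name: member(question) for name, member in members.items()}


def judge(opinions: Dict[str, str]) -> str:
    """Toy verdict: return the most common answer among the jury members."""
    counts = Counter(opinions.values())
    verdict, _ = counts.most_common(1)[0]
    return verdict


# Stub "models" that return canned answers instead of calling an LLM.
members = {
    "model_a": lambda question: "4",
    "model_b": lambda question: "4",
    "model_c": lambda question: "5",
}

opinions = collect_opinions(members, "What is 2 + 2?")
print(judge(opinions))  # prints 4
```

A majority vote only works when answers can be compared for equality; for free-form answers, a judge model that reads the whole discussion is the more natural fit.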

Use Cases

Complex problem-solving that calls for multiple perspectives and multiple specialist models
Scenarios where consensus-based decision making is valuable
Tasks benefiting from diverse model capabilities
Experimental research in collaborative AI systems

Contributing

We welcome contributions to this experimental project, including:

Adding new model integrations
Improving the consensus mechanism
Optimising prompts
Adding better fine-tuned models
Enhancing the documentation
Sharing interesting use cases

License

MIT License

Documentation

Architecture

The JuryLLM framework consists of three main components:

1. Participants (model.py)
   - BaseParticipant: Abstract base class for all participants
   - OllamaParticipant: Implementation for local Ollama models
   - OpenAIParticipant: Implementation for OpenAI API models
   - Judge: Specialized participant that evaluates discussions and provides verdicts
2. Discussion Management (jury.py)
   - Discussion: Core class that manages the conversation flow
   - Handles async streaming of responses
   - Manages discussion rounds and verdict checking
   - Formats case prompts and maintains discussion history
3. Message System
   - Message: Data structure for communication
   - Tracks role, content, and participant information
   - Maintains conversation context
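
To make the relationship between these components concrete, here is a minimal sketch of how a message type and a participant base class could fit together. The class names `Message` and `BaseParticipant` come from the list above, but the field names, the `respond` method, and the `EchoParticipant` stub are assumptions for illustration, not the library's real interface. The sketch is also synchronous for brevity, whereas the framework streams responses asynchronously.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import List


@dataclass
class Message:
    """One turn in the discussion (field names are assumptions)."""
    role: str         # e.g. "user", "participant", or "judge"
    content: str      # the text of the turn
    participant: str  # name of the speaker


class BaseParticipant(ABC):
    """Common interface every participant would share (hypothetical)."""

    def __init__(self, name: str) -> None:
        self.name = name

    @abstractmethod
    def respond(self, history: List[Message]) -> Message:
        """Produce the next message given the discussion so far."""


class EchoParticipant(BaseParticipant):
    """Hypothetical stub that replies without calling any model."""

    def respond(self, history: List[Message]) -> Message:
        last = history[-1].content if history else ""
        return Message(
            role="participant",
            content=f"{self.name} read: {last}",
            participant=self.name,
        )


# The shared history list is what "maintains conversation context".
history = [Message(role="user", content="Case: example", participant="user")]
history.append(EchoParticipant("Model1").respond(history))
print(history[-1].content)  # prints: Model1 read: Case: example
```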

Setup Instructions

Prerequisites

# Install Python 3.8+ and pip
# Install Ollama (for local models); on macOS with Homebrew:
brew install ollama

Installation

# Clone the repository
git clone https://github.com/yourusername/juryLLM.git
cd juryLLM

# Create and activate virtual environment
python -m venv venv
source venv/bin/activate

# Install dependencies
pip install -r requirements.txt

Pull Required Models

# Pull Ollama models
ollama pull llama2:13b
ollama pull llama3.2:3b
ollama pull phi3.5:3.8b

Configuration
- Set environment variables if using OpenAI models:
```
export OPENAI_API_KEY=your_api_key
```
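
If you want to fail early when the key is missing, a small check along these lines can help. The helper name `get_openai_api_key` is hypothetical and not part of JuryLLM; only the `OPENAI_API_KEY` variable name comes from the instructions above.

```python
import os
from typing import Mapping


def get_openai_api_key(env: Mapping[str, str] = os.environ) -> str:
    """Return the key from OPENAI_API_KEY, or raise a clear error if it is unset or empty."""
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "OPENAI_API_KEY is not set; export it before using OpenAI models"
        )
    return key
```

Passing the environment as a parameter keeps the check easy to exercise in tests without touching the real environment.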

Running the Framework

Basic Usage

# Run the example discussion
python app.py

Custom Implementation

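import asyncio

from juryLLM.model import OllamaParticipant, Judge
from juryLLM.jury import Discussion

# Case text to discuss; replace with your own (see Example Use Cases for the format)
your_case_study = """
Case: Ethical decision scenario
Context: [Scenario details]
"""


async def main():
    # Create participants
    participants = [
        OllamaParticipant(name="Model1", model_id="llama2:13b"),
        OllamaParticipant(name="Model2", model_id="phi3.5:3.8b"),
    ]

    # Create judge
    judge = Judge(name="Judge", model_id="llama2:13b")

    # Initialize discussion
    discussion = Discussion(participants=participants, judge=judge)

    # Run discussion; `async for` must run inside a coroutine
    async for response in discussion.discuss(your_case_study):
        print(response)


asyncio.run(main())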

Approach and Design Decisions

1. Asynchronous Processing
   - Uses asyncio for non-blocking operations
   - Implements streaming responses for real-time interaction
   - Handles multiple model responses concurrently
2. Modular Architecture
   - Easily extensible for new model types
   - Separation of concerns between participants and discussion management
   - Clean interfaces for adding new functionality
3. Judge Implementation
   - Monitors discussion quality and relevance
   - Aims to limit hallucination and keep the discussion factually accurate
   - Provides clear verdicts based on discussion context
4. Error Handling
   - Graceful handling of model failures
   - Proper context management
   - Clear error messages and logging
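
The asynchronous and error-handling points above can be illustrated with plain `asyncio`. This is a sketch under stated assumptions, not JuryLLM's implementation: `stream_reply`, `full_reply`, and `run_round` are hypothetical names, and the "models" are stubs that yield canned chunks.

```python
import asyncio
from typing import AsyncIterator, Dict, List


async def stream_reply(chunks: List[str]) -> AsyncIterator[str]:
    """Hypothetical stub model: yields its reply piece by piece."""
    for chunk in chunks:
        await asyncio.sleep(0)  # hand control back to the event loop between chunks
        yield chunk


async def full_reply(name: str, chunks: List[str], fail: bool = False) -> str:
    """Consume one model's stream and join the chunks into a full reply."""
    if fail:
        raise RuntimeError(f"{name} is unavailable")
    return "".join([chunk async for chunk in stream_reply(chunks)])


async def run_round() -> Dict[str, str]:
    """Run all stub models concurrently and collect one reply (or error) per model."""
    names = ["Model1", "Model2", "Model3"]
    results = await asyncio.gather(
        full_reply("Model1", ["Guilty ", "on count one."]),
        full_reply("Model2", ["Not ", "guilty."]),
        full_reply("Model3", [], fail=True),
        return_exceptions=True,  # a failing model does not cancel the others
    )
    # Turn exceptions into readable error entries instead of crashing the round.
    return {
        name: f"[error: {result}]" if isinstance(result, Exception) else result
        for name, result in zip(names, results)
    }


round_result = asyncio.run(run_round())
print(round_result)
```

Using `return_exceptions=True` is one way to get graceful handling of model failures: the round still completes, and the failed participant shows up as an error entry that can be logged or skipped.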

Example Use Cases

Complex Problem Solving

case_study = """
Case: Complex mathematical problem with multiple rules
Rules:
1. Rule one details...
2. Rule two details...
Questions:
1. Question one...
2. Question two...
"""

Decision Making

case_study = """
Case: Ethical decision scenario
Context: [Scenario details]
Questions to consider:
1. Ethical implications
2. Practical considerations
"""

Best Practices

1. Model Selection
   - Choose models based on task requirements
   - Consider model strengths and weaknesses
   - Balance between performance and resource usage
2. Prompt Engineering
   - Provide clear, structured case studies
   - Include relevant context and constraints
   - Define expected output format
3. Performance Optimization
   - Use appropriate model sizes
   - Implement proper caching strategies
   - Monitor resource usage
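
As one possible reading of "caching strategies", the sketch below memoizes replies by model and prompt so that an identical request is not sent twice. `ResponseCache` and `fake_model` are hypothetical and not part of JuryLLM; a real setup would pass a function that actually calls a model.

```python
import hashlib
from typing import Callable, Dict, List


class ResponseCache:
    """In-memory cache keyed by a hash of (model_id, prompt)."""

    def __init__(self) -> None:
        self._store: Dict[str, str] = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def _key(model_id: str, prompt: str) -> str:
        return hashlib.sha256(f"{model_id}\n{prompt}".encode("utf-8")).hexdigest()

    def get_or_call(
        self, model_id: str, prompt: str, call_model: Callable[[str, str], str]
    ) -> str:
        """Return the cached reply, calling the model only on a cache miss."""
        key = self._key(model_id, prompt)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        self._store[key] = call_model(model_id, prompt)
        return self._store[key]


calls: List[str] = []


def fake_model(model_id: str, prompt: str) -> str:
    """Stub that records each call instead of contacting a real model."""
    calls.append(model_id)
    return f"{model_id} reply to: {prompt}"


cache = ResponseCache()
first = cache.get_or_call("llama2:13b", "Summarise the case", fake_model)
second = cache.get_or_call("llama2:13b", "Summarise the case", fake_model)
print(cache.hits, cache.misses, len(calls))  # prints 1 1 1
```

Caching like this is mainly useful when identical prompts are expected to produce repeatable replies; with sampled output, a cached reply hides the variation a fresh call would give.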

Troubleshooting

Common issues and solutions:

Model loading errors: Ensure Ollama is running and models are pulled
Memory issues: Adjust model sizes or reduce participant count
Async errors: Check for proper async/await usage
API rate limits: Implement proper rate limiting for API calls
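
For the first item, a quick check that Ollama is reachable and that the required models are pulled might look like the sketch below. It assumes Ollama's default local address (`http://localhost:11434`) and its `/api/tags` endpoint for listing local models; the helper names are hypothetical and not part of JuryLLM.

```python
import http.client
import json
import urllib.request
from typing import List, Optional


def installed_ollama_models(
    base_url: str = "http://localhost:11434", timeout: float = 2.0
) -> Optional[List[str]]:
    """Return local model names from Ollama's /api/tags, or None if it cannot be reached."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            data = json.load(resp)
    except (OSError, ValueError, http.client.HTTPException):
        # Connection refused, timeout, or a non-JSON reply: treat as "not running".
        return None
    return [model["name"] for model in data.get("models", [])]


def missing_models(required: List[str], installed: List[str]) -> List[str]:
    """List the required models that are not in the installed list."""
    return [name for name in required if name not in installed]


def check_setup(required: List[str]) -> str:
    """Return a short human-readable status line for the given required models."""
    installed = installed_ollama_models()
    if installed is None:
        return "Ollama does not appear to be running"
    missing = missing_models(required, installed)
    if missing:
        return "Missing models, run `ollama pull` for: " + ", ".join(missing)
    return "All required models are available"
```

Calling `check_setup(["llama2:13b", "phi3.5:3.8b"])` before starting a discussion gives a clearer message than a failure deep inside a model call.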

Note: This is an experimental project aimed at exploring collaborative AI approaches. The system is under active development and subject to changes.

Release history

0.1.2 (this version): Dec 11, 2024
0.1.1: Dec 11, 2024
0.1.0: Dec 11, 2024

Download files

Source Distribution

juryllm-0.1.2.tar.gz (18.2 kB)

Uploaded Dec 11, 2024 Source

Built Distribution

juryLLM-0.1.2-py3-none-any.whl (8.4 kB)

Uploaded Dec 11, 2024 Python 3

File details

Details for the file juryllm-0.1.2.tar.gz.

File metadata

Download URL: juryllm-0.1.2.tar.gz
Upload date: Dec 11, 2024
Size: 18.2 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.0.1 CPython/3.13.0

File hashes

Hashes for juryllm-0.1.2.tar.gz
Algorithm	Hash digest
SHA256	`bfbb74d6bfedb91c43b5ea1facc4045fa60a25e3626abcf63681a1e34613459c`
MD5	`ac0df49a1fc0534b137f26fd82cf2f9e`
BLAKE2b-256	`55334043668541146b44f6527e5c904ad246887aa9809e74919eaccc0e7b12af`

File details

Details for the file juryLLM-0.1.2-py3-none-any.whl.

File metadata

Download URL: juryLLM-0.1.2-py3-none-any.whl
Upload date: Dec 11, 2024
Size: 8.4 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.0.1 CPython/3.13.0

File hashes

Hashes for juryLLM-0.1.2-py3-none-any.whl
Algorithm	Hash digest
SHA256	`ed726c73165e064fb65f75c748db398e390c2b1a740d16b462d4c7413c1a5796`
MD5	`b483b142467db4b09ebc4de050248ebb`
BLAKE2b-256	`e823719ea27780d399edcbe7261445a8632e66d1cfe0756a9a6d4bc9927da499`
