AI pipeline framework for Python.

License
- OSI Approved :: MIT License
Operating System
- OS Independent
Programming Language
- Python :: 3

Project description

LLMonPy

LLMonPy is a Python library that aims to make it easy to build AI systems with mixture-of-agents response generators, mixture-of-agents as judge, and synthetic data for in-context learning (ICL). A typical Python program that uses LLMonPy has teams of models generate responses, then has another team rank them; the best responses are used as examples to improve the quality of the next round of response generation. The ranking process also generates a lot of question/best answer/worse answer (QBaWa) data that can be used for fine-tuning models.

Getting Started

Setup Virtual Environment

I recommend setting up a virtual environment to isolate Python dependencies.

python3 -m venv .venv
source .venv/bin/activate

Install Package

Install the package from PyPI. This takes a while because it also installs the Python clients of multiple LLM providers:

pip install llmonpy

Environment Variables

LLMonPy uses a lot of models, so you will probably need several API keys to use it. On startup, it looks for these keys and initializes the clients for the models whose keys it finds. The following environment variables are used:

First Choice	Second Choice
`LLMONPY_OPENAI_API_KEY`	`OPENAI_API_KEY`
`LLMONPY_ANTHROPIC_API_KEY`	`ANTHROPIC_API_KEY`
`LLMONPY_MISTRAL_API_KEY`	`MISTRAL_API_KEY`
`LLMONPY_GEMINI_API_KEY`	`GEMINI_API_KEY`
`LLMONPY_FIREWORKS_API_KEY`	`FIREWORKS_API_KEY`
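
The "first choice, then second choice" rule in the table can be sketched as follows. This only illustrates the lookup order and is not LLMonPy's actual startup code; the `PROVIDERS` tuple and the `find_api_keys` helper are hypothetical names introduced for the example.

```python
import os

# Provider prefixes taken from the table above.
PROVIDERS = ("OPENAI", "ANTHROPIC", "MISTRAL", "GEMINI", "FIREWORKS")


def find_api_keys(environ=os.environ):
    """Return {provider: key} for every provider whose key is set.

    Hypothetical helper: checks LLMONPY_<PROVIDER>_API_KEY first and
    falls back to <PROVIDER>_API_KEY.
    """
    found = {}
    for provider in PROVIDERS:
        first_choice = environ.get(f"LLMONPY_{provider}_API_KEY")
        second_choice = environ.get(f"{provider}_API_KEY")
        key = first_choice or second_choice
        if key:
            found[provider] = key
    return found


# Show which providers would be usable in the current shell.
print(sorted(find_api_keys()))
```

Running this before `llmonpy models` is a quick way to check which keys your shell actually exports.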

Testing Setup

To list the models that are available, use this command:

llmonpy models

To test basic prompting, you can use the following command:

llmonpy prompt

Tourneys, Cycles, and GARs require several models to be available. To make this easier for new users, these tests just require the FIREWORKS_API_KEY and OPENAI_API_KEY. I chose Fireworks.ai and OpenAI as the default providers because they have a large number of fast, high-quality models available.

To test the tourneys, you can use the following command (it will cost about $0.03 or less):

llmonpy tourney

Note: If you are using Google clients, you will get this error message: "WARNING: All log messages before absl::InitializeLog() is called are written to STDERR I0000 00:00:1722383946.306228 139136289 config.cc:230] gRPC experiments enabled: call_status", followed by about 10 lines of error messages. This is apparently a bug in the gRPC client, but it does not seem to affect the results.

To test the AdaptiveICLCycle, you can use the following command (it will cost about $0.9 or less):

llmonpy cycle

To test mixture of agents, which I call GenerateAggregateRank (GAR), you can use the following command (it will cost about $0.12 or less):

llmonpy gar

Creating Prompts

Prompts are classes that inherit from the LLMonPyPrompt class. The class defines the prompt_text and the data that is used to render the prompt (the prompt_text is used as a Jinja2 template). The class must also define an LLMonPyOutput nested class that defines the output of the prompt. If the prompt is used in a tourney, the class must also define a JudgePrompt nested class that inherits from the TournamentJudgePrompt class.

prompt_text: Class field that is the text used for a Jinja2 template
constructor: The constructor defines the data that is used to render the prompt
to_dict: Method that returns a dictionary of the data that is used to render the prompt and to store the input data in the trace
LLMonPyOutput: Nested class that defines the output of the prompt
JudgePrompt: Inherits from TournamentJudgePrompt and is used to rank 2 outputs from the prompt. The output of a JudgePrompt is always TournamentJudgePrompt.LLMonPyOutput

You can see an example of a prompt in the steps_prompt.py file. If a prompt is used in an AdaptiveICLCycle, the prompt will probably include {% if example_list %}. The example_list is a list of examples of good responses; its items are instances of the LLMonPyOutput class.
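
The pieces listed above can be sketched as one class. This is only an outline of the shape, not the library's API: `PromptBase` and `JudgePromptBase` are stand-ins defined here so the sketch runs without LLMonPy installed (real code would inherit from LLMonPyPrompt and TournamentJudgePrompt), and `NameIdeasPrompt`, its fields, and the `to_dict` method on the output class are hypothetical. See steps_prompt.py for the authoritative example.

```python
# Stand-in base classes (assumptions), used only to keep the sketch runnable.
class PromptBase:
    prompt_text = ""

    def to_dict(self):
        raise NotImplementedError


class JudgePromptBase(PromptBase):
    pass


class NameIdeasPrompt(PromptBase):  # hypothetical example prompt
    # Jinja2 template text. example_list is only filled in when earlier
    # rounds have produced good responses to show as examples.
    prompt_text = """
    Suggest a product name for: {{ product_description }}
    {% if example_list %}
    Here are examples of good responses:
    {% for example in example_list %}
    - {{ example.name }}
    {% endfor %}
    {% endif %}
    """

    def __init__(self, product_description):
        # The constructor defines the data used to render the template.
        self.product_description = product_description

    def to_dict(self):
        # Used to render the template and to record the input in the trace.
        return {"product_description": self.product_description}

    class LLMonPyOutput:
        # Shape of the model's answer for this prompt (assumed field: name).
        def __init__(self, name):
            self.name = name

        def to_dict(self):
            return {"name": self.name}

    class JudgePrompt(JudgePromptBase):
        # Compares two outputs of NameIdeasPrompt; variable names are assumptions.
        prompt_text = """
        Which name is better for: {{ product_description }}
        Candidate 1: {{ candidate_1.name }}
        Candidate 2: {{ candidate_2.name }}
        """


prompt = NameIdeasPrompt("a solar-powered kettle")
print(prompt.to_dict())
```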

Tourneys

Tourneys are LLMonPySteps that use the LLMonPyTournament class to have multiple LLMs generate responses to a prompt. A tourney then ranks the responses by using LLM judges to compare each output against every other output. The winner of the tourney is the response with the most victories. There is an example of a tourney in the steps_prompt.py file.
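
The ranking idea, every output compared against every other output and the most victories wins, can be sketched in plain Python. This is a simplified illustration rather than the LLMonPyTournament implementation: `rank_by_round_robin` and `prefer_longer` are hypothetical, and a single toy judge function stands in for the LLM judges.

```python
from itertools import combinations


def rank_by_round_robin(responses, judge):
    """Rank responses by their number of pairwise victories.

    judge(a, b) is a hypothetical callable that returns whichever of the
    two responses it considers better.
    """
    victories = [0] * len(responses)
    for i, j in combinations(range(len(responses)), 2):
        winner = judge(responses[i], responses[j])
        victories[i if winner is responses[i] else j] += 1
    # Most victories first; ties keep their original order (sorted is stable).
    order = sorted(range(len(responses)), key=lambda index: -victories[index])
    return [(responses[index], victories[index]) for index in order]


# Toy judge for demonstration only: prefers the longer answer.
def prefer_longer(a, b):
    return a if len(a) >= len(b) else b


ranking = rank_by_round_robin(["ok", "a fuller answer", "short one"], prefer_longer)
print(ranking[0])  # the response with the most victories and its win count
```

With n responses this needs n * (n - 1) / 2 comparisons per judge, which is why the cost of a tourney grows quickly as you add models.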

AdaptiveICLCycle

The AdaptiveICLCycle uses a tourney to generate a list of examples of good responses. It then re-runs the tourney with the good responses used as examples to improve the quality of the responses. The cycle continues until it reaches a limit you set or the responses have stopped improving. There is an example of a cycle in the steps_prompt.py file.
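
The control flow of such a cycle can be sketched as a loop. This is a sketch under stated assumptions, not the AdaptiveICLCycle code: `adaptive_cycle`, `run_tourney`, and `is_better` are hypothetical names, and because this description does not say how LLMonPy decides that responses have stopped improving, that test is left as a parameter.

```python
def adaptive_cycle(run_tourney, is_better, max_cycles=4, num_examples=2):
    """Re-run a tourney, feeding the best responses back in as examples.

    run_tourney(example_list) -> responses ranked best-first (hypothetical).
    is_better(new_best, old_best) -> bool, the assumed improvement test.
    """
    example_list = []
    best = None
    for _ in range(max_cycles):
        ranked = run_tourney(example_list)
        if best is not None and not is_better(ranked[0], best):
            break  # responses have stopped improving
        best = ranked[0]
        example_list = ranked[:num_examples]
    return best


# Toy stand-in: "responses" are numbers and quality plateaus at 3.
def toy_tourney(example_list):
    start = max(example_list, default=0)
    return sorted({min(start + 1, 3), start}, reverse=True)


best = adaptive_cycle(toy_tourney, is_better=lambda new, old: new > old)
print(best)
```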

GenerateAggregateRank (GAR)

The GenerateAggregateRank (GAR) is a mixture of agents that uses a list of "generator" models to generate a first round of responses. The first round of responses is used as examples for the next round of responses. That process is repeated for the number of rounds you set. The last round of responses is ranked by a list of "judge" models.
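
The round structure described above can be sketched like this. It is a minimal illustration, not the GAR implementation: `generate_aggregate_rank` and `make_toy_generator` are hypothetical, the generators are plain functions instead of model calls, and `sorted` stands in for the judge models.

```python
def generate_aggregate_rank(generators, rank, rounds=2):
    """Run several generation rounds, then rank the final round.

    generators: callables taking the previous round's responses as examples.
    rank: callable that orders the final responses (the judges' role).
    """
    examples = []
    for _ in range(rounds):
        # Every generator sees all responses from the previous round.
        examples = [generate(examples) for generate in generators]
    return rank(examples)


# Toy generator for demonstration: reports how many examples it was shown.
def make_toy_generator(name):
    def generate(examples):
        return f"{name}(saw {len(examples)} examples)"
    return generate


generators = [make_toy_generator("model_a"), make_toy_generator("model_b")]
ranked = generate_aggregate_rank(generators, rank=sorted, rounds=2)
print(ranked)
```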

Trace Viewer

LLMonPy has a trace viewer that can help you understand how your pipeline is working. To start it, use this command:

llmonpy_viewer

The UI is at http://localhost:2304.

The trace viewer lets you see the "Victory Report" of tourneys and cycles. It also lets you see the input data, the output data, the logs for each step, and the sub-steps of each step. The Victory Report divides the total cost of each model's responses by the number of victories the model had in one-on-one battles. The trace data is stored in a SQLite database in the "data" directory of your project directory.
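
The Victory Report figure is simple arithmetic: total cost divided by victories. A small sketch, with made-up numbers and a hypothetical `cost_per_victory` helper; returning None for a model with no victories is an assumption, since the viewer's handling of that case is not described here.

```python
def cost_per_victory(total_cost, victories):
    """Return cost per victory, or None when the model won no battles."""
    return total_cost / victories if victories else None


# Made-up costs (in dollars) and victory counts, for illustration only.
report = {
    "model_a": cost_per_victory(0.012, 6),
    "model_b": cost_per_victory(0.030, 5),
    "model_c": cost_per_victory(0.004, 0),
}

for model, value in report.items():
    print(model, "no victories" if value is None else f"${value:.4f} per victory")
```

A lower figure means a model wins battles more cheaply, which helps when deciding which models are worth keeping in a team.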

Training Data (QBaWa)

LLMonPy stores the results of all the one-on-one battles used to rank the responses in a tourney. This data can be used to fine-tune models. It is indexed by the name of the prompt class. To get a list of prompts with training data, use the following command:

llmonpy qbawa_list

This command will return a list of prompt names. To get the actual training data as JSON, use the following command:

llmonpy qbawa -name=<prompt_name>

Release history

Version	Release date
0.1.35 (this version)	Jan 30, 2025
0.1.34	Sep 16, 2024
0.1.33	Sep 15, 2024
0.1.32	Sep 15, 2024
0.1.31	Sep 15, 2024
0.1.30	Sep 5, 2024
0.1.29	Sep 3, 2024
0.1.28	Aug 14, 2024
0.1.27	Aug 14, 2024
0.1.26	Aug 13, 2024
0.1.25	Aug 13, 2024
0.1.24	Aug 13, 2024
0.1.23	Aug 13, 2024
0.1.22	Aug 13, 2024
0.1.21	Aug 12, 2024
0.1.19	Aug 9, 2024
0.1.18	Aug 8, 2024
0.1.17	Aug 7, 2024
0.1.16	Jul 31, 2024
0.1.14	Jul 30, 2024
0.1.13	Jul 24, 2024
0.1.12	Jul 24, 2024
0.1.10	Jul 24, 2024
0.1.9	Jul 24, 2024
0.1.8	Jul 24, 2024
0.1.7	Jul 24, 2024
0.1.6	Jul 24, 2024
0.1.5	Jul 23, 2024
0.1.4	Jul 23, 2024
0.1.3	Jul 23, 2024
0.1.0	Jul 23, 2024

Download files

Source Distribution

llmonpy-0.1.35.tar.gz (2.3 MB)

Uploaded Jan 30, 2025 Source

Built Distribution

llmonpy-0.1.35-py3-none-any.whl (2.4 MB)

Uploaded Jan 30, 2025 Python 3

File details

Details for the file llmonpy-0.1.35.tar.gz.

File metadata

Download URL: llmonpy-0.1.35.tar.gz
Upload date: Jan 30, 2025
Size: 2.3 MB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/5.1.1 CPython/3.12.0

File hashes

Hashes for llmonpy-0.1.35.tar.gz
Algorithm	Hash digest
SHA256	`99ae0b56162c4bb622ddfce9f5f5e3d7a1b224be844459e98255b860ba5a7504`
MD5	`ad8f935c7c5b5a60c4d237a7204b0f57`
BLAKE2b-256	`8613ded552961bf9aaef114a8787cda48d5f4630ef6696682838c04a00f7dde1`

File details

Details for the file llmonpy-0.1.35-py3-none-any.whl.

File metadata

Download URL: llmonpy-0.1.35-py3-none-any.whl
Upload date: Jan 30, 2025
Size: 2.4 MB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/5.1.1 CPython/3.12.0

File hashes

Hashes for llmonpy-0.1.35-py3-none-any.whl
Algorithm	Hash digest
SHA256	`a9ada5195ee50997493d67a5cefc790b46609d9e51fd54c760d77a8bf534294e`
MD5	`8cecc5b067ecad64f65ea63d550aae45`
BLAKE2b-256	`fb16631cd2a35b484132620626cb3bf330bd5e083770b35f5efafad4e0ae9584`
