Opticonomy Prompt Driven Model Evaluation (PDME)
Step 1: Installation and Environment
Install Package
pip install opticonomy-pdme
Create and Activate the Virtual Environment
- Set up a Python virtual environment and activate it (Linux):
python3 -m venv .venv
source .venv/bin/activate
- Set up a Python virtual environment and activate it (Windows / VS Code / Bash):
python -m venv venv
source venv/Scripts/activate
- Install dependencies from the requirements.txt file:
pip install -r requirements.txt
Sample Use Cases
Storytelling
python pdme_client.py --eval_model openai/gpt-3.5-turbo-0125 --test_model openai-community/gpt2 --seed_1 "an old Englishman" --seed_2 "finding happiness" --seed_3 "rain" --seed_4 "old cars"
python pdme_client.py --eval_model openai/gpt-3.5-turbo-0125 --test_model distilbert/distilgpt2 --seed_1 "an old Englishman" --seed_2 "finding happiness" --seed_3 "rain" --seed_4 "old cars"
python pdme_client.py --eval_model openai/gpt-4o --test_model distilbert/distilgpt2 --seed_1 "an old Englishman" --seed_2 "finding happiness" --seed_3 "rain" --seed_4 "old cars"
python pdme_client.py --eval_model openai/gpt-4o --test_model openai-community/gpt2 --seed_1 "an old Englishman" --seed_2 "finding happiness" --seed_3 "rain" --seed_4 "old cars"
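The four commands above cover every pairing of two eval models with two test models. As a minimal sketch, the same command lines can be generated from Python. The flag names and model identifiers are taken from the commands above; the helper names `build_command` and `all_commands` are hypothetical and not part of the package.

```python
import shlex

# Model identifiers and seeds copied from the sample commands above.
EVAL_MODELS = ["openai/gpt-3.5-turbo-0125", "openai/gpt-4o"]
TEST_MODELS = ["openai-community/gpt2", "distilbert/distilgpt2"]
SEEDS = ["an old Englishman", "finding happiness", "rain", "old cars"]


def build_command(eval_model, test_model, seeds):
    """Return the argument list for one pdme_client.py run (hypothetical helper)."""
    argv = [
        "python", "pdme_client.py",
        "--eval_model", eval_model,
        "--test_model", test_model,
    ]
    # Seeds map to --seed_1, --seed_2, ... in order.
    for index, seed in enumerate(seeds, start=1):
        argv += [f"--seed_{index}", seed]
    return argv


def all_commands():
    """Return one argument list per (eval model, test model) pairing."""
    return [build_command(e, t, SEEDS) for e in EVAL_MODELS for t in TEST_MODELS]


if __name__ == "__main__":
    for argv in all_commands():
        # shlex.join quotes arguments that contain spaces.
        print(shlex.join(argv))
```

Each argument list can be passed to `subprocess.run(argv, check=True)` to run the evaluations one after another.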
Overview
The method uses a single text-generation model, referred to as the eval model, to evaluate any other text-generation model on any topic. The evaluation works like this:
- We write a text prompt describing what kind of question the eval model should generate, and provide seeds that are picked at random to vary the question.
- The question is sent to the model being tested, which generates a response.
- The eval model also generates an answer to the same question.
- Using a second text prompt that we write, the eval model compares the two answers and picks the winner. (The model that does the comparison does not have to be the eval model, but using the same one simplifies inference.)
This method allows us to evaluate models on any topic, such as storytelling, programming, finance, and Q&A.
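The steps above can be sketched as a single evaluation round. This is an illustrative outline under stated assumptions, not the package's implementation: models are represented as plain callables from prompt to text, and the function name `run_round`, the prompt wording, and the A/B verdict format are all hypothetical.

```python
import random
from typing import Callable, Sequence

# Assumption: a model is any callable that maps a prompt to generated text.
Model = Callable[[str], str]


def run_round(eval_model: Model, test_model: Model,
              question_template: str, seeds: Sequence[str],
              rng: random.Random) -> str:
    """Run one round and return 'test', 'eval', or 'unclear'."""
    # Step 1: pick a seed at random and let the eval model write the question.
    seed = rng.choice(list(seeds))
    question = eval_model(question_template.format(seed=seed))

    # Steps 2 and 3: both models answer the same question.
    test_answer = test_model(question)
    eval_answer = eval_model(question)

    # Step 4: the eval model compares the two answers (hypothetical wording).
    judge_prompt = (
        f"Question: {question}\n"
        f"Answer A: {test_answer}\n"
        f"Answer B: {eval_answer}\n"
        "Which answer is better? Reply with A or B."
    )
    verdict = eval_model(judge_prompt).strip().upper()
    if verdict.startswith("A"):
        return "test"
    if verdict.startswith("B"):
        return "eval"
    return "unclear"  # the verdict could not be parsed
```

Repeating the round many times and counting how often the test model wins gives a score for that model. In this sketch the test model's answer is always shown first; swapping the order between rounds is one way to reduce any position bias in the comparison.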
Technical Description
See above for the installation and running instructions.
Example Use Case
Suppose you want to evaluate a model's ability to write stories. PDME should be usable in the following way:
- Bootstrap Prompt - First generate a bootstrap prompt using random seeds, e.g.
(continue....)
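As an illustration of the bootstrap step only, the sketch below fills a prompt template with one randomly chosen option per seed slot. The template wording, the `make_bootstrap_prompt` helper, and the idea of several options per slot are assumptions made for this example; the package's actual bootstrap prompt is not shown in this description.

```python
import random

# Hypothetical template: the package's real bootstrap prompt may differ.
TEMPLATE = (
    "Write a story prompt that involves {seed_1}, {seed_2}, "
    "{seed_3}, and {seed_4}."
)


def make_bootstrap_prompt(seed_options, rng):
    """Pick one option per seed slot at random and fill the template."""
    chosen = {slot: rng.choice(options) for slot, options in seed_options.items()}
    return TEMPLATE.format(**chosen)


if __name__ == "__main__":
    # Seed values taken from the storytelling commands above,
    # with one option per slot.
    options = {
        "seed_1": ["an old Englishman"],
        "seed_2": ["finding happiness"],
        "seed_3": ["rain"],
        "seed_4": ["old cars"],
    }
    print(make_bootstrap_prompt(options, random.Random()))
```

Listing more than one option per slot makes each generated prompt vary from run to run.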
File details
Details for the file opticonomy-pdme-0.1.2.tar.gz.
File metadata
- Download URL: opticonomy-pdme-0.1.2.tar.gz
- Upload date:
- Size: 9.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.9.19
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | faebe1e1c38375aaf26d6cc620e3f5de3109500ea0b6881d1a41b2f7a846f245 |
| MD5 | 2e2cd8af8747c7fb26f9ec1556036ecc |
| BLAKE2b-256 | 69b7948f215864930fb6eca6c85209fc26b63a11bb834bf8b1e06949e50564a2 |
File details
Details for the file opticonomy_pdme-0.1.2-py3-none-any.whl.
File metadata
- Download URL: opticonomy_pdme-0.1.2-py3-none-any.whl
- Upload date:
- Size: 10.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.9.19
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 526a1c1171fd114e32fbe46e35eb7ec708a23b6daf102fb399cc062a11c28f26 |
| MD5 | 5b9c07bf764470542e411d482f15ac96 |
| BLAKE2b-256 | ffcb431c2853815cf6a0b1b096269cd698d0a2a73869783a6388e97888a613e2 |
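A downloaded file can be checked against the SHA256 digests listed above. The following is a minimal sketch using only the Python standard library; the helper names `sha256_of` and `verify` are hypothetical, and the file is assumed to have been saved under its original name.

```python
import hashlib
from pathlib import Path

# SHA256 digests copied from the file details above.
EXPECTED_SHA256 = {
    "opticonomy-pdme-0.1.2.tar.gz":
        "faebe1e1c38375aaf26d6cc620e3f5de3109500ea0b6881d1a41b2f7a846f245",
    "opticonomy_pdme-0.1.2-py3-none-any.whl":
        "526a1c1171fd114e32fbe46e35eb7ec708a23b6daf102fb399cc062a11c28f26",
}


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path):
    """Return True if the file's digest matches the one listed for its name."""
    expected = EXPECTED_SHA256[Path(path).name]
    return sha256_of(path) == expected
```

For example, `verify("opticonomy-pdme-0.1.2.tar.gz")` returns `True` only if the local file's digest equals the listed value.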