Build high-quality synthetic datasets with AI feedback from 200+ LLMs

OpenPO 🐼

OpenPO simplifies building synthetic datasets for preference tuning from 200+ LLMs.

Resources	Notebooks
Building dataset with OpenPO and PairRM	📔 Notebook

What is OpenPO?

OpenPO is an open-source library that simplifies building synthetic datasets for LLM preference tuning. By collecting outputs from 200+ LLMs and evaluating them with research-backed methodologies, OpenPO helps developers build better fine-tuned language models with minimal effort.

Key Features

🔌 Multiple LLM Support: Call 200+ models from HuggingFace and OpenRouter
🧪 Research-Backed Methodologies: Implementations of data synthesis methodologies from recent research papers.
🤝 OpenAI API Compatibility: Support for OpenAI API format
💾 Flexible Storage: Out-of-the-box storage providers for HuggingFace and S3.

Installation

Install from PyPI (recommended)

OpenPO uses pip for installation. Run the following command in the terminal to install OpenPO:

pip install openpo

Install from source

Clone the repository first, then run the following commands:

cd openpo
poetry install

Getting Started

Set the environment variables first:

export HF_API_KEY=<your-api-key>
export OPENROUTER_API_KEY=<your-api-key>
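
Before making requests, it can help to confirm that the keys are visible to Python. The sketch below is not part of OpenPO; `missing_api_keys` and `REQUIRED_KEYS` are hypothetical names, and it assumes you want both variables set (you may only need the one for the provider you call).

```python
import os

# Variable names taken from the export commands above.
REQUIRED_KEYS = ("HF_API_KEY", "OPENROUTER_API_KEY")


def missing_api_keys(env=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_KEYS if not env.get(name)]


missing = missing_api_keys()
if missing:
    print("Missing environment variables: " + ", ".join(missing))
```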

To get started, pass in a list of model names of your choice.

[!NOTE] OpenPO requires the provider name to be prepended to the model identifier.

from openpo.client import OpenPO

# Placeholder prompts; replace with your own.
PROMPT = "You are a helpful assistant."
MESSAGE = "Explain preference tuning in one sentence."

client = OpenPO()

response = client.completions(
    models=[
        "huggingface/Qwen/Qwen2.5-Coder-32B-Instruct",
        "huggingface/mistralai/Mistral-7B-Instruct-v0.3",
        "huggingface/microsoft/Phi-3.5-mini-instruct",
    ],
    messages=[
        {"role": "system", "content": PROMPT},
        {"role": "user", "content": MESSAGE},
    ],
)
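
The provider prefix noted above can be checked before a request is sent. This is a minimal sketch, not OpenPO's own parsing logic: `split_model_id` and `KNOWN_PROVIDERS` are hypothetical names, and the provider list assumes only the two providers mentioned in this README.

```python
# Providers mentioned in this README; an assumption, not an exhaustive list.
KNOWN_PROVIDERS = ("huggingface", "openrouter")


def split_model_id(model_id):
    """Split 'provider/model-name' into (provider, model_name).

    Only the first slash separates the provider, so model names that
    contain slashes themselves (e.g. 'Qwen/Qwen2.5-...') stay intact.
    """
    provider, sep, name = model_id.partition("/")
    if not sep or not name or provider not in KNOWN_PROVIDERS:
        raise ValueError(
            f"expected '<provider>/<model>' with provider in {KNOWN_PROVIDERS}, "
            f"got {model_id!r}"
        )
    return provider, name
```

Running this over a model list before calling the client surfaces a missing prefix early, rather than as a provider-side error.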

You can also call models through OpenRouter.

# Make a request to OpenRouter
client = OpenPO()

response = client.completions(
    models=[
        "openrouter/qwen/qwen-2.5-coder-32b-instruct",
        "openrouter/mistralai/mistral-7b-instruct-v0.3",
        "openrouter/microsoft/phi-3.5-mini-128k-instruct",
    ],
    messages=[
        {"role": "system", "content": PROMPT},
        {"role": "user", "content": MESSAGE},
    ],
)

OpenPO takes model parameters as a dictionary passed via params. See the documentation for more detail.

response = client.completions(
    models=[
        "huggingface/Qwen/Qwen2.5-Coder-32B-Instruct",
        "huggingface/mistralai/Mistral-7B-Instruct-v0.3",
        "huggingface/microsoft/Phi-3.5-mini-instruct",
    ],
    messages=[
        {"role": "system", "content": PROMPT},
        {"role": "user", "content": MESSAGE},
    ],
    params={
        "max_tokens": 500,
        "temperature": 1.0,
    },
)

Storing Data

Use the out-of-the-box storage classes to upload and download data.

from openpo.storage.huggingface import HuggingFaceStorage
hf_storage = HuggingFaceStorage(repo_id="my-dataset-repo")

# push data to repo
preference = {"prompt": "text", "preferred": "response1", "rejected": "response2"}
hf_storage.push_to_repo(data=preference)

# Load data from repo
data = hf_storage.load_from_repo()
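
A preference record like the one pushed above can be assembled from two candidate responses once one of them has been chosen. The helper below is a hypothetical sketch, not an OpenPO function; it only reuses the `prompt`, `preferred`, and `rejected` keys from the example and assumes exactly two candidates.

```python
def build_preference(prompt, responses, preferred_index):
    """Build a preference record from two responses and the index of the winner."""
    if len(responses) != 2:
        raise ValueError("expected exactly two responses")
    if preferred_index not in (0, 1):
        raise ValueError("preferred_index must be 0 or 1")
    return {
        "prompt": prompt,
        "preferred": responses[preferred_index],
        # The other response of the pair becomes the rejected one.
        "rejected": responses[1 - preferred_index],
    }
```

The resulting dictionary has the same shape as `preference` in the example, so it could be passed as `data` to `push_to_repo` in the same way.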

Structured Outputs (JSON Mode)

OpenPO supports structured outputs using a Pydantic model.

[!NOTE] OpenRouter does not natively support structured outputs. This leads to inconsistent behavior from some models when structured output is used with OpenRouter.

It is recommended to use HuggingFace models for structured output.

from pydantic import BaseModel
from openpo.client import OpenPO

client = OpenPO()

# Schema the model's reply should conform to
class ResponseModel(BaseModel):
    response: str


# PROMPT and MESSAGE are your own prompt strings
res = client.completions(
    models=["huggingface/Qwen/Qwen2.5-Coder-32B-Instruct"],
    messages=[
        {"role": "system", "content": PROMPT},
        {"role": "user", "content": MESSAGE},
    ],
    params={
        "response_format": ResponseModel,
    },
)
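
Because some models may return text that does not match the requested schema, it can be worth validating a reply before storing it. The sketch below uses only the standard library and works on a raw reply string; how that string is extracted from the OpenPO response is not shown here, and `parse_structured_reply` is a hypothetical helper, not part of the library.

```python
import json


def parse_structured_reply(raw, required_fields=("response",)):
    """Return the reply as a dict, or None if it is not a JSON object
    containing every required field."""
    try:
        data = json.loads(raw)
    except (TypeError, ValueError):
        # Not a string, or not valid JSON (JSONDecodeError is a ValueError).
        return None
    if not isinstance(data, dict):
        return None
    if any(field not in data for field in required_fields):
        return None
    return data
```

The default `required_fields` mirrors the single `response` field of `ResponseModel`; replies that come back as `None` can be skipped or retried.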

Contributing

Contributions are what make open source amazingly special! Here's how you can help:

Development Setup

Clone the repository

git clone https://github.com/yourusername/openpo.git
cd openpo

Install Poetry (dependency management tool)

curl -sSL https://install.python-poetry.org | python3 -

Install dependencies

poetry install

Development Workflow

Create a new branch for your feature

git checkout -b feature-name

Submit a Pull Request

Write a clear description of your changes
Reference any related issues

openpo 0.4.2
