
Preference Optimization made easy


OpenPO 🐼


Streamline LLM Preference Optimization through effortless human feedback collection.


What is OpenPO?

OpenPO is an open source library that simplifies the process of collecting, managing, and leveraging human feedback for LLM preference optimization. By automating the comparison of different LLM outputs and gathering human feedback, OpenPO helps developers build better, more fine-tuned language models with minimal effort.

Key Features

  • 🔌 Multiple LLM Support: Call any model from HuggingFace and OpenRouter, including popular models like GPT, Claude, Llama, and Mixtral

  • 🤝 OpenAI API Compatibility: Seamlessly integrate with OpenAI-style client APIs for easy migration and familiar developer experience

  • 💾 Flexible Storage: Pluggable adapters for your preferred datastore, supporting various data persistence options

  • 🎯 Fine-tuning Ready: Structured data output ready for immediate model fine-tuning and preference optimization

Installation

Install from PyPI (recommended)

OpenPO is installed with pip. Run the following command in your terminal:

pip install openpo

Install from source

Clone the repository first, then run the following commands:

cd openpo
poetry install

Getting Started

By default, the OpenPO client uses HuggingFace's InferenceClient to call models available on the HuggingFace Model Hub.

from openpo.client import OpenPO

client = OpenPO(api_key="your-huggingface-api-key")

# PROMPT and MESSAGE are placeholders for your system prompt and user message
response = client.chat.completions.create_preference(
    model="Qwen/Qwen2.5-Coder-32B-Instruct",
    messages=[
        {"role": "system", "content": PROMPT},
        {"role": "user", "content": MESSAGE},
    ],
    diff_frequency=0.5, # generate preference responses 50% of the time
)

print(response.choices[0].message.content)
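The diff_frequency parameter controls how often a call returns a preference pair instead of a single response. Conceptually it acts as a random gate, as in the sketch below (an illustration only, not OpenPO's actual internals):

import random

def generates_pair(diff_frequency: float) -> bool:
    # Illustration only: with diff_frequency=0.5, roughly half of all
    # calls sample a second response for preference comparison.
    return random.random() < diff_frequency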

OpenPO also works with OpenRouter.

# make a request to OpenRouter
from openpo.client import OpenPO

client = OpenPO(
    api_key='your-openrouter-api-key',
    base_url="https://openrouter.ai/api/v1/chat/completions"
)

response = client.chat.completions.create_preference(
    model= "qwen/qwen-2.5-coder-32b-instruct",
    message=[
        {"role": "system", "content": PROMPT},
        {"role": "user", "content": MESSAGE},
    ],
    diff_frequency=0.5
)

print(response.choices[0].message.content)

You can pass a dictionary to the pref_params argument to control the randomness of the second response when the comparison logic is triggered. Currently supported parameters are temperature and frequency_penalty.

response = client.chat.completions.create_preference(
    model="Qwen/Qwen2.5-Coder-32B-Instruct",
    messages=[
        {"role": "system", "content": PROMPT},
        {"role": "user", "content": MESSAGE},
    ],
    diff_frequency=0.5,
    pref_params={"temperature": 1.5, "frequency_penalty": 0.5},
)
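Raising temperature or frequency_penalty makes the second response diverge more from the first, which produces more contrastive pairs for preference labeling.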

Saving Data

Use providers to easily upload and download data.

from openpo.client import OpenPO
from openpo.providers.huggingface import HuggingFaceStorage

storage = HuggingFaceStorage(repo_id="my-dataset-repo", api_key="hf-token")
client = OpenPO(api_key="your-huggingface-token")

# preference data needs to be in the format
# {"prompt": ..., "preferred": ..., "rejected": ...} for finetuning
preference = {}
storage.save_data(data=preference, filename="my-data.json")
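Once collected, records in this format can be loaded into a HuggingFace dataset for DPO-style fine-tuning. A minimal sketch, assuming the separate datasets library is installed (the sample record is hypothetical):

from datasets import Dataset

# Hypothetical preference records in the format described above
records = [
    {"prompt": "What is 2+2?", "preferred": "4", "rejected": "5"},
]

dataset = Dataset.from_list(records)
print(dataset)  # features: prompt, preferred, rejected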

Structured Outputs (JSON Mode)

OpenPO supports structured outputs using Pydantic models.

[!NOTE] OpenRouter does not natively support structured outputs, which leads to inconsistent behavior from some models when structured outputs are used. It is recommended to use HuggingFace models for structured outputs.

from pydantic import BaseModel
from openpo.client import OpenPO

client = OpenPO(api_key="your-huggingface-api-key")

class ResponseModel(BaseModel):
    response: str


res = client.chat.completions.create_preference(
    model="Qwen/Qwen2.5-Coder-32B-Instruct",
    messages=[
        {"role": "system", "content": PROMPT},
        {"role": "system", "content": MESSAGE},
    ],
    diff_frequency=0.5,
    response_format=ResponseModel,
)
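Assuming the request succeeds, the returned content should be a JSON string matching the schema, so it can be validated back into the Pydantic model (a sketch that assumes the OpenAI-style response shape used in the earlier examples and Pydantic v2):

parsed = ResponseModel.model_validate_json(res.choices[0].message.content)
print(parsed.response)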

Try Out

Set the environment variable first:

export HF_API_KEY=<your-api-key>

then run docker compose up --build to try the demo locally.

