
OpenPO 🐼


Streamline LLM Preference Optimization through effortless human feedback collection.


What is OpenPO?

OpenPO is an open source library that simplifies the process of collecting, managing, and leveraging human feedback for LLM preference optimization. By automating the comparison of different LLM outputs and gathering human feedback, OpenPO helps developers build better, more fine-tuned language models with minimal effort.

Key Features

  • 🔌 Multiple LLM Support: Call any model from HuggingFace and OpenRouter, including popular models like GPT, Claude, Llama, and Mixtral

  • 🤝 OpenAI API Compatibility: Seamlessly integrate with OpenAI-style client APIs for easy migration and familiar developer experience

  • 💾 Flexible Storage: Pluggable adapters for your preferred datastore, supporting various data persistence options

  • 🎯 Fine-tuning Ready: Structured data output ready for immediate model fine-tuning and preference optimization

Installation

Install from PyPI (recommended)

OpenPO uses pip for installation. Run the following command in your terminal:

pip install openpo

Install from source

Clone the repository, then run the following commands:

cd openpo
poetry install

Getting Started

By default, the OpenPO client uses HuggingFace's InferenceClient to call models available on the HuggingFace Model Hub.

from openpo.client import OpenPO

client = OpenPO(api_key="your-huggingface-api-key")

response = client.chat.completions.create_preference(
    model="mistralai/Mixtral-8x7B-Instruct-v0.1",
    messages=[
        {"role": "system", "content": PROMPT},
        {"role": "user", "content": MESSAGE},
    ],
    diff_frequency=0.5,  # generate comparison responses 50% of the time
)

print(response.choices[0].message.content)
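When the comparison logic fires, the response carries more than one candidate completion for feedback collection. A minimal sketch for inspecting every candidate, assuming each one appears as a separate entry in response.choices (an assumption; check the docs for the exact response shape):

# Hypothetical: print all returned candidates. This assumes each candidate
# completion is exposed as a separate entry in response.choices.
for i, choice in enumerate(response.choices):
    print(f"Candidate {i}: {choice.message.content}")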

OpenPO also works with OpenRouter.

# make request to OpenRouter
from openpo.client import OpenPO

client = OpenPO(
    api_key="your-openrouter-api-key",
    base_url="https://openrouter.ai/api/v1/chat/completions",
)

response = client.chat.completions.create_preference(
    model="anthropic/claude-3.5-sonnet:beta",
    messages=[
        {"role": "system", "content": PROMPT},
        {"role": "user", "content": MESSAGE},
    ],
    diff_frequency=0.5,
)

print(response.choices[0].message.content)

You can pass a dictionary to the pref_params argument to control the randomness of the second response when the comparison logic is called. Currently supported parameters are temperature, frequency_penalty, and presence_penalty.

response = client.chat.completions.create_preference(
    model="anthropic/claude-3.5-sonnet:beta",
    messages=[
        {"role": "system", "content": PROMPT},
        {"role": "user", "content": MESSAGE},
    ],
    diff_frequency=0.5,
    pref_params={"temperature": 1.5, "frequency_penalty": 0.5},
)
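In this example, raising temperature above 1.0 and adding a frequency penalty make the second response more varied, so the paired completions are easier to tell apart when collecting feedback.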

Saving Data

Use providers to easily upload and download data.

from openpo.client import OpenPO
from openpo.providers.huggingface import HuggingFaceStorage

storage = HuggingFaceStorage(repo_id="my-dataset-repo", api_key="hf-token")
client = OpenPO(api_key="your-huggingface-token")

# Preference data must be in the format
# {"prompt": ..., "preferred": ..., "rejected": ...} for fine-tuning.
preference = {}
storage.save_data(data=preference, key="my-data")
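For illustration, a filled-in record following that format might look like the sketch below (all values are hypothetical):

# Hypothetical preference record; the field names follow the
# {"prompt", "preferred", "rejected"} format noted above.
preference = {
    "prompt": "Summarize the plot of Hamlet in two sentences.",
    "preferred": "Prince Hamlet feigns madness while seeking revenge ...",
    "rejected": "Hamlet is a play. It has characters.",
}
storage.save_data(data=preference, key="hamlet-example")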

Structured Outputs (JSON Mode)

OpenPO supports structured outputs using Pydantic models.

[!NOTE] OpenRouter does not natively support structured outputs. This leads to inconsistent behavior from some models when structured output is used with OpenRouter.

It is recommended to use HuggingFace models for structured output.

from pydantic import BaseModel
from openpo.client import OpenPO

client = OpenPO(api_key="your-huggingface-api-key")

class ResponseModel(BaseModel):
    response: str


res = client.chat.completions.create_preference(
    model="mistralai/Mixtral-8x7B-Instruct-v0.1",
    messages=[
        {"role": "system", "content": PROMPT},
        {"role": "user", "content": MESSAGE},
    ],
    diff_frequency=0.5,
    response_format=ResponseModel,
)
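Because the model output matches the schema, it can be validated back into the Pydantic model. A minimal sketch, assuming the JSON payload sits on res.choices[0].message.content as in the earlier examples:

# Parse the JSON string back into the Pydantic model (Pydantic v2 API).
# Where the payload lives (choices[0].message.content) is an assumption
# carried over from the earlier examples; check the docs for the exact shape.
parsed = ResponseModel.model_validate_json(res.choices[0].message.content)
print(parsed.response)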

Try Out

Run docker compose up --build to launch a simple demo of how it works in the UI.
