Build high-quality synthetic datasets with AI feedback from 200+ LLMs

OpenPO 🐼

OpenPO simplifies building synthetic datasets for preference tuning from 200+ LLMs.

What is OpenPO?

OpenPO is an open-source library that simplifies building synthetic datasets for LLM preference tuning. It collects outputs from 200+ LLMs and ranks them using various techniques, helping developers build better fine-tuned language models with minimal effort.
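
As a rough illustration of the collect-and-rank idea, the sketch below turns scored completions into a single preference pair. The `Candidate` type, the `build_preference_pair` helper, and the numeric scores are assumptions made for illustration; they are not part of OpenPO's API, and the ranking techniques OpenPO applies may differ.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """One model's completion plus a hypothetical quality score."""
    model: str
    text: str
    score: float


def build_preference_pair(prompt: str, candidates: list[Candidate]) -> dict:
    """Pick the highest- and lowest-scored completions as preferred/rejected."""
    if len(candidates) < 2:
        raise ValueError("need at least two candidates to form a pair")
    ranked = sorted(candidates, key=lambda c: c.score, reverse=True)
    return {
        "prompt": prompt,
        "preferred": ranked[0].text,
        "rejected": ranked[-1].text,
    }


# Hypothetical completions and scores, for illustration only.
candidates = [
    Candidate("model-a", "Use a list comprehension.", 0.9),
    Candidate("model-b", "Write a for loop.", 0.6),
    Candidate("model-c", "I don't know.", 0.1),
]
pair = build_preference_pair("How do I square every item in a list?", candidates)
print(pair)
```

Pairing the best answer with the worst is only one possible choice; pairing adjacent ranks would give harder, more fine-grained comparisons.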

Key Features

🔌 Multiple LLM Support: Call 200+ models from Hugging Face and OpenRouter.
🧪 Research-Backed Methodologies: Implementations of data synthesis methods from recent research papers (coming soon).
🤝 OpenAI API Compatibility: Fully supports the OpenAI API format.
💾 Flexible Storage: Out-of-the-box storage providers for Hugging Face and S3.

Installation

Install from PyPI (recommended)

OpenPO uses pip for installation. Run the following command in the terminal to install OpenPO:

pip install openpo

Install from source

Clone the repository first, then run the following commands:

cd openpo
poetry install

Getting Started

OpenPO defaults to Hugging Face when the provider argument is not set.

from openpo.client import OpenPO

# Placeholder prompts; replace with your own.
PROMPT = "You are a helpful assistant."
MESSAGE = "Explain preference tuning in one paragraph."

# The key can be omitted if the corresponding environment variable is already set.
client = OpenPO(api_key="your-huggingface-api-key")

# Send the same conversation to several models.
response = client.completions(
    models=[
        "Qwen/Qwen2.5-Coder-32B-Instruct",
        "mistralai/Mistral-7B-Instruct-v0.3",
        "microsoft/Phi-3.5-mini-instruct",
    ],
    messages=[
        {"role": "system", "content": PROMPT},
        {"role": "user", "content": MESSAGE},
    ],
)
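
The `messages` argument follows the OpenAI chat format: a list of dictionaries with `role` and `content` keys. A minimal sketch of a helper that builds such a list is shown below; `build_messages` is a hypothetical name, not an OpenPO function, and the prompt strings are placeholders.

```python
def build_messages(system_prompt: str, user_message: str) -> list[dict[str, str]]:
    """Return a two-message conversation in the OpenAI chat format."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_message},
    ]


messages = build_messages(
    "You are a helpful coding assistant.",
    "Explain what a Python decorator is.",
)
for message in messages:
    print(message["role"], "->", message["content"])
```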

To use OpenRouter, set the provider to openrouter:

# Make a request through OpenRouter
client = OpenPO(api_key="your-openrouter-api-key", provider="openrouter")

response = client.completions(
    models=[
        "qwen/qwen-2.5-coder-32b-instruct",
        "mistralai/mistral-7b-instruct-v0.3",
        "microsoft/phi-3.5-mini-128k-instruct",
    ],
    messages=[
        {"role": "system", "content": PROMPT},
        {"role": "user", "content": MESSAGE},
    ],
)

OpenPO takes model parameters as a dictionary passed through the params argument. See the documentation for more detail.

response = client.completions(
    models=[
        "Qwen/Qwen2.5-Coder-32B-Instruct",
        "mistralai/Mistral-7B-Instruct-v0.3",
        "microsoft/Phi-3.5-mini-instruct",
    ],
    messages=[
        {"role": "system", "content": PROMPT},
        {"role": "user", "content": MESSAGE},
    ],
    # Model parameters applied to the request
    params={
        "max_tokens": 500,
        "temperature": 1.0,
    },
)
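
When several calls share most settings, one option is to keep the shared values in one dictionary and override them per call. The sketch below uses plain dictionary merging; `DEFAULT_PARAMS` and `with_overrides` are hypothetical names, and only the `max_tokens` and `temperature` keys come from the example above.

```python
DEFAULT_PARAMS = {"max_tokens": 500, "temperature": 1.0}


def with_overrides(**overrides) -> dict:
    """Return a new params dict: defaults first, then per-call overrides."""
    return {**DEFAULT_PARAMS, **overrides}


creative = with_overrides(temperature=1.2)
short = with_overrides(max_tokens=100)
print(creative)
print(short)
```

Because the merge builds a new dictionary each time, the shared defaults are never mutated between calls.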

Saving Data

Use the out-of-the-box storage classes to upload and download data.

from openpo.client import OpenPO
from openpo.storage.huggingface import HuggingFaceStorage

storage = HuggingFaceStorage(repo_id="my-dataset-repo", api_key="hf-token")
client = OpenPO(api_key="your-huggingface-token")

# Preference data for fine-tuning must use the format
# {"prompt": ..., "preferred": ..., "rejected": ...}
preference = {}
storage.push_to_hub(data=preference, filename="my-data.json")
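
Before uploading, it can help to check that each record has the expected keys. The sketch below validates records against the `prompt` / `preferred` / `rejected` format and serializes them to JSON using only the standard library. `REQUIRED_KEYS` and `validate_preference` are hypothetical helpers, not part of OpenPO, and whether `push_to_hub` expects a single dictionary or a list of records is not confirmed here.

```python
import json

REQUIRED_KEYS = {"prompt", "preferred", "rejected"}


def validate_preference(record: dict) -> dict:
    """Raise ValueError if a record is missing keys or has empty values."""
    missing = REQUIRED_KEYS - set(record)
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    for key in REQUIRED_KEYS:
        if not isinstance(record[key], str) or not record[key].strip():
            raise ValueError(f"{key!r} must be a non-empty string")
    return record


# Made-up example record, for illustration only.
records = [
    {
        "prompt": "What is 2 + 2?",
        "preferred": "2 + 2 equals 4.",
        "rejected": "2 + 2 equals 5.",
    },
]
valid = [validate_preference(r) for r in records]
payload = json.dumps(valid, indent=2)
print(payload)
```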

Structured Outputs (JSON Mode)

OpenPO supports structured outputs using Pydantic models.

Note: OpenRouter does not natively support structured outputs, which leads to inconsistent behavior from some models when structured output is used with OpenRouter. Hugging Face models are recommended for structured output.

from pydantic import BaseModel
from openpo.client import OpenPO

client = OpenPO(api_key="your-huggingface-api-key")

# Schema that the model's output should follow
class ResponseModel(BaseModel):
    response: str


res = client.completions(
    models=["Qwen/Qwen2.5-Coder-32B-Instruct"],
    messages=[
        {"role": "system", "content": PROMPT},
        {"role": "user", "content": MESSAGE},
    ],
    params={
        "response_format": ResponseModel,
    },
)
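
To see what a Pydantic model contributes, the sketch below prints the JSON schema derived from `ResponseModel` and validates raw JSON strings against it. It assumes Pydantic v2 (`model_json_schema`, `model_validate_json`); the JSON strings are made-up stand-ins for model output, and how OpenPO itself passes the schema to the provider is not shown here.

```python
from pydantic import BaseModel, ValidationError


class ResponseModel(BaseModel):
    response: str


# JSON schema generated from the model's fields.
schema = ResponseModel.model_json_schema()
print(schema["required"])

# A well-formed reply parses into a typed object.
parsed = ResponseModel.model_validate_json('{"response": "Hello!"}')
print(parsed.response)

# A reply missing the required field is rejected.
try:
    ResponseModel.model_validate_json('{"answer": "Hello!"}')
except ValidationError as exc:
    print("invalid reply:", exc.error_count(), "error(s)")
```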

Contributing

Contributions are what make open source so special! Here's how you can help:

Development Setup

Fork and clone the repository

git clone https://github.com/yourusername/openpo.git
cd openpo

Install Poetry (dependency management tool)

curl -sSL https://install.python-poetry.org | python3 -

Install dependencies

poetry install

Development Workflow

Create a new branch for your feature

git checkout -b feature-name

Submit a Pull Request

Write a clear description of your changes
Reference any related issues


openpo 0.3.0
