onPanda Python package
onpanda: The Companion Python Package for onPanda
Contents: Features | Install | Example Data | Quick Start | Main Modules | Iterative Correction API | Data Assumptions
▮ Features
- Parse .panda.json into SFT and preference-pair data (build_legacy_data_v1)
- Build token-level supervision data (build_token_level_supervision_data_v1/v2)
- Build Find-and-Replace correction training data (build_far_correction_data_v1)
- Verify and score Find-and-Replace outputs (FindAndReplaceVerifier)
- Run iterative correction as a Proxy API (onpanda.server.iterative_correction_api)
- Build panda battle data from two arena result sets (build_panda_battle)
▮ Install
pip install onpanda -U
# Or, to run the demos, install from source:
git clone https://github.com/on-panda/onpanda.git
cd onpanda
pip install -e .
If you want to use tokenizers, install transformers separately.
▮ Example Data
on-panda-example-data is the example dataset repo for this project:
git clone https://github.com/on-panda/on-panda-example-data.git ../on-panda-example-data
ls ../on-panda-example-data/panda_json/
▮ Quick Start
import onpanda
panda_path = (
    "../on-panda-example-data/panda_json/"
    "2025-08-19_how-many-1s_tokenizer-Qwen2.5.panda.json"
)
# Use built-in utf8_tokenizer for a minimal runnable flow.
tokenizer = onpanda.utf8_tokenizer
tree = onpanda.PandaTree(panda_path, tokenizer)
# 1) SFT + preference pairs
legacy = tree.build_legacy_data_v1()
print("sfts:", len(legacy["sfts"]))
print("preferences:", len(legacy["preferences"]))
# 2) Token-level supervision
token_level_v1 = tree.build_token_level_supervision_data_v1(tokenizer)
print("token_level_v1:", len(token_level_v1))
# 3) Find-and-Replace correction data
adapter = onpanda.FindAndReplaceCorrectionAdapter(tokenizer)
correction_data = tree.build_far_correction_data_v1(adapter)
print("correction_data:", len(correction_data))
Build from plain chat messages:
import onpanda
messages = [
    {"role": "user", "content": "5+7=?"},
    {"role": "assistant", "content": "12"},
]
panda_json = onpanda.messages_to_panda_tree(messages, uuid="demo")
# dump to xxx.panda.json
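The snippet above leaves the dump step as a comment. A minimal sketch of that step follows, assuming the value returned by messages_to_panda_tree is either a JSON-serializable object or an already-serialized JSON string (this README does not say which). The helper name dump_panda_json and the output file name are hypothetical and not part of onpanda.

```python
import json
from pathlib import Path


def dump_panda_json(panda_json, path):
    """Write a panda tree to `path` (hypothetical helper, not part of onpanda)."""
    # Assumption: the value is JSON-serializable or already a JSON string.
    if isinstance(panda_json, str):
        text = panda_json
    else:
        text = json.dumps(panda_json, ensure_ascii=False, indent=2)
    out = Path(path)
    out.write_text(text, encoding="utf-8")
    return out
```

With the Quick Start variables in scope, this could be called as dump_panda_json(panda_json, "demo.panda.json").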
▮ Main Modules
- onpanda/parser.py: PandaTree and data conversion entrypoints
- onpanda/token_level_supervision_utils.py: token-level patch extraction and masks
- onpanda/correcting_model/far_correction_utils.py: FAR data builder and apply logic
- onpanda/correcting_model/verifier.py: FAR parser/locator/reward computation
- onpanda/correcting_model/correcting_model.py: iterative correction workflow
- onpanda/server/iterative_correction_api.py: Flask wrapper for correction service
- onpanda/arena/panda_battle.py: build battle-style comparison data
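Several of these modules deal with Find-and-Replace (FAR) corrections. As a generic illustration of the idea only, a correction can be modeled as a find string plus a replacement that is applied only when the find string locates exactly one span. This is not onpanda's FAR format or its apply logic, neither of which is documented here; the function name apply_find_and_replace is hypothetical.

```python
def apply_find_and_replace(text, find, replace):
    """Apply one find-and-replace edit (hypothetical helper, not onpanda's API)."""
    if not find:
        raise ValueError("find must be non-empty")
    # str.count counts non-overlapping occurrences of `find` in `text`.
    count = text.count(find)
    if count != 1:
        # An edit that matches zero or several spans is ambiguous, so reject it.
        raise ValueError(f"expected exactly one match, found {count}")
    return text.replace(find, replace, 1)
```

Rejecting ambiguous matches keeps the edit deterministic: the caller has to supply enough context in the find string to pin down a single location.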
▮ Iterative Correction API
Launch a proxy API server that returns responses produced by iterative correction:
python -m onpanda.server.iterative_correction_api --help
▮ Data Assumptions
PandaTree is a parser for qualified, annotated Panda JSON. PandaTree preprocessing currently assumes:
- Top-level field dialogs exists
- Top-level field update_time exists
- At least one dialog ends with an assistant message
- If annotate.is_good is missing, the latest dialog is treated as default good
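The first three assumptions can be checked before handing a file to PandaTree. The sketch below is a hypothetical pre-check, not part of onpanda. It assumes dialogs is a list of dialogs, each a list of messages shaped like {"role": ..., "content": ...}; the real schema may differ. It does not check annotate.is_good, because where that field sits in the JSON is not described here.

```python
def check_panda_assumptions(data):
    """Return a list of problems; an empty list means the checks pass.

    Hypothetical helper, not part of onpanda.
    """
    problems = []
    for field in ("dialogs", "update_time"):
        if field not in data:
            problems.append(f"missing top-level field: {field}")

    # Assumption: `dialogs` is a list of dialogs, each a list of
    # {"role": ..., "content": ...} messages. The real schema may differ.
    dialogs = data.get("dialogs")
    if not isinstance(dialogs, list):
        dialogs = []
    ends_with_assistant = any(
        isinstance(d, list)
        and d
        and isinstance(d[-1], dict)
        and d[-1].get("role") == "assistant"
        for d in dialogs
    )
    if "dialogs" in data and not ends_with_assistant:
        problems.append("no dialog ends with an assistant message")
    return problems
```

Loading a file with json.load and passing the result to check_panda_assumptions gives a readable list of problems rather than a failure deep inside parsing.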