onPanda Python package
onpanda: The Companion Python Package for onPanda
Contents: Features | Install | Example Data | Quick Start | Main Modules | Iterative Correction API | Data Assumptions
▮ Features
- Parse .panda.json into SFT and preference-pair data (build_legacy_data_v1)
- Build token-level supervision data (build_token_level_supervision_data_v1/v2)
- Build Find-and-Replace correction training data (build_far_correction_data_v1)
- Verify and score Find-and-Replace outputs (FindAndReplaceVerifier)
- Run iterative correction as a Proxy API (onpanda.server.iterative_correction_api)
- Build panda battle data from two arena result sets (build_panda_battle)
▮ Install
pip install onpanda -U
# Or, to run the demos, install from source:
git clone https://github.com/on-panda/onpanda.git
cd onpanda
pip install -e .
If you want to use tokenizers, install transformers separately.
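Because transformers is an optional dependency, a script may want to check for it before choosing a tokenizer. The following is a minimal sketch using only the standard library; the helper name `transformers_available` is hypothetical and not part of onpanda.

```python
import importlib.util


def transformers_available() -> bool:
    """Return True if the optional `transformers` package can be imported."""
    # find_spec looks the package up without actually importing it.
    return importlib.util.find_spec("transformers") is not None


if transformers_available():
    print("transformers is installed")
else:
    print("transformers is not installed; run: pip install transformers")
```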
▮ Example Data
on-panda-example-data is the example dataset repo for this project:
git clone https://github.com/on-panda/on-panda-example-data.git ../on-panda-example-data
ls ../on-panda-example-data/panda_json/
▮ Quick Start
import onpanda

panda_path = (
    "../on-panda-example-data/panda_json/"
    "2025-08-19_how-many-1s_tokenizer-Qwen2.5.panda.json"
)
# Use built-in unicode_tokenizer for a minimal runnable flow.
tokenizer = onpanda.unicode_tokenizer
tree = onpanda.PandaTree(panda_path, tokenizer)

# 1) SFT + preference pairs
legacy = tree.build_legacy_data_v1()
print("sfts:", len(legacy["sfts"]))
print("preferences:", len(legacy["preferences"]))

# 2) Token-level supervision
token_level_v1 = tree.build_token_level_supervision_data_v1(tokenizer)
print("token_level_v1:", len(token_level_v1))

# 3) Find-and-Replace correction data
adapter = onpanda.FindAndReplaceCorrectionAdapter(tokenizer)
correction_data = tree.build_far_correction_data_v1(adapter)
print("correction_data:", len(correction_data))
Build from plain chat messages:
import onpanda

messages = [
    {"role": "user", "content": "5+7=?"},
    {"role": "assistant", "content": "12"},
]
panda_json = onpanda.messages_to_panda_tree(messages, uuid="demo")
# Dump panda_json to a file such as xxx.panda.json
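The dump step can be sketched with the standard json module. This assumes the value returned by messages_to_panda_tree is JSON-serializable, which the README does not confirm; the helper `dump_panda_json` and the stand-in object below are hypothetical and do not reflect the real Panda schema.

```python
import json
import tempfile
from pathlib import Path


def dump_panda_json(panda_obj, path):
    """Write a JSON-serializable object to `path` as UTF-8 JSON."""
    path = Path(path)
    with path.open("w", encoding="utf-8") as f:
        # ensure_ascii=False keeps non-ASCII chat content readable on disk.
        json.dump(panda_obj, f, ensure_ascii=False, indent=2)
    return path


# Stand-in object (ASSUMPTION): in practice this would be the value returned
# by onpanda.messages_to_panda_tree(...), if it is JSON-serializable.
stand_in = {"uuid": "demo", "note": "placeholder, not the real Panda schema"}
out_path = dump_panda_json(stand_in, Path(tempfile.gettempdir()) / "demo.panda.json")
print("wrote", out_path)
```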
▮ Main Modules
- onpanda/parser.py: PandaTree and data conversion entrypoints
- onpanda/token_level_supervision_utils.py: token-level patch extraction and masks
- onpanda/correcting_model/far_correction_utils.py: FAR data builder and apply logic
- onpanda/correcting_model/verifier.py: FAR parser/locator/reward computation
- onpanda/correcting_model/correcting_model.py: iterative correction workflow
- onpanda/server/iterative_correction_api.py: Flask wrapper for correction service
- onpanda/arena/panda_battle.py: build battle-style comparison data
▮ Iterative Correction API
Launch a proxy API server that returns responses produced by iterative correction. To see the available options:
python -m onpanda.server.iterative_correction_api --help
▮ Data Assumptions
PandaTree is a parser for qualified, annotated Panda JSON. PandaTree preprocessing currently assumes:
- Top-level field dialogs exists
- Top-level field update_time exists
- At least one dialog ends with an assistant message
- If annotate.is_good is missing, the latest dialog is treated as good by default
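The first three assumptions can be checked before handing a file to PandaTree. The sketch below is not part of onpanda: the function `check_panda_assumptions` is hypothetical, and it assumes that dialogs is a list of dialogs, each a list of message dicts with a "role" key, which may differ from the real Panda schema.

```python
from typing import Any, Dict, List


def check_panda_assumptions(panda: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means all checks passed."""
    problems = []
    for field in ("dialogs", "update_time"):
        if field not in panda:
            problems.append(f"missing top-level field: {field}")

    # ASSUMPTION: dialogs is a list of dialogs, each a list of message dicts.
    dialogs = panda.get("dialogs")
    if not isinstance(dialogs, list):
        dialogs = []
    ends_with_assistant = any(
        isinstance(dialog, list)
        and len(dialog) > 0
        and isinstance(dialog[-1], dict)
        and dialog[-1].get("role") == "assistant"
        for dialog in dialogs
    )
    if not ends_with_assistant:
        problems.append("no dialog ends with an assistant message")
    return problems


example = {
    "update_time": "2025-08-19",
    "dialogs": [
        [
            {"role": "user", "content": "5+7=?"},
            {"role": "assistant", "content": "12"},
        ]
    ],
}
print(check_panda_assumptions(example))  # prints []
```

Returning a list of problems rather than raising on the first one makes it easier to report every violated assumption at once.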