Check for data drift with OAI data
Project description
ft-drift
ft-drift
helps you check for data drift by comparing two OpenAI
multi-turn chat jsonl
files.
Install
pip install ft_drift
Background
Common situations where you want to check for dataset drift:
- You fine-tuned a new model but it doesn’t work the way you expect compared to a previous model trained on different data.
- Your model is trained on data that doesn’t reflect production.
In either situation, you can collect your data from the relevant sources and compare them to see if the data has changed in ways that are undesirable.
In the demo below, we detect data drift between two datasets where the following tokens were found to be different:
END-UI-FORMAT
UI-FORMAT
- “```json”
- etc.
Usage
After installing ft_drift
, the cli command detect_drift
will be
available to you.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
ft-drift-0.0.3.tar.gz
(11.8 kB
view hashes)
Built Distribution
ft_drift-0.0.3-py3-none-any.whl
(11.9 kB
view hashes)