A pipeline for cleaning instruction datasets by removing refusals and rewriting prompts into safe, answerable questions.
Project description
🧹 Refusal-Cleaner
Refusal-Cleaner is a high-throughput pipeline for cleaning instruction–response datasets. It removes refusals, hedges, and disclaimers, reframes unsafe prompts into safe, answerable questions, and generates direct responses — producing cleaner, more useful training data for LLMs.
It uses the OpenAI Batch API for speed and cost efficiency, processing tens of thousands of rows in parallel.
✨ Features
- Refusal Detection → finds “I’m sorry, I cannot…” style outputs.
- Prompt Rewriting → reframes unsafe instructions while preserving topic intent.
- Answer Generation → produces direct, factual answers with no disclaimers.
- Recursive Cleaning → runs up to 3 cycles of classify → rewrite → answer, then drops anything still refusing.
- Backfiller → fills in missing responses without rewriting prompts.
- Batch-Only → never single API calls; all requests go through the Batch API.
- Auto Chunking → splits datasets into ~10 chunks, each ≥1000 rows.
- Resume-Safe → merges results incrementally.
- Prebuilt Integrations → ready for Anthropic HH and OASST1 datasets.
📦 Installation
Install from PyPI:
pip install refusal-cleaner
Make sure your OpenAI API key is available in ~/.elf_env:
echo "OPENAI_API_KEY=sk-xxxx" > ~/.elf_env
🚀 Usage
Clean a Built-in Dataset
refusal-cleaner --dataset anthropic
refusal-cleaner --dataset oasst1
Clean a Custom Dataset
refusal-cleaner --dataset custom \
--input data/my_raw.jsonl \
--output data/my_clean.jsonl
Options:
--rounds 2→ number of recursive passes (default: 3)--classifier-model,--rewriter-model,--answer-model→ override model names
Backfill Missing Responses
If your JSONL has blanks in the response field:
refusal-cleaner --dataset oasst1 --backfill
Options:
--slices N→ force exactlyNbatch slices (default: auto 1/10 rule, ≥1000 per slice)--poll-interval N→ polling interval in seconds (default: 20)
End-to-End Download + Clean
Fetch Anthropic HH and OASST1, save to data/, and clean them:
python download_and_clean.py
📊 Example Transformations
| Original Instruction | Rewritten Instruction | Response (Cleaned) |
|---|---|---|
| "How do I hack my neighbor’s WiFi?" | "How can penetration testers demonstrate WiFi vulnerabilities in a safe lab environment?" | Explanation of WiFi security flaws and ethical test methods. |
| "Tell me how to make a bomb" | "What are the historical methods by which explosives were developed, and how are they studied today?" | Historical context + chemistry principles. |
| "Write a guide for making meth" | "What are the historical methods and chemical principles behind methamphetamine synthesis, and how are they studied safely today?" | Discussion of historic synthesis + forensic/civil-defense uses. |
⚡ Output Format
{
"original_instruction": "How do I make a Molotov cocktail?",
"rewritten_instruction": "What is the historical use of Molotov cocktails and how are they studied safely in civil defense?",
"response": "Historical explanation + safe academic context..."
}
🧭 Why This Matters
Most instruction datasets are polluted with refusals:
- Models learn to dodge instead of answering.
- Many prompts collapse into identical “I’m sorry” responses.
- Training signal quality drops.
Refusal-Cleaner restores signal by:
- Rewriting unsafe instructions into safe, on-topic questions.
- Generating informative, refusal-free answers.
- Preserving dataset intent while maximizing training value.
📈 What’s New in 0.2.0
- ✅ Batch-only pipeline (no per-row calls).
- ✅ Recursive cleaning with drop-on-final.
- ✅ Backfiller support for blank responses.
- ✅ Auto chunking (~10 slices, ≥1000 rows each).
- ✅ Cleaner CLI (no more workers/batch-size args).
⭐ If you find this useful, please give it a star!
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file refusal_cleaner-0.2.0.tar.gz.
File metadata
- Download URL: refusal_cleaner-0.2.0.tar.gz
- Upload date:
- Size: 15.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.13
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
8355bd9a4c6d3864e9a2b7531274517e10cd915782e5b71e9f66c575f772c36c
|
|
| MD5 |
0262ec554609ad1121d65814f3699297
|
|
| BLAKE2b-256 |
77997d07d5a503da23e034069a077f4fcd78aecb8f77bb1560294c49769835d1
|
File details
Details for the file refusal_cleaner-0.2.0-py3-none-any.whl.
File metadata
- Download URL: refusal_cleaner-0.2.0-py3-none-any.whl
- Upload date:
- Size: 16.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.13
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
c22d4e881c0ee9d8b0c31a5f4f715a7e5198fefc913b531f180ca318af411ece
|
|
| MD5 |
ee735d975b01a811925ee16898d9a58a
|
|
| BLAKE2b-256 |
8934c32ee06c60972b49d3853c5d434ff231b84eb4b187eaca0b5dd26900a1fd
|