Diagnostic Dialogue Optimization framework for prompt repair.
DDO Prompt Optimizer
Diagnostic Dialogue Optimization (DDO) is a prompt optimization framework based on the paper copied into this repository. A stronger teacher model conducts a multi-turn diagnostic conversation with a student model, compiles a structured weakness profile, proposes a minimal prompt repair, optionally verifies the edit on a small dataset or external evaluator, then resets and repeats.
This repository includes the paper, a full-stack OpenAI SDK implementation, a browser UI, npm and pip library entrypoints, a DeepEval adapter, Codespaces support, tests, a CI workflow template, and example data.
Quick Start
npm install
cp .env.example .env
npm run doctor
npm run dev
Open http://127.0.0.1:5174.
Add your API key either to .env:
OPENAI_API_KEY=<your-openai-api-key>
or paste it into the UI key field for a single run. UI keys are sent only to the local server for that request and are not written to disk.
Codespaces
Open the project in a dedicated Codespaces environment:
https://codespaces.new/irodcompany5-tech/ddo
The devcontainer installs dependencies, runs npm run doctor, and forwards port 5174.
Install As A Library
JavaScript/TypeScript projects:
npm install ddo-prompt-optimizer
Python projects:
pip install ddo-prompt-optimizer
Until packages are published to npm/PyPI, install directly from GitHub:
npm install github:irodcompany5-tech/ddo
pip install "git+https://github.com/irodcompany5-tech/ddo.git"
For DeepEval helpers:
pip install "ddo-prompt-optimizer[deepeval]"
JavaScript API
import { DDOOptimizer } from "ddo-prompt-optimizer";

const optimizer = new DDOOptimizer({
  teacherModel: "gpt-5.5",
  studentModel: "gpt-5.5",
  verifierModel: "gpt-5.5"
});

const result = await optimizer.optimize(
  {
    initialPrompt: "You are a careful assistant.",
    behaviorSpec: "Follow requested format, reason stepwise, and handle edge cases.",
    dataset: [
      {
        input: "Return JSON with keys answer and confidence: 2+2?",
        expected: "{\"answer\":4,\"confidence\":\"high\"}"
      }
    ]
  },
  (event) => console.log(event.type)
);

console.log(result.finalPrompt);
Use your own evaluation platform by passing evaluatePrompt. It should return either a score from 0 to 1, or an object with average, count, passRate, and results.
const optimizer = new DDOOptimizer({
  evaluatePrompt: async (prompt, { dataset }) => {
    return await runYourEvalHarness(prompt, dataset);
  }
});
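The two return shapes described above (a bare score, or an object with average, count, passRate, and results) can be illustrated with a small normalizer. This is a Python sketch for illustration only: normalize_eval_result is a hypothetical helper that is not part of the library, and filling count and passRate with None for a bare score is an assumption made here, not documented behavior.

```python
from numbers import Real

# The four keys named in the README for the object-shaped result.
REQUIRED_KEYS = ("average", "count", "passRate", "results")


def normalize_eval_result(result):
    """Coerce an evaluator return value into the object shape.

    Hypothetical helper for illustration; not part of ddo-prompt-optimizer.
    """
    # Shape 1: a bare score between 0 and 1 (bool is excluded on purpose).
    if isinstance(result, Real) and not isinstance(result, bool):
        if not 0 <= result <= 1:
            raise ValueError(f"score must be in [0, 1], got {result}")
        # Assumption: unknown fields are left empty rather than guessed.
        return {"average": float(result), "count": None, "passRate": None, "results": []}

    # Shape 2: an object carrying all four documented keys.
    if isinstance(result, dict):
        missing = [key for key in REQUIRED_KEYS if key not in result]
        if missing:
            raise ValueError(f"evaluator result is missing keys: {missing}")
        return dict(result)

    raise TypeError(f"unsupported evaluator result type: {type(result).__name__}")
```

Validating the shape at the boundary of your own harness makes a malformed evaluator result fail early, before it can influence whether an edit is accepted.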
JavaScript CLI:
ddo optimize \
  --prompt prompt.txt \
  --dataset examples/dataset.jsonl \
  --teacher-model gpt-5.5 \
  --student-model gpt-5.5 \
  --output optimized-prompt.txt
Python API
from ddo_optimizer import DDOOptimizer

optimizer = DDOOptimizer()
result = optimizer.optimize(
    initial_prompt="You are a careful assistant.",
    behavior_spec="Follow requested format, reason stepwise, and handle edge cases.",
    dataset=[
        {
            "input": "Return JSON with keys answer and confidence: 2+2?",
            "expected": "{\"answer\":4,\"confidence\":\"high\"}",
        }
    ],
    teacher_model="gpt-5.5",
    student_model="gpt-5.5",
)
print(result.final_prompt)
Python CLI:
ddo-optimize \
  --prompt prompt.txt \
  --dataset examples/dataset.jsonl \
  --teacher-model gpt-5.5 \
  --student-model gpt-5.5 \
  --output optimized-prompt.txt
DeepEval Adapter
from deepeval.dataset import Golden
from deepeval.metrics import AnswerRelevancyMetric
from ddo_optimizer.adapters.deepeval import optimize_with_deepeval


def model_callback(prompt, example):
    # Run your app using the candidate prompt and the example input.
    # your_llm_app is a placeholder for your own application entrypoint.
    return your_llm_app(system_prompt=prompt, user_input=example["input"])


result = optimize_with_deepeval(
    initial_prompt="Respond carefully.",
    goldens=[Golden(input="What is Saturn?", expected_output="Saturn is a planet.")],
    metrics=[AnswerRelevancyMetric()],
    model_callback=model_callback,
)
print(result.final_prompt)
See docs/integrations.md for generic evaluator contracts and examples.
What Is Included
- ddo_paper.pdf: a copy of the DDO paper.
- ddo_paper.txt: extracted paper text.
- src/ddoEngine.js: DDO algorithm implementation.
- src/index.js: public npm library entrypoint.
- src/openaiAdapter.js: official OpenAI SDK adapter for Responses API and Chat Completions.
- ddo_optimizer/: public Python package.
- public/: browser UI for configuration, dataset upload, live logs, and prompt export.
- examples/dataset.jsonl: sample verifier dataset.
- docs/architecture.md: implementation architecture.
- docs/dataset-format.md: supported dataset formats.
- docs/integrations.md: library and framework integration guide.
- docs/publishing.md: npm and PyPI release checklist.
- docs/github-actions-ci.yml: GitHub Actions CI template.
Configuration
Environment defaults live in .env.example:
OPENAI_API_KEY=
OPENAI_BASE_URL=
OPENAI_ORG_ID=
OPENAI_PROJECT_ID=
DDO_HOST=127.0.0.1
DDO_PORT=5174
DDO_TEACHER_MODEL=gpt-5.5
DDO_STUDENT_MODEL=gpt-5.5
DDO_VERIFIER_MODEL=gpt-5.5
DDO_API_MODE=responses
DDO_HORIZON=5
DDO_BUDGET=20
DDO_PATIENCE=2
DDO_CONFIDENCE_THRESHOLD=0.62
DDO_REGRESSION_EPSILON=0.03
DDO_VALIDATION_LIMIT=6
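As a rough illustration of how these variables could be consumed from Python, the sketch below reads them from the environment and falls back to the defaults listed above. DDOSettings and load_settings are hypothetical names introduced here; this is not the project's own configuration loader, and treating an empty value as "use the default" is an assumption.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class DDOSettings:
    """Hypothetical container mirroring the DDO_* variables in .env.example."""
    host: str
    port: int
    teacher_model: str
    student_model: str
    verifier_model: str
    api_mode: str
    horizon: int
    budget: int
    patience: int
    confidence_threshold: float
    regression_epsilon: float
    validation_limit: int


def _get(env, key, default):
    # Assumption: a missing or blank variable falls back to the default.
    value = env.get(key, "")
    return value if value.strip() else default


def load_settings(env=None):
    """Build settings from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    return DDOSettings(
        host=_get(env, "DDO_HOST", "127.0.0.1"),
        port=int(_get(env, "DDO_PORT", "5174")),
        teacher_model=_get(env, "DDO_TEACHER_MODEL", "gpt-5.5"),
        student_model=_get(env, "DDO_STUDENT_MODEL", "gpt-5.5"),
        verifier_model=_get(env, "DDO_VERIFIER_MODEL", "gpt-5.5"),
        api_mode=_get(env, "DDO_API_MODE", "responses"),
        horizon=int(_get(env, "DDO_HORIZON", "5")),
        budget=int(_get(env, "DDO_BUDGET", "20")),
        patience=int(_get(env, "DDO_PATIENCE", "2")),
        confidence_threshold=float(_get(env, "DDO_CONFIDENCE_THRESHOLD", "0.62")),
        regression_epsilon=float(_get(env, "DDO_REGRESSION_EPSILON", "0.03")),
        validation_limit=int(_get(env, "DDO_VALIDATION_LIMIT", "6")),
    )
```

Passing a plain dict instead of os.environ, for example load_settings({"DDO_HORIZON": "8"}), makes it easy to try overrides without touching the real environment.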
All important DDO settings can also be changed from the UI:
- Teacher, student, and verifier models.
- Responses API or Chat Completions mode.
- Behavior specification.
- Initial student system prompt.
- Horizon, total budget, patience, confidence threshold, regression epsilon, and validation limit.
- Verifier gate and minimality guard.
Dataset Input
The UI accepts JSON, JSONL, CSV, plain text, or manual examples.
Minimal JSONL:
{"input":"Return exactly two bullets about backups.","expected":"Two bullets only.","notes":"Checks instruction adherence."}
{"input":"What will my cloud bill be next month?","expected":"Ask for missing usage and pricing details.","tags":["calibration"]}
See docs/dataset-format.md for full details.
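For readers preparing their own files, the JSONL shape shown above can be parsed with the standard library alone. The sketch below is illustrative and is not the project's loader: parse_jsonl is a hypothetical helper, and requiring an "input" key on every record is an assumption based on the two sample lines. See docs/dataset-format.md for the rules the project actually applies.

```python
import json


def parse_jsonl(text):
    """Parse JSONL text into a list of example dicts.

    Hypothetical helper for illustration; blank lines are skipped.
    """
    examples = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        # Assumption: each record is an object with at least an "input" field.
        if not isinstance(record, dict) or "input" not in record:
            raise ValueError(f"line {line_no}: expected an object with an 'input' key")
        examples.append(record)
    return examples


# The two sample records from the README.
SAMPLE = '''
{"input":"Return exactly two bullets about backups.","expected":"Two bullets only.","notes":"Checks instruction adherence."}
{"input":"What will my cloud bill be next month?","expected":"Ask for missing usage and pricing details.","tags":["calibration"]}
'''

examples = parse_jsonl(SAMPLE)
```

Optional fields such as notes and tags pass through untouched, so each record keeps whatever metadata the line carried.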
DDO Runtime
The implementation follows the paper's core loop:
- Teacher asks adaptive diagnostic questions.
- Student answers under the current prompt.
- Teacher emits a JSON weakness profile.
- Repair operator proposes a minimal prompt diff.
- Optional verifier scores before/after prompts on validation examples.
- Accepted edits update history; rejected edits increase stall count.
- A fresh diagnostic conversation starts against the repaired prompt.
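The loop above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the code in src/ddoEngine.js: run_ddo and its callable parameters are hypothetical, the multi-turn teacher and student dialogue is collapsed into a single diagnose call, budget is treated as a cap on repair rounds, and an edit is accepted when its score does not fall more than epsilon below the current prompt's score. The actual engine may define each of these differently.

```python
def run_ddo(prompt, diagnose, repair, score=None, *, budget=20, patience=2, epsilon=0.03):
    """Illustrative outer loop; every callable is supplied by the caller.

    diagnose(prompt) -> weakness profile (stands in for the whole dialogue)
    repair(prompt, profile) -> candidate prompt
    score(prompt) -> float, optional verifier
    """
    history = []
    stalls = 0
    for _ in range(budget):  # assumption: budget caps the number of rounds
        if stalls >= patience:
            break  # too many consecutive rejected edits
        profile = diagnose(prompt)
        candidate = repair(prompt, profile)

        accepted = True
        if score is not None:
            # Assumption: tolerate a regression of at most epsilon.
            accepted = score(candidate) >= score(prompt) - epsilon

        if accepted:
            history.append((prompt, candidate, profile))
            prompt = candidate  # next round diagnoses the repaired prompt
            stalls = 0  # assumption: an accepted edit resets the stall count
        else:
            stalls += 1
    return prompt, history
```

With stub callables this can be exercised without any model calls, for example run_ddo("base", lambda p: {}, lambda p, w: p + "!", budget=3), which is a convenient way to check stopping behavior before wiring in a real teacher and student.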
Scripts
npm run doctor # local setup checks
npm run check # syntax checks
npm test # unit tests
npm run dev # start the UI/server
npm start # same server entrypoint for production-like runs
Python checks are included in npm run check and npm test.
Security
Do not commit .env, API keys, GitHub tokens, private datasets, or generated logs. If a token is pasted into chat, an issue, or a terminal log, revoke it and create a new one.
See SECURITY.md.
CI
The CI workflow template is stored at docs/github-actions-ci.yml. To activate it, copy it to .github/workflows/ci.yml using a GitHub token that has the workflow scope.
License
MIT. See LICENSE.
File details
Details for the file ddo_prompt_optimizer-0.1.0.tar.gz.
File metadata
- Download URL: ddo_prompt_optimizer-0.1.0.tar.gz
- Upload date:
- Size: 20.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 0ec6caec7cd5e8a63cd8a90c14fc888f715678900da7ca4ecb801419644d8304 |
| MD5 | 97dfd3de9f6d0f90ceb24fec3aadc3de |
| BLAKE2b-256 | ab22986a065582ffce9f108041535708ae49681a865ea29d6da7b80d13349c7e |
File details
Details for the file ddo_prompt_optimizer-0.1.0-py3-none-any.whl.
File metadata
- Download URL: ddo_prompt_optimizer-0.1.0-py3-none-any.whl
- Upload date:
- Size: 20.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 77707ee6d7fc6e7eef3ed8a1b8b5a972e51e5d14426b6019204326df879ee240 |
| MD5 | ec4a64276e25d0e4a9aae97e73d6c3cc |
| BLAKE2b-256 | fb6ae3f292f57f14cef95b279ebd4137b3fa01405a9dc8fc9163fe8ea44ce70b |