
feedloop

The fastest way to collect human preference data for LLMs.

feedloop is a free developer tool that lets you collect human feedback on LLM outputs — directly from your Python code, with zero configuration. Submit pairs of model responses, review them in a local browser UI, and export a fine-tuning dataset in minutes.


Features

  • Zero setup — pip install feedloop and you're running
  • Local-first — everything runs on your machine, no cloud account needed
  • Non-blocking SDK — feedloop.compare() returns immediately; your script keeps running
  • Built-in review UI — side-by-side browser interface with keyboard shortcuts
  • Position randomization — A/B display order is randomized to prevent left-side bias in human evaluations
  • DPO-ready export — outputs standard {"prompt", "chosen", "rejected"} JSONL
  • Uncertainty filtering — skip low-uncertainty comparisons automatically, focus human attention where it matters
  • Training script included — generates a ready-to-run TRL DPO fine-tuning script from your data
  • Session scoping — each run is isolated; data persists across sessions in SQLite
  • Model agnostic — works with OpenAI, Anthropic, Hugging Face, Ollama, or any LLM

Use Cases

  • Model comparison — compare GPT-4o vs Claude, or two versions of your own model
  • Fine-tuning data collection — build a DPO preference dataset without a labelling platform
  • Evaluation loops — quickly understand where your model falls short by seeing what humans prefer
  • Active learning — use uncertainty scores to only review the comparisons that matter most
  • Iterative improvement — collect feedback → fine-tune → re-run → repeat

Installation

pip install feedloop

Requires Python 3.10+. No other dependencies or accounts needed.


Quick Start

import feedloop

# Start the server (opens browser automatically)
feedloop.start()

# Submit pairs of outputs for review
feedloop.compare(
    prompt="Explain recursion to a 10-year-old.",
    outputs=[
        "Recursion is when a function calls itself...",
        "Imagine you're looking for a book in a library...",
    ],
    metadata={"model_a": "gpt-4o-mini", "model_b": "gpt-4o"},
)

# Rate in the browser, then export
feedloop.export("preferences.jsonl")

A browser tab opens at http://localhost:7856. Pick the better response with a click or use keyboard shortcuts:

Key  Action
1    Choose the left response
2    Choose the right response
S    Skip — neither response is clearly better

When you're done, export to a DPO-ready JSONL file.


Full API Reference

feedloop.start()

feedloop.start(
    port=7856,               # port for the local server
    db_path=None,            # SQLite path — defaults to ~/.feedloop/feedloop.db
    open_browser=True,       # auto-open browser on start
    uncertainty_threshold=0.0,  # see Uncertainty Filtering below
)

Launches the review server in a background thread. Idempotent — calling it twice reuses the running server. Automatically calls feedloop.stop() when your script exits.


feedloop.compare()

comparison_id = feedloop.compare(
    prompt="Your prompt here",
    outputs=["Response A", "Response B"],
    uncertainty=None,   # optional float 0.0–1.0
    metadata=None,      # optional dict — stored with the record
)

Submits a comparison for human review. Non-blocking — returns a comparison_id immediately. The A/B display order is randomized automatically to prevent position bias.
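
The randomization can be pictured with a small sketch — this is an illustration of the idea, not feedloop's actual implementation: shuffle the display order, remember which original output landed on the left, and invert the mapping when the rating comes back.

```python
import random

def present_pair(outputs):
    """Shuffle display order; return (left, right) plus the original index shown on the left."""
    left_index = random.randrange(2)  # 0 or 1, chosen uniformly
    return outputs[left_index], outputs[1 - left_index], left_index

def resolve_choice(outputs, left_index, picked_left):
    """Map the reviewer's left/right pick back to the original A/B outputs."""
    chosen_index = left_index if picked_left else 1 - left_index
    return outputs[chosen_index], outputs[1 - chosen_index]

outputs = ["Response A", "Response B"]
left, right, left_index = present_pair(outputs)
chosen, rejected = resolve_choice(outputs, left_index, picked_left=True)
assert chosen == left and rejected == right  # left pick always resolves to the left response
```

Because the reviewer only ever sees "left" and "right", any systematic left-side preference averages out across the randomized pairs.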


feedloop.wait()

# Block until a specific comparison is rated
result = feedloop.wait(comparison_id="abc123", timeout=60)
# → {"prompt": "...", "chosen": "...", "rejected": "...", "auto_skipped": False}

# Block until ALL pending comparisons in the session are rated
result = feedloop.wait(timeout=None)
# → {"completed": 10, "total": 12}

# Returns None on timeout

Useful when you want to act on feedback immediately — for example, in a pipeline that fine-tunes on each batch of ratings before generating the next round.


feedloop.status()

feedloop.status()
# → {"pending": 3, "completed": 7, "skipped": 1, "auto_skipped": 2, "total": 13}

Returns counts for the current session. Useful for progress checks in long-running scripts.
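
For example, a long-running script might poll status() and log a completion fraction. The helper below just does the arithmetic on the returned dict — the sample dict is hand-written to match the shape shown above, not live output:

```python
def completion_fraction(status: dict) -> float:
    """Fraction of comparisons resolved, counting skips and auto-skips as resolved."""
    resolved = status["completed"] + status["skipped"] + status["auto_skipped"]
    total = status["total"]
    return resolved / total if total else 1.0

sample = {"pending": 3, "completed": 7, "skipped": 1, "auto_skipped": 2, "total": 13}
print(f"{completion_fraction(sample):.0%} reviewed")  # → 77% reviewed
```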


feedloop.export()

count = feedloop.export(
    path="preferences.jsonl",   # output file path
    format="dpo",               # only "dpo" supported in v1.x
)

Exports all human-rated comparisons from the current session to JSONL. Auto-skipped comparisons are excluded. Returns the number of rows exported.


feedloop.stop()

feedloop.stop()

Shuts down the background server and closes the database connection. Called automatically via atexit when your script exits — but useful to call explicitly in notebooks or long-lived processes where you want to release resources before the session ends.


Uncertainty-Based Filtering

Only review comparisons where your model is unsure — skip the easy ones automatically:

feedloop.start(uncertainty_threshold=0.6)

feedloop.compare(
    prompt="...",
    outputs=[response_a, response_b],
    uncertainty=0.85,  # above threshold → sent to human
)

feedloop.compare(
    prompt="...",
    outputs=[response_a, response_b],
    uncertainty=0.3,   # below threshold → auto-skipped
)

The uncertainty score is provided by you — feedloop just filters on it. How you compute it depends on your model:

  • Open-weight models (Llama, Mistral): use token log-probabilities
  • Any API: sample the same prompt multiple times and measure response disagreement — high variance = high uncertainty
  • Always review everything: omit uncertainty entirely (default behavior)
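
One hedged sketch of the sampling approach: draw several responses to the same prompt and treat low average pairwise similarity as high uncertainty. The similarity metric here (difflib's character-level ratio) is just a placeholder — an embedding distance or token-level measure would usually work better:

```python
from difflib import SequenceMatcher
from itertools import combinations

def disagreement_uncertainty(samples: list[str]) -> float:
    """1.0 minus mean pairwise similarity over sampled responses (0.0 = all identical)."""
    if len(samples) < 2:
        return 0.0
    sims = [SequenceMatcher(None, a, b).ratio() for a, b in combinations(samples, 2)]
    return 1.0 - sum(sims) / len(sims)

# Identical samples → zero uncertainty; divergent samples → high uncertainty
assert disagreement_uncertainty(["same", "same", "same"]) == 0.0
assert disagreement_uncertainty(["yes", "absolutely not"]) > 0.5
```

The resulting score is already in the 0.0–1.0 range that compare() expects for its uncertainty argument.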

CLI Usage

You can run feedloop as a standalone review server — useful for reviewing data collected in a previous session:

feedloop --port 7856 --db ~/.feedloop/feedloop.db

Or via Python:

python -m feedloop --port 7856 --no-browser

Options:

Flag          Default                  Description
--port        7856                     Port to listen on
--db          ~/.feedloop/feedloop.db  Path to SQLite database
--no-browser  off                      Don't open browser automatically

Exported Data Format

{"prompt": "Explain recursion...", "chosen": "Imagine you're looking for a book...", "rejected": "Recursion is when a function calls itself..."}

Compatible with TRL DPOTrainer, OpenRLHF, and any custom pipeline that accepts preference pairs.
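
Reading the export back into a custom pipeline is plain line-by-line JSON parsing. A minimal loader — the sample record below is hand-written to match the documented schema, not real exported data:

```python
import json
import os
import tempfile

def load_preferences(path: str) -> list[dict]:
    """Parse a DPO-format JSONL file into prompt/chosen/rejected records."""
    records = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            if not line.strip():
                continue  # tolerate blank lines
            record = json.loads(line)
            assert {"prompt", "chosen", "rejected"} <= record.keys()
            records.append(record)
    return records

# Round-trip a sample record through a temporary file
sample = {"prompt": "Explain recursion...", "chosen": "Imagine...", "rejected": "Recursion is..."}
with tempfile.NamedTemporaryFile("w", suffix=".jsonl", delete=False) as f:
    f.write(json.dumps(sample) + "\n")
    path = f.name
try:
    assert load_preferences(path)[0]["chosen"] == "Imagine..."
finally:
    os.remove(path)
```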


Documentation

Full guide, API reference, and examples: turingspark.com/tools/feedloop



Download files

Download the file for your platform.

Source Distribution

feedloop-1.5.3.tar.gz (159.1 kB)


Built Distribution


feedloop-1.5.3-py3-none-any.whl (117.6 kB)


File details

Details for the file feedloop-1.5.3.tar.gz.

File metadata

  • Download URL: feedloop-1.5.3.tar.gz
  • Upload date:
  • Size: 159.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for feedloop-1.5.3.tar.gz

Algorithm    Hash digest
SHA256       a736a5efc24e4680780ce2aa6eedd43a43f280153de44b83783e53a3878db384
MD5          536053d062bb6062c0d42b464b11a2cf
BLAKE2b-256  3f506be672bb7a4692cd4f77a36930f9cf231a9d59d9bedd2e62fe5ac5a53e2b


Provenance

The following attestation bundles were made for feedloop-1.5.3.tar.gz:

Publisher: release.yml on turing-spark/feedloop

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file feedloop-1.5.3-py3-none-any.whl.

File metadata

  • Download URL: feedloop-1.5.3-py3-none-any.whl
  • Upload date:
  • Size: 117.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for feedloop-1.5.3-py3-none-any.whl

Algorithm    Hash digest
SHA256       711b5f4188574b3ed611be3b58535f36e3392cee9a96a5a41413c708e44540f0
MD5          e55689feca624a1e9e6f75b01b93ce16
BLAKE2b-256  bbb249fbe1f2619af23c612985a572efe4c23807f52a9c885220f0474ca5319f


Provenance

The following attestation bundles were made for feedloop-1.5.3-py3-none-any.whl:

Publisher: release.yml on turing-spark/feedloop

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
