Cross-platform pupil and gaze tracking library with a macOS-specific PySide6 desktop demo.
Pupil Tracker
A webcam-first pupil and gaze tracking library with a macOS desktop demo application.
The MVP provides a Python library plus PySide6/Qt demo shell for coarse gaze tracking experiments: webcam capture, MediaPipe-based iris/face observations, timed quality-gated 9-point calibration, post-calibration validation metrics, polynomial/ridge gaze calibration, 3x3 screen-region mapping, confidence-aware transparent gaze/validation overlays, a gaze heatmap, macOS visible-window candidate scoring, opt-in gaze-to-focus, and opt-in JSONL telemetry.
Gaze-assisted application focus is available only behind the explicit Gaze Focus toggle or PUPIL_TRACKER_GAZE_FOCUS_ENABLED=true. The default demo path still observes windows without focusing, raising, clicking, or activating them.
Status
This repo contains the first macOS-focused MVP implementation slices and automated tests. Hardware/live-GUI validation is still manual because the project uses the local camera, desktop overlay, and macOS window enumeration.
Implemented library/demo areas:
- Core immutable models for observations, calibration samples, gaze samples, and window candidates.
- 9-point calibration target generation and sample collection.
- Timed calibration phases with settle/capture/review windows and quality-gated retries.
- Polynomial/ridge calibration model.
- Post-calibration validation targets, validation session controller, and mean/median/max error metrics.
- Exponential moving-average gaze smoothing.
- 3x3 screen-region classification.
- Pluggable tracker backend protocol.
- OpenCV camera source.
- MediaPipe Tasks/FaceLandmarker-backed tracker adapter with injectable fakes for tests.
- Synchronous runtime pipeline for camera/backend/calibration/smoothing/region mapping.
- PySide6 desktop demo shell with camera, calibration, validation, overlay, heatmap, opt-in gaze focus, and telemetry controls.
- Transparent confidence-aware gaze overlay and validation target/prediction/error-line overlay.
- Gaze trail and heatmap verification helpers for live accuracy checks.
- macOS CoreGraphics visible-window enumeration, pure candidate scoring, and separate opt-in AppKit application activation.
- Privacy-conscious JSONL telemetry with no frame/video payloads by default.
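As a rough illustration of two of the smaller building blocks listed above, exponential moving-average smoothing and 3x3 screen-region classification can be sketched as follows. This is not the library's code: the names `EmaSmoother` and `region_3x3`, the `alpha` default, and the clamping of off-screen points are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class EmaSmoother:
    """Exponential moving average over 2D gaze points (hypothetical helper)."""

    alpha: float = 0.3  # assumed default; higher values follow new samples faster
    _state: Optional[tuple[float, float]] = field(default=None, init=False, repr=False)

    def update(self, x: float, y: float) -> tuple[float, float]:
        if self._state is None:
            # The first sample seeds the average.
            self._state = (x, y)
        else:
            px, py = self._state
            self._state = (
                self.alpha * x + (1.0 - self.alpha) * px,
                self.alpha * y + (1.0 - self.alpha) * py,
            )
        return self._state


def region_3x3(x: float, y: float, width: float, height: float) -> tuple[int, int]:
    """Return (row, column), each in 0..2, clamping off-screen points to the border cells."""
    col = min(2, max(0, int(3 * x / width)))
    row = min(2, max(0, int(3 * y / height)))
    return row, col
```

Smoothing before classification keeps a noisy estimate from flickering between neighbouring regions, at the cost of some lag after a real gaze shift.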
Requirements
- macOS for the desktop MVP.
- Python 3.11.
- uv installed.
- A webcam for live tracking.
- macOS camera permission for live camera usage.
- A MediaPipe FaceLandmarker model asset for real MediaPipe Tasks inference when using the default backend path.
Accessibility permission is not required for candidate scoring. The optional Gaze Focus mode uses AppKit application activation, does not synthesize clicks, and reports an in-app focus unavailable status if macOS refuses activation.
Setup
make sync
This runs uv sync --dev and installs the locked runtime/dev environment.
Distribution and Releases
This package has a cross-platform Python library core plus a macOS-specific desktop demo/window-focus layer. The PyPI metadata advertises the library as OS-independent while marking the macOS demo support, and macOS-only PyObjC dependencies are guarded with a Darwin environment marker. GitHub Actions runs CI, semantic release, package builds, and PyPI trusted publishing from ubuntu-latest.
Stable releases use SemVer tags generated by Python Semantic Release from Conventional Commits on main. The release workflow updates [project].version, creates the matching vMAJOR.MINOR.PATCH tag, builds the package, and publishes only when a release was created.
Publishing uses PyPI Trusted Publishing with the GitHub Actions environment named pypi; configure that environment as a trusted publisher in the PyPI project before the first semantic release.
Local release checks:
make release-check
make build
Verification
Run all automated checks:
make check
This runs:
- ruff check src apps tests
- ty check src apps tests
- pytest -v
Optional diff hygiene before committing:
git diff --check
Launch the Demo
Real tracker/calibration mode requires a MediaPipe FaceLandmarker model asset. The easiest path is to download the default model into models/:
make download-model
export PUPIL_TRACKER_MEDIAPIPE_MODEL=$(pwd)/models/face_landmarker.task
make run-demo
You can also point the demo at any compatible .task file directly:
PUPIL_TRACKER_MEDIAPIPE_MODEL=/absolute/path/to/face_landmarker.task make run-demo
If this variable is missing or points to a non-existent file, camera preview can still start, but tracker-backed calibration will show in-app setup guidance instead of failing silently.
make run-demo
The demo launches the PySide6 desktop shell. It does not start the camera on import; camera use happens only after explicit user interaction.
Expected manual path:
- Click Start Camera and confirm live preview.
- Center your face and confirm tracker annotations appear.
- Click Start Calibration and follow the 9 fullscreen targets. For replay-backed geometry experiments, use Start Edge-Dense Calibration for the 17-point edge/corner path, Start Top-Left Focus Calibration for the 25-point top-left v0 collapse path, or Start Top-Row Focus Calibration for the 33-point v0/v1 top-row collapse path.
- Hold gaze through each target's Settle and Capture phases.
- Confirm calibration completes with fit metrics.
- Click Start Validation and follow the validation targets.
- Confirm target dot, predicted dot, error line, and validation metrics are understandable.
- Move gaze around the screen and confirm overlay, 3x3 region, heatmap, and window-candidate debug text update plausibly.
- Optional: enable Gaze Focus and confirm the app under the current gaze candidate comes forward; disable it before continuing accuracy diagnostics if focus changes are distracting.
- If logging is enabled, confirm JSONL telemetry contains scalar events only and no frame/image payloads.
- Stop Camera or close the app and confirm camera/tracker/overlay/log resources are released.
Manual live testing should follow docs/manual-test-checklist.md.
Calibration, Validation, and Accuracy Checks
Calibration is accuracy-first. The demo intentionally asks for stable timed samples before trusting a fit:
- Settle: look at the target dot and hold still. Samples are ignored during this short window.
- Capture: keep looking at the target. Valid, confident observations are counted.
- Review: the session checks accepted/rejected counts. Low-quality targets are retried instead of silently advancing.
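The three phases above can be sketched as a single review function for one target. This is a minimal sketch, not the demo's controller: the `Observation` type, the phase durations, the confidence threshold, and the minimum accepted count are all hypothetical values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    t: float           # seconds since the target appeared
    valid: bool        # tracker produced a usable observation
    confidence: float  # 0.0..1.0


def review_target(
    observations: list[Observation],
    settle_s: float = 0.5,        # assumed settle duration
    capture_s: float = 1.5,       # assumed capture duration
    min_confidence: float = 0.6,  # assumed confidence gate
    min_accepted: int = 10,       # assumed minimum for a usable target
) -> tuple[int, int, bool]:
    """Return (accepted, rejected, retry) for one calibration target."""
    accepted = rejected = 0
    for obs in observations:
        if obs.t < settle_s or obs.t >= settle_s + capture_s:
            continue  # outside the capture window: ignored, not rejected
        if obs.valid and obs.confidence >= min_confidence:
            accepted += 1
        else:
            rejected += 1
    # A target with too few accepted samples is retried rather than advanced.
    return accepted, rejected, accepted < min_accepted
```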
After calibration completes, run validation before judging tracking quality. Validation uses held-out target points and reports:
- Mean error: average distance between validation target and predicted gaze.
- Median error: typical distance, less sensitive to outliers.
- Max error: worst observed distance.
- Mean X error: average horizontal miss distance.
- Mean Y error: average vertical miss distance.
- Y bias: signed vertical offset; positive means predictions are lower on screen than the target, negative means predictions are higher.
- Grid accuracy: practical same-cell hit rate for a configurable validation grid. Defaults to 4x3, configurable with PUPIL_TRACKER_VALIDATION_GRID_COLUMNS and PUPIL_TRACKER_VALIDATION_GRID_ROWS.
- Recommendation: excellent, good, usable, or retry.
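To make the metric definitions concrete, the sketch below computes them from target/prediction pairs in screen pixels, with y increasing downward so that a positive Y bias means predictions land lower on screen. The function name, the input shape, and the clamped cell lookup are assumptions for the example; the recommendation thresholds are not documented here and are left out.

```python
import math
from statistics import mean, median


def validation_metrics(pairs, width, height, columns=4, rows=3):
    """pairs: list of ((target_x, target_y), (predicted_x, predicted_y)) in pixels."""

    def cell(x, y):
        # Grid cell containing the point, clamped to the screen.
        col = min(columns - 1, max(0, int(columns * x / width)))
        row = min(rows - 1, max(0, int(rows * y / height)))
        return row, col

    dx = [p[0] - t[0] for t, p in pairs]
    dy = [p[1] - t[1] for t, p in pairs]  # positive: prediction is lower on screen
    dist = [math.hypot(a, b) for a, b in zip(dx, dy)]
    hits = sum(cell(*t) == cell(*p) for t, p in pairs)
    return {
        "mean_error": mean(dist),
        "median_error": median(dist),
        "max_error": max(dist),
        "mean_x_error": mean(abs(a) for a in dx),
        "mean_y_error": mean(abs(b) for b in dy),
        "y_bias": mean(dy),  # signed, so opposite misses cancel out
        "grid_accuracy": hits / len(pairs),
    }
```

Because Y bias is signed and Mean Y error is not, a run can show a large Mean Y error with a small Y bias when vertical misses point in both directions.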
Use the validation overlay to diagnose failures. The target dot is where you should look, the predicted dot is the calibrated estimate, and the line between them is the current error. If vertical tracking feels weaker than horizontal tracking, compare Mean X error, Mean Y error, and Y bias before changing model settings:
- High Y bias in the same direction across runs usually means camera angle, seating position, or calibration posture is systematically offset. Reposition the camera, reduce head pitch, improve lighting, and recalibrate.
- High Mean Y error with low signed Y bias usually means vertical estimates are noisy or compressed around the center. That points to feature extraction improvements rather than blind model tuning.
- High X and Y error together usually means the full calibration was poor; improve face visibility/head stability and retry.
If the recommendation is retry, improve lighting/camera position, reduce head movement, and recalibrate.
Calibration targets are shown fullscreen because the fitted model maps observations to full-monitor coordinates. The outer 9-point targets remain inset from the physical edges so they sample the usable screen area without forcing hard-to-hold edge fixations.
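A minimal sketch of inset 9-point target generation in full-monitor coordinates follows. The function name and the 10% inset are assumptions for illustration; the demo's actual inset is not stated here.

```python
def nine_point_targets(width: int, height: int, inset: float = 0.1) -> list[tuple[float, float]]:
    """Return a 3x3 grid of (x, y) targets in pixels, row by row from the top-left.

    `inset` is the assumed fraction of each dimension kept clear at every edge.
    """
    fractions = (inset, 0.5, 1.0 - inset)
    return [(fx * width, fy * height) for fy in fractions for fx in fractions]
```

For example, `nine_point_targets(1512, 982)` places the centre target at the screen centre and keeps the outer eight away from the physical edges.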
Start Edge-Dense Calibration is an experimental, non-default geometry check for top-row and edge/corner failures seen in replay analysis. It uses 17 targets: denser top/bottom edge rows, upper/lower quadrant points aligned with validation hot spots, and middle left/center/right anchors. Use it for a fresh logged manual validation run before changing live defaults.
Start Top-Left Focus Calibration is a second experimental, non-default geometry check for the persistent v0 top-left collapse. It uses 25 targets: the edge-dense broad anchors plus a 3x3 local cluster around the held-out top-left validation region (0.25, 0.25). Use it only for logged repeat-run comparison against the latest edge-dense runs; do not treat one improved run as a default-change signal.
Start Top-Row Focus Calibration is a third experimental, non-default geometry check for paired v0/v1 top-row collapse. It uses 33 targets: broad edge anchors plus two 3x3 local clusters around the held-out top validation regions (0.25, 0.25) and (0.75, 0.25). Use it only after top-left focus evidence shows top-row failures move laterally or affect both top validation targets; compare predicted-cell distributions for v0, v1, v3, and v4 before changing defaults.
For longer live checks, enable Show Heatmap and stare at fixed points. The heatmap should cluster where you hold your gaze. Use Clear Heatmap between trials.
Gaze Focus is off by default. Turn on the Gaze Focus button, or launch with PUPIL_TRACKER_GAZE_FOCUS_ENABLED=true, to immediately activate the current visible-window candidate when calibrated gaze lands on it. The activation path is app-level focus by macOS process id; it does not click inside the target app and it avoids repeated activation while gaze remains on the same candidate.
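The opt-in flag and the "no repeated activation" rule can be sketched as below. This is illustrative only: `gaze_focus_enabled`, `FocusGate`, and the injected `activate` callable are hypothetical names, the real activation goes through AppKit and is not shown, and treating only the literal value `true` as enabled is an assumption.

```python
import os
from collections.abc import Mapping
from typing import Callable, Optional


def gaze_focus_enabled(env: Optional[Mapping[str, str]] = None) -> bool:
    """Read the opt-in flag; any value other than "true" leaves focus off (assumed parsing)."""
    env = os.environ if env is None else env
    return env.get("PUPIL_TRACKER_GAZE_FOCUS_ENABLED", "").strip().lower() == "true"


class FocusGate:
    """Calls `activate(pid)` when the candidate changes, never while gaze stays on it."""

    def __init__(self, activate: Callable[[int], bool]) -> None:
        self._activate = activate  # injected so tests can pass a fake
        self._last_pid: Optional[int] = None

    def on_candidate(self, pid: Optional[int]) -> bool:
        if pid is None or pid == self._last_pid:
            return False  # no candidate, or same candidate as last time
        self._last_pid = pid
        return self._activate(pid)
```

Injecting the activation callable keeps the gate testable without macOS APIs, in line with the hardware-free testing convention.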
Privacy and Telemetry
The app is privacy-conscious by default:
- No camera video is recorded by default.
- No frame/image arrays are written to telemetry by default.
- JSONL telemetry is opt-in through Start Logging / Stop Logging controls.
- Default demo telemetry path is under metrics/, which is ignored by git.
- Telemetry serializers include scalar summaries such as timestamps, gaze coordinates, confidence, calibration target ids, sample counts, calibration quality, feature diagnostics, replayable scalar feature samples, validation samples, validation metrics, and visible-window candidate metadata.
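One way to enforce a scalar-only rule at the serializer boundary is sketched below. The record layout, the `encode_event` and `append_event` names, and the `gaze_sample` event type in the usage note are hypothetical and do not describe the project's actual schema.

```python
import json
from pathlib import Path

SCALAR_TYPES = (str, int, float, bool, type(None))


def encode_event(event_type: str, timestamp: float, **fields) -> str:
    """Encode one event as a JSON line, rejecting nested (frame-like) payloads."""
    for name, value in fields.items():
        items = value if isinstance(value, (list, tuple)) else [value]
        if not all(isinstance(item, SCALAR_TYPES) for item in items):
            raise TypeError(f"field {name!r} is not a scalar or flat list of scalars")
    return json.dumps({"type": event_type, "timestamp": timestamp, **fields})


def append_event(path: Path, line: str) -> None:
    """Append one encoded event to a JSONL file, creating the directory if needed."""
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as handle:
        handle.write(line + "\n")
```

A call such as `encode_event("gaze_sample", 12.5, x=0.25, y=0.75, confidence=0.9)` yields one line, while passing a 2D array as a field raises `TypeError` before anything reaches disk.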
After a logged calibration run, inspect feature separability with:
uv run python tools/analyze_feature_diagnostics.py metrics/demo.jsonl
Use the report to compare top/center/bottom feature deltas before adding new gaze features or tuning the calibration model.
After a fresh logged calibration and validation run, compare calibration model variants offline with:
uv run python tools/evaluate_calibration_models.py metrics/demo.jsonl --screen-width 1512 --screen-height 982 --grid-columns 4 --grid-rows 3 --objective grid --calibration-sample-window middle
Use the same screen dimensions as the manual run. The evaluator uses only calibration_replay_sample and validation_replay_sample scalar payloads, so it can compare candidate models without saving frames or re-running the camera session. Use --calibration-sample-window all|early|middle|late to test whether target-capture timing affects the model fit. The evaluator also includes replay-only target-weighted candidates for vertical edges, screen edges, and corners, replay-only asymmetric quadrant correction candidates, plus replay-only vertical-bias and per-band correction candidates; these are comparisons, not live behavior. Add --include-target-residuals when a run regresses to append per-target calibration and validation residual tables for the top-ranked model.
When two logged live runs disagree, compare target-specific validation behavior before changing defaults:
uv run python tools/analyze_repeat_run_diagnostics.py metrics/demo.jsonl --run START1:END1 --run START2:END2 --screen-width WIDTH --screen-height HEIGHT --grid-columns 4 --grid-rows 3
The repeat-run analyzer uses scalar validation_sample, validation_metrics, and calibration_replay_sample events, trims each run to the latest metrics sample window per target, and reports signed residual shifts, grid collapse/recovery flags, predicted grid-cell distributions, and calibration feature-drift deltas with named dominant feature changes. It does not require or emit frames, screenshots, or landmark dumps.
To validate the live late-sample policy without changing the default, launch the demo with:
PUPIL_TRACKER_CALIBRATION_SAMPLE_WINDOW=late PUPIL_TRACKER_MEDIAPIPE_MODEL=$(pwd)/models/face_landmarker.task make run-demo
The default live calibration sample window remains all; use late only for replay-backed manual validation until a fresh run confirms it improves practical grid/window selection.
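The window names can be read as slices of each target's captured samples. The sketch below assumes that early, middle, and late are consecutive thirds in capture order; the project's actual partitioning is not documented here, and `select_window` is a hypothetical helper.

```python
def select_window(samples: list, window: str = "all") -> list:
    """Return the samples of one target that fall in the requested window.

    Assumed semantics: early/middle/late are consecutive thirds of the capture.
    """
    if window == "all":
        return list(samples)
    if window not in ("early", "middle", "late"):
        raise ValueError(f"unknown sample window: {window!r}")
    n = len(samples)
    first, second = n // 3, (2 * n) // 3
    bounds = {"early": (0, first), "middle": (first, second), "late": (second, n)}
    start, stop = bounds[window]
    return list(samples[start:stop])
```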
To test the opt-in solvePnP-style pose-geometry suffix during calibration capture, launch with:
PUPIL_TRACKER_SOLVEPNP_STYLE_FEATURES=true PUPIL_TRACKER_MEDIAPIPE_MODEL=$(pwd)/models/face_landmarker.task make run-demo
The default live MediaPipe feature vector remains the stable 23-feature vector. The solvePnP-style suffix appends chin and mouth geometry for scalar diagnostics and replay experiments only; keep it off unless a manual run is explicitly testing that hypothesis.
To test the opt-in posture/head-pose stability gate during calibration capture, launch with a positive feature-drift threshold:
PUPIL_TRACKER_POSTURE_STABILITY_MAX_DELTA=0.05 PUPIL_TRACKER_MEDIAPIPE_MODEL=$(pwd)/models/face_landmarker.task make run-demo
The posture gate compares each target's captured samples against the first accepted sample for that target using the head-pose proxy features: roll, yaw, and pitch. Samples whose selected feature drift exceeds the threshold are rejected before calibration storage. Calibration start logs a scalar calibration_config event with the active path, target count, model, sample window, screen size, posture threshold, and posture feature indices so repeat-run analysis can confirm the exact test condition. Keep this gate experimental until a logged validation run shows better 4x3 grid accuracy without moving failures to other targets.
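The gate described above can be sketched as a filter over one target's samples. The drift measure used here (largest absolute per-feature difference from the first accepted sample) and the `posture_gate` name are assumptions; the source only states that drift beyond the threshold causes rejection.

```python
from collections.abc import Sequence


def posture_gate(
    samples: Sequence[Sequence[float]],
    feature_indices: Sequence[int],
    max_delta: float,
) -> tuple[list[Sequence[float]], list[Sequence[float]]]:
    """Split one target's samples into (accepted, rejected) by posture drift."""
    accepted: list[Sequence[float]] = []
    rejected: list[Sequence[float]] = []
    for sample in samples:
        if not accepted:
            accepted.append(sample)  # the first sample becomes the reference
            continue
        reference = accepted[0]
        # Assumed drift measure: largest absolute change among the selected features.
        drift = max(abs(sample[i] - reference[i]) for i in feature_indices)
        (accepted if drift <= max_delta else rejected).append(sample)
    return accepted, rejected
```

Rejected samples never reach calibration storage, so a target whose posture moved mid-capture contributes fewer samples and may trigger a retry.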
To test the opt-in posture-plus-face-context stability gate, launch with:
PUPIL_TRACKER_CONTEXT_STABILITY_MAX_DELTA=0.012 PUPIL_TRACKER_MEDIAPIPE_MODEL=$(pwd)/models/face_landmarker.task make run-demo
Only one stability gate may be active per run. The context gate uses scalar face context plus posture indices 14,15,16,17,18,20,21,22 and logs the generic stability_gate_name, stability_gate_max_delta, and stability_gate_feature_indices fields in calibration_config. Keep it opt-in and judge it by decision-aware accepted/rejected counts plus validation grid accuracy; outside-envelope replay is a risk signal, not a promotion rule.
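The one-gate-per-run rule could be checked at startup along these lines. This is a sketch under assumptions: the gate labels `posture` and `context`, the `active_stability_gate` name, and raising an error when both thresholds are positive are illustrative choices, and the demo may resolve the conflict differently.

```python
import os
from collections.abc import Mapping
from typing import Optional

# Environment variables from the launch examples, keyed by a hypothetical gate label.
GATE_VARIABLES = {
    "posture": "PUPIL_TRACKER_POSTURE_STABILITY_MAX_DELTA",
    "context": "PUPIL_TRACKER_CONTEXT_STABILITY_MAX_DELTA",
}


def active_stability_gate(env: Optional[Mapping[str, str]] = None) -> Optional[tuple[str, float]]:
    """Return (gate_label, max_delta) for the single active gate, or None if both are off."""
    env = os.environ if env is None else env
    active = []
    for gate, variable in GATE_VARIABLES.items():
        value = float(env.get(variable, "") or "0")
        if value > 0.0:  # only a positive threshold enables a gate
            active.append((gate, value))
    if len(active) > 1:
        raise ValueError("only one stability gate may be active per run")
    return active[0] if active else None
```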
Any future video/frame capture feature must be explicit opt-in and documented separately.
Known MVP Limitations
- Commodity webcam gaze tracking is coarse; expect screen-region/window-level utility, not pixel-perfect cursor replacement.
- Accuracy depends heavily on lighting, camera placement, face visibility, head movement, and calibration quality.
- The demo is macOS-first and developer-oriented; Windows/Linux packaging is out of scope for the MVP.
- Multi-monitor behavior is not fully specified.
- The MediaPipe backend uses the installed MediaPipe Tasks API; real inference requires an appropriate FaceLandmarker model asset path.
- Live GUI/hardware behavior still needs manual validation on each target Mac.
- By default, the app enumerates and scores visible windows for debug purposes only and does not change focus; focus changes happen only when the opt-in Gaze Focus mode is enabled.
Repository Layout
pupil-tracker/
  src/pupil_tracker/         # importable library package
  apps/desktop_demo/         # PySide6 desktop demo app
  tests/                     # unit and headless smoke tests
  docs/
    requirements.md          # interview decisions and MVP requirements
    plans/                   # implementation plans
    manual-test-checklist.md
The demo app consumes the library rather than owning core tracking, calibration, or platform logic.
Development Conventions
- Use uv for dependency and lockfile management.
- Use make check before commits.
- Use standard-library logging through pupil_tracker.logging_config; avoid print/printf-style diagnostics in source code.
- Keep automated tests hardware-free: use fakes for OpenCV, MediaPipe, Qt, and CoreGraphics where possible.
- Keep core library behavior independent of Qt/OpenCV/MediaPipe where practical.
Documentation
Start here:
- docs/requirements.md: product/research decisions, MVP scope, non-goals, and resolved implementation choices.
- docs/plans/mvp.md: high-level implementation plan.
- docs/plans/implementation-task-plan.md: completed task-by-task TDD execution plan.
- docs/manual-test-checklist.md: manual live-camera/live-GUI validation steps.
Licensing Posture
The core project uses the MIT License and is permissive-first. GPL eye-tracking projects may be used as research references, but GPL code should not be copied into the core package. Optional GPL-compatible adapters may be considered later only with clear licensing boundaries.
Non-Goals for MVP
- Pixel-perfect mouse replacement.
- App/window focus changes by default (Gaze Focus remains explicitly opt-in).
- Windows or Linux support.
- Wayland global overlay/focus behavior.
- Video/frame recording by default.
- Product-polished UI.