Interactive Streamlit workbench for visualizing eye-tracking-while-reading scanpaths, computing reading measures, and exporting figures and tabular data.

Scanpath Studio

An interactive workbench for visualizing eye-tracking-while-reading data. Drop in a trial and see the scanpath the way the reader saw it: words at their true on-screen positions, fixations and saccades layered on top, a density heatmap, side-by-side trial comparisons, and animated replay — all tunable, and all exportable as publication-ready figures.

It is dataset-agnostic (auto-detects EyeLink / Gazepoint / snake-case columns) and ships with a small OneStop demo so you can try it with zero setup.

Authors: Omer Shubi, Keren Gruteke Klein, and others (TBD) — LACC Lab, Technion.

A scanpath replayed fixation by fixation over the text the reader saw (bundled OneStop demo).


Try it

Live demo (zero install): https://scanpath-studio.streamlit.app

Or run locally:

pip install scanpath-studio
scanpath-studio      # launches the app in your browser

What you can visualize

The plot is built from layers you can toggle independently:

  • Text — every word drawn at the exact pixel coordinates the participant saw.
  • Fixations — where the eye paused, sized and colored by any column in your data (duration, GPT-2 surprisal, word frequency, …).
  • Saccades — the jumps between fixations; backward jumps (regressions) stand out.
  • Areas of interest — word bounding boxes that tie each fixation to a word.
  • Heatmap — the trial aggregated into a word-level measure (total fixation duration, fixation count, …).

On top of the layered view:

  • Animated replay — watch the scanpath unfold fixation by fixation, at real or scaled speed; export it as interactive HTML or a self-playing GIF / MP4 clip for slides and papers.
  • Compare two trials — overlaid on one canvas or side-by-side (e.g. ordinary vs. information-seeking reading, first vs. repeated reading, L1 vs. L2).
  • Critical-span highlight — mark a region of interest (e.g. an answer span) by color or border to see at a glance whether it was read.
  • Out-of-text & by-line — flag fixations that land outside every word box, or color fixations by the text line they fall on.
  • Fully customizable — map any field to color, size, or axes; set the plot background (white or a neutral gray); every toggle, palette, and scale is independent.

Overlay a second reading to compare two readers of the same text on a shared real-time clock.


The four tabs

| Tab | What's there |
| --- | --- |
| Interactive Plot | The layered scanpath view, trial picker (by trial / text / participant), trial metadata, and two-trial comparison. |
| Animated Scanpath | Frame-by-frame replay; each frame lasts the actual fixation duration ÷ playback speed. Export as interactive HTML, GIF, or MP4. |
| Raw Data | Paginated word, fixation, and raw-gaze tables, each with CSV + Parquet download. |
| Data Statistics | Summary stats (mean fixation duration, saccade amplitude, regression rate, reading speed), a fixation-duration distribution, and a per-word reading-measure bar plot. |
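
The frame-timing rule and one of the summary statistics in the table can be sketched in a few lines. This is an illustration only, not the app's code: the function names are hypothetical, and treating a regression as "any saccade that lands on an earlier word index" is an assumption about how the rate is counted.

```python
def frame_durations_ms(fixation_durations_ms, playback_speed=1.0):
    """How long each replay frame stays on screen: duration / playback speed."""
    if playback_speed <= 0:
        raise ValueError("playback_speed must be positive")
    return [d / playback_speed for d in fixation_durations_ms]


def regression_rate(word_indices):
    """Fraction of saccades that land on an earlier word than they left.

    Assumption: refixations on the same word count as (non-regressive) saccades.
    """
    saccades = list(zip(word_indices, word_indices[1:]))
    if not saccades:
        return 0.0
    regressions = sum(1 for src, dst in saccades if dst < src)
    return regressions / len(saccades)


# Three fixations replayed at 2x speed, and a six-fixation word sequence.
frames = frame_durations_ms([200, 350, 120], playback_speed=2.0)
rate = regression_rate([0, 1, 1, 0, 1, 2])
```

At 2x speed each frame is shown for half its fixation duration, and the example sequence contains one backward jump among five saccades.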

The Scanpath Studio app


Reading measures from raw fixations

If your data only carries raw fixations, the app computes the canonical per-word measures itself (pre-aggregated EyeLink columns, if present, take precedence):

| Measure | Definition |
| --- | --- |
| FFD / first fixation duration | duration of the first fixation to land on the word |
| FPRT / gaze duration | sum of fixations from first entry until the eye first leaves |
| RPD / go-past time | sum of fixations from first entry until the eye first moves past the word |
| TFD / dwell | sum of all fixations on the word |
| fixation count, skip, regression in/out, saccade amplitude | standard reading-research flags and counts |

Definitions follow Rayner (1998) and Inhoff & Radach (1998).
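
As a rough sketch of how the four duration measures follow from a fixation sequence, the snippet below applies the definitions in the table literally. It is not the package's implementation: `word_measures` is a hypothetical helper, the input is assumed to be a time-ordered list of (word index, duration in ms) pairs, and refinements such as excluding first-pass time for words that were initially skipped are deliberately left out.

```python
def word_measures(fixations):
    """Compute FFD, FPRT, RPD and TFD per word.

    fixations: list of (word_index, duration_ms) in temporal order.
    """
    measures = {}
    for i, (word, dur) in enumerate(fixations):
        if word not in measures:
            # First entry into this word.
            # FPRT: consecutive fixations on the word until the eye leaves it.
            fprt, j = 0, i
            while j < len(fixations) and fixations[j][0] == word:
                fprt += fixations[j][1]
                j += 1
            # RPD: everything until the eye first moves past the word
            # (regressions to earlier words are included).
            rpd, k = 0, i
            while k < len(fixations) and fixations[k][0] <= word:
                rpd += fixations[k][1]
                k += 1
            measures[word] = {"FFD": dur, "FPRT": fprt, "RPD": rpd, "TFD": 0}
        # TFD: every fixation on the word, regardless of pass.
        measures[word]["TFD"] += dur
    return measures


# Word 1 is read, the eye regresses to word 0, returns to word 1, then moves on.
sequence = [(0, 200), (1, 180), (1, 120), (0, 150), (1, 100), (2, 220)]
m = word_measures(sequence)
```

For word 1 in this sequence, FFD is 180 ms, FPRT is 300 ms, RPD is 550 ms (it includes the regression to word 0 and the return), and TFD is 400 ms.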


Triage your trials

Filter the trial pool by condition — information-seeking Hunting vs. ordinary Gathering reading, difficulty, first vs. repeated reading, answer correctness — or by your own annotations. Star favorites, tag trials (e.g. "To exclude"), and jot per-trial notes; download everything as a JSON sidecar and restore it in a later session.


Your data

Upload CSV, TSV, Parquet, or Feather tables for words/AoIs, fixations, and (optionally) raw gaze. Columns are auto-detected from common EyeLink, Gazepoint, and snake-case conventions; a sidebar Column mapping panel lets you override any guess. No single column uniquely identifies a trial? Map Trial ID to several columns (e.g. participant + paragraph + repeated reading) and a combined unique trial ID is built on the fly.
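
The idea behind a multi-column Trial ID can be sketched as follows. This is a minimal illustration under stated assumptions, not the app's code: `add_trial_id`, the `trial_id` field name, the `|` separator, and the example column names are all hypothetical.

```python
def add_trial_id(rows, key_columns, sep="|"):
    """Attach a combined 'trial_id' built from several columns to each row."""
    for row in rows:
        row["trial_id"] = sep.join(str(row[col]) for col in key_columns)
    return rows


# Neither participant nor paragraph alone identifies a trial here.
rows = [
    {"participant": "p01", "paragraph": 3, "repeated_reading": 0, "fix_dur": 210},
    {"participant": "p01", "paragraph": 3, "repeated_reading": 1, "fix_dur": 180},
]
add_trial_id(rows, ["participant", "paragraph", "repeated_reading"])
trial_ids = [row["trial_id"] for row in rows]
```

A separator that cannot occur inside the key values keeps the combined ID unambiguous.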

Areas of interest come straight from your word boxes — given as (x, y, width, height) or EyeLink's IA_LEFT/RIGHT/TOP/BOTTOM — the app never invents them. Fixations are tied to words by bounding-box containment (with a small nearest-word fallback); fixations that miss every box are flagged out-of-text.
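
Bounding-box containment with a nearest-word fallback might look like the sketch below. It is written under assumptions the text does not confirm: boxes are (x, y, width, height) with a top-left origin, distance is measured from the fixation to the nearest box edge, and the 20-pixel fallback radius is an invented value, not the app's actual threshold.

```python
import math


def assign_fixation(fx, fy, boxes, max_fallback_px=20.0):
    """Return the index of the word box for a fixation, or None if out-of-text.

    boxes: list of (x, y, width, height). A fixation inside a box has
    distance 0, so containment always beats the nearest-word fallback.
    """
    best, best_dist = None, math.inf
    for idx, (x, y, w, h) in enumerate(boxes):
        # Distance from the point to the rectangle (0 when inside it).
        dx = max(x - fx, 0.0, fx - (x + w))
        dy = max(y - fy, 0.0, fy - (y + h))
        dist = math.hypot(dx, dy)
        if dist < best_dist:
            best, best_dist = idx, dist
    return best if best_dist <= max_fallback_px else None


boxes = [(100, 100, 50, 20), (160, 100, 60, 20)]
inside = assign_fixation(120, 110, boxes)       # lands inside the first box
in_gap = assign_fixation(157, 110, boxes)       # in the gap, closer to the second box
far_away = assign_fixation(400, 400, boxes)     # misses every box by a wide margin
```

A fixation that returns `None` here corresponds to what the app flags as out-of-text.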

Real corpora come in many shapes, so the loader bends to fit:

  • One file per participant or text. Drop in several files at once (or pass a glob / list of paths to the API and CLI) and they're concatenated, with each row tagged by its source_file so filename-encoded metadata isn't lost.
  • Only one report. Have just an interest-area report, or just fixations? Load either one alone — the missing layer is simply skipped, and a words-only table still draws a heatmap from its own pre-aggregated reading measures.
  • Stimulus-level AoIs. Word boxes given once per text (no participant column) are broadcast across every reader of that text.
  • Fixations as word/AoI sequences. No pixel coordinates, only "which word"? Fixations are placed at the matching word-box centers (or, for character-level AoIs like PoTeC's, at the fixated character's box).

PoTeC (Potsdam Textbook Corpus — 75 readers × 12 German textbook texts, one fixation file per reading and stimulus-level AoIs) loads as a worked example of all four:

import scanpath_studio as sps

words, fixations = sps.load_potec("data/PoTeC", download=True)   # ~45 MB on first call
fig = sps.plot_scanpath(words, fixations, "0", "b0", canvas_size=(1680, 1050))

Or, from the command line: scanpath-studio render --potec data/PoTeC -p 0 -t b0 -o potec.png

Heads-up: PoTeC's raw files can't be loaded through the generic upload flow — its trial/word ids live in filenames and fixation coordinates come from a separate character-AoI file. The dedicated loader handles that join. An in-app Public datasets source built on the same loaders is feature-flagged off for now and will appear in a future release.

When you upload your own tables and a required column can't be auto-detected, the app does not stop with an error. Instead, it shows your raw tables in the Raw Data tab so you can see the column names and finish the Column mapping in the sidebar.

Bulk export

One panel exports artifacts for every filtered trial into a single zip — per-trial PNG + SVG figures, the exact plot settings (plot_config.json), fixations, and per-word measures, plus aggregated tables across trials. Ideal for paper figures or building an image dataset of scanpaths for vision models.
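
To make the shape of such an archive concrete, here is a small sketch that packs a settings file and per-trial tables into one in-memory zip. Only the name plot_config.json comes from the description above; the `trials/<id>/fixations.csv` layout, the `build_export_zip` helper, and the example settings are assumptions for illustration, and figures are omitted.

```python
import io
import json
import zipfile


def build_export_zip(plot_config, trials):
    """Pack plot settings and per-trial CSV text into a single zip (as bytes).

    trials: dict mapping a trial ID to the CSV text of its fixations.
    """
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("plot_config.json", json.dumps(plot_config, indent=2))
        for trial_id, csv_text in trials.items():
            zf.writestr(f"trials/{trial_id}/fixations.csv", csv_text)
    return buf.getvalue()


data = build_export_zip(
    {"show_heatmap": False},
    {"p01_t3": "x,y,duration\n120,110,200\n"},
)
with zipfile.ZipFile(io.BytesIO(data)) as zf:
    names = zf.namelist()
    config = json.loads(zf.read("plot_config.json"))
```

Storing the settings next to the outputs is what lets a figure be regenerated later with exactly the same configuration.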


Command line & Python API

Everything the app draws is also available headless — same pipeline, same canonical figure.

CLI — render a trial straight to a file:

scanpath-studio render --sample --list-trials         # what's available
scanpath-studio render --sample -o scanpath.html      # interactive HTML
scanpath-studio render --words ia.csv --fixations fixations.csv \
    -p participant_1 -t trial_3 --no-heatmap -o figure.png
scanpath-studio render --fixations 'fixations/*.tsv' -o scanpath.png   # multi-file, fixations-only
scanpath-studio render --potec data/PoTeC -p 0 -t b0 -o potec.png      # PoTeC corpus
scanpath-studio render --sample --animate -o replay.html

HTML output needs no browser to generate; PNG/SVG/PDF export goes through Kaleido, which needs Chrome (install it once with plotly_get_chrome -y). See scanpath-studio render --help for the full set of layer toggles. Running scanpath-studio with no subcommand still launches the app and forwards any extra arguments to streamlit run (e.g. --server.port 8502).

Python API — the same canonical figures programmatically:

import scanpath_studio as sps

words, fixations = sps.load_scanpath_data("ia.csv", "fixations.csv")
sps.list_trials(words, fixations)                # (participant, trial) combos
fig = sps.plot_scanpath(words, fixations, "participant_1", "trial_3")
sps.save_figure(fig, "scanpath.html")            # or .png/.svg/.pdf
anim = sps.animate_scanpath(words, fixations, "participant_1", "trial_3")
measures = sps.compute_word_metrics(words, fixations)  # FFD/FPRT/RPD/TFD…

sps.load_sample_data() returns the bundled demo, and plot_scanpath / animate_scanpath accept every layer toggle and style option the app exposes (show_heatmap=False, color_by="pass_index", …).

load_scanpath_data also takes glob patterns or lists of paths, and either table may be omitted for single-report datasets:

# one fixation file per participant, no separate IA report
words, fixations = sps.load_scanpath_data(fixations="fixations/*.tsv")
# or a ready-made loader for PoTeC's multi-file, stimulus-AoI layout
words, fixations = sps.load_potec("data/PoTeC", readers=[0, 1], texts=["b0"])

Run from source

git clone https://github.com/lacclab/scanpath-studio.git
cd scanpath-studio
pip install -e ".[test]"          # or: uv sync
streamlit run streamlit_app.py

Tested on Python 3.11–3.13. Run the tests with pytest, and lint with ruff check --exclude other_vis . (the trailing dot is the path argument). See AGENTS.md for an architectural overview.


Citation

A system-demo paper is in preparation — citation TBD. Until then, cite the software via GitHub's "Cite this repository" button (generated from CITATION.cff).

If you use the bundled demo data, please cite the OneStop corpus:

@article{berzak2025onestop,
  title     = {{OneStop}: A 360-Participant {E}nglish Eye Tracking Dataset
               with Different Reading Regimes},
  author    = {Berzak, Yevgeni and Malmaud, Jonathan and Shubi, Omer
               and Meiri, Yoav and Lion, Ella and Levy, Roger},
  journal   = {Scientific Data},
  year      = {2025},
  publisher = {Nature Publishing Group},
  doi       = {10.1038/s41597-025-06272-2},
  url       = {https://www.nature.com/articles/s41597-025-06272-2},
}

The bundled demo is a subset of OneStop Eye Movements, used under its original license (docs).


License

MIT — see LICENSE.
