
Automatic classification and localization of fluctuating signals in spectrograms

TokEye

TokEye is an open-source Python application for automatic classification and localization of fluctuating signals. It was designed for plasma physics, but it can be used for any type of fluctuating signal.

Check out this poster from APS DPP 2025 or this preprint for more information.

Example Demonstration

Expected processing time:

  • V100: < 0.5 seconds on any size spectrogram after warmup.
  • CPU: ~5-10 seconds.

Quickstart

pip install tokeye   # or: uv tool install tokeye
tokeye app           # opens web app on http://localhost:7860
  • The default model downloads automatically from Hugging Face on first use (~30 MB, cached — no manual setup).
  • No data handy? Click "Load Example Signal" in the app, or generate one from the shell with tokeye example.
  • pip install requires Python >= 3.13; uvx/uv tool install fetch a compatible Python automatically.

Zero-install trial: uvx tokeye app runs the app without installing anything into your environment.

Batch processing (CLI)

For headless / scripted use (no browser needed), run inference directly:

tokeye run "shots/*.npy" --output-dir results

INPUT arguments can be files, directories (all *.npy files inside are used), or quoted glob patterns. Each input is interpreted by its shape:

  • 1D array — a raw time series. TokEye computes its STFT spectrogram using the flags below before running inference.
  • 2D array — a precomputed spectrogram, fed to the model directly.
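The shape-based handling described above can be pictured with a short NumPy sketch. It is an illustration only, not TokEye's implementation: the function name `to_spectrogram`, the Hann window, and the plain magnitude scaling are assumptions, while the `n_fft`, `hop`, and DC-bin defaults mirror the documented CLI flags.

```python
import numpy as np


def to_spectrogram(arr, n_fft=1024, hop=256, keep_dc=False):
    """Return a 2D magnitude spectrogram with shape (freq, time).

    1D input is treated as a raw time series and transformed with a
    Hann-windowed STFT (the window choice is an assumption); 2D input is
    taken to be a spectrogram already and is returned unchanged.
    """
    arr = np.asarray(arr, dtype=np.float32)
    if arr.ndim == 2:
        return arr
    if arr.ndim != 1:
        raise ValueError(f"expected a 1D or 2D array, got {arr.ndim}D")
    if arr.size < n_fft:
        raise ValueError("signal is shorter than one STFT window")

    # Slice the signal into overlapping frames of n_fft samples, one every hop samples.
    frames = np.lib.stride_tricks.sliding_window_view(arr, n_fft)[::hop]
    # Taper each frame, then take the one-sided FFT along the frame axis.
    spectrum = np.fft.rfft(frames * np.hanning(n_fft), axis=-1)
    spec = np.abs(spectrum).T  # (n_fft // 2 + 1, n_frames)
    # Drop the zero-frequency (DC) row unless asked to keep it.
    return spec if keep_dc else spec[1:]


signal = np.sin(2 * np.pi * 0.05 * np.arange(8192))
spec = to_spectrogram(signal)
print(spec.shape)  # (512, 29)
```

Row 0 of the result is the lowest retained frequency bin, so plotting it with matplotlib and origin='lower' puts low frequencies at the bottom.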

For each input file, tokeye run writes:

  • <stem>_mask.npy — float32 array, shape (2, H, W), sigmoid scores per pixel (channel 0 = coherent, channel 1 = transient).
  • <stem>_preview.png — a grayscale spectrogram with the mask overlaid (green = coherent, red = transient), unless --no-png is passed.
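To work with the saved scores in Python, a minimal sketch along these lines may help. The helper names `load_masks` and `coverage` are hypothetical, and using `>=` for the threshold comparison is an assumption; only the file layout (float32, shape (2, H, W), channel 0 = coherent, channel 1 = transient) comes from the description above.

```python
from pathlib import Path

import numpy as np


def load_masks(mask_path, threshold=0.5):
    """Load a <stem>_mask.npy file and threshold each channel to booleans."""
    scores = np.load(Path(mask_path))
    if scores.ndim != 3 or scores.shape[0] != 2:
        raise ValueError(f"expected shape (2, H, W), got {scores.shape}")
    coherent = scores[0] >= threshold   # channel 0: coherent activity
    transient = scores[1] >= threshold  # channel 1: transient activity
    return coherent, transient


def coverage(mask):
    """Fraction of spectrogram pixels flagged in a boolean mask."""
    return float(mask.mean())
```

For example, `coherent, transient = load_masks("results/myshot_mask.npy")` followed by `coverage(coherent)` gives a quick per-shot summary; raise or lower `threshold` to trade missed detections against false alarms.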

The process exit code is the number of files that failed.
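In a script, that exit code can be read back as a failure count. The sketch below uses a stand-in child process so it runs without TokEye installed; the helper name `count_failures` is hypothetical. Note that POSIX systems truncate exit statuses to the range 0 to 255, so treat very large counts with care.

```python
import subprocess
import sys


def count_failures(cmd):
    """Run a command and return its exit code, read as a failure count."""
    completed = subprocess.run(cmd, check=False)
    return completed.returncode


# Stand-in for `tokeye run ...`: a child Python process that exits with status 3.
demo_cmd = [sys.executable, "-c", "import sys; sys.exit(3)"]
failed = count_failures(demo_cmd)
print(f"{failed} file(s) failed" if failed else "all files processed")
```

In real use the command list would be something like `["tokeye", "run", "shots/*.npy", "--output-dir", "results"]`. Passing a list without a shell hands the glob pattern to TokEye unexpanded, which matches the quoted-glob usage shown above.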

Flags:

| Flag | Default | Description |
| --- | --- | --- |
| --model | big_tf_unet | Registry name or path to a .pt/.pt2 checkpoint. |
| --output-dir | tokeye_output | Directory for masks and previews. |
| --n-fft | 1024 | STFT window size (1D inputs only). |
| --hop | 256 | STFT hop size (1D inputs only). |
| --keep-dc | off | Keep the DC bin (dropped by default). |
| --clip-low / --clip-high | 1.0 / 99.0 | Percentile clip bounds applied to the spectrogram. |
| --threshold | 0.5 | Mask threshold used only for the preview PNG overlay. |
| --no-png | off | Skip preview PNGs; write masks only. |
| --device | auto | cpu, cuda, or auto. |

The released model was trained on spectrograms built with hop=128; for closest match to the training configuration use --hop 128.
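The percentile clip controlled by --clip-low and --clip-high can be pictured as follows. This is a sketch of the general technique with the documented defaults, not TokEye's code: the function name `percentile_clip` is hypothetical, and whether TokEye clips before or after any amplitude scaling is not stated here.

```python
import numpy as np


def percentile_clip(spec, clip_low=1.0, clip_high=99.0):
    """Clamp values outside the [clip_low, clip_high] percentile range."""
    lo, hi = np.percentile(spec, [clip_low, clip_high])
    return np.clip(spec, lo, hi)


rng = np.random.default_rng(0)
spec = rng.random((64, 64))
spec[0, 0] = 1e6  # a single extreme outlier
clipped = percentile_clip(spec)
print(clipped.max() < 1.0)  # True: the outlier no longer sets the value range
```

Clipping this way keeps one very bright or very dark pixel from compressing the rest of the spectrogram into a narrow band of values.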

On HPC clusters where compute nodes have no internet access, pre-fetch the weights on the login node, then run the batch job on the compute node:

tokeye download big_tf_unet   # on the login node; prints the cached path
tokeye run ... --model big_tf_unet   # on the compute node — model is already cached

Web app guide

tokeye app (or python -m tokeye.app) launches a Gradio interface with three tabs:

  • Analyze — load a signal, compute its spectrogram, run a model, and visualize the result. Guided for first-time use: the model dropdown defaults to the bundled big_tf_unet model, the STFT transform has working defaults, and "Load Example Signal" generates a synthetic demo signal so a brand-new user needs zero files. "Analyze" runs the whole load-model → infer → visualize pipeline in one click. View modes: Original, Enhanced (percentile-clipped amplitude), Mask (thresholded model output), Amplitude.
  • Annotate — manually draw and save mask annotations over a read-only backdrop image.
  • Utilities — audio-format conversion and .npy file inspection.

Flags:

tokeye app [--port 7860] [--share] [--open]

--share creates a public Gradio link; --open opens a browser tab on launch.

If you're on a remote server (e.g. an HPC login node), forward the port over SSH instead of using --share:

ssh -L 7860:localhost:7860 user@remote

Then open http://localhost:7860 in your local browser.
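If the page does not load, it can help to check whether anything is listening on the forwarded port. The helper below is a generic TCP check, not part of TokEye; the name `port_is_open` is hypothetical and the default port is the one documented above.

```python
import socket


def port_is_open(host="localhost", port=7860, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False
```

Running `port_is_open()` on your local machine returns False when the SSH tunnel is down or when tokeye app is not running on the remote side.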

Verified Datatypes

  • DIII-D Fast Magnetics (cite)
  • DIII-D CO2 Interferometer (cite)
  • DIII-D Electron Cyclotron Emission (cite)
  • DIII-D Beam Emission Spectroscopy (cite)

Evaluation

Recall Scores:

  • TJII2021: 0.8254
  • DCLDE2011 (Delphinus capensis): 0.7708
  • DCLDE2011 (Delphinus delphis): 0.7953
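For readers unfamiliar with the metric, recall is TP / (TP + FN): the fraction of labeled signal that the model finds. This page does not say whether the scores above are computed per pixel or per event, so the pixel-wise version below is only an assumption for illustration, and `pixel_recall` is a hypothetical helper.

```python
import numpy as np


def pixel_recall(predicted, reference):
    """Recall = TP / (TP + FN) over boolean masks of equal shape."""
    predicted = np.asarray(predicted, dtype=bool)
    reference = np.asarray(reference, dtype=bool)
    true_pos = np.logical_and(predicted, reference).sum()
    false_neg = np.logical_and(~predicted, reference).sum()
    total = true_pos + false_neg
    # Recall is undefined when the reference mask is empty.
    return float(true_pos / total) if total else float("nan")


reference = np.array([[1, 1], [1, 1]])
predicted = np.array([[1, 1], [1, 0]])
print(pixel_recall(predicted, reference))  # 0.75
```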

With more data come better models. Please contribute to the project!

Installation (from source / development)

uv is the dev tool for this repo:

git clone git@github.com:PlasmaControl/TokEye.git
cd TokEye
uv sync             # core deps
uv sync --dev       # + pytest, ruff, etc.
uv sync --group train  # + training deps (lightning, h5py, etc.)

This creates a .venv/; activate it with source .venv/bin/activate, or prefix commands with uv run.

Models

| Registry name | HF file | Description |
| --- | --- | --- |
| big_tf_unet | big_tf_unet_251210.pt | Transformer U-Net trained on multiscale (multiwindow, multihop) spectrograms. |

Weights are hosted on Hugging Face and download automatically the first time a registry name is used (cached in ~/.cache/huggingface). Override the source repo with the TOKEYE_HF_REPO environment variable.

To use a local checkpoint instead, put .pt/.pt2 files in a model/ directory (picked up by the app's model dropdown) or pass a path directly via --model PATH.

The model takes an input tensor of shape (B, 1, H, W), where B, H, and W can vary, and returns an output tensor of shape (B, 2, H, W).

For best performance, orient spectrograms so that the lowest frequency bin sits at the bottom when plotted with matplotlib using origin='lower'. Spectrograms should be standardized (mean = 0, std = 1). If baseline activity is very strong, clipping the input may help, but it is generally not needed.

The first output channel responds preferentially to coherent activity (useful for most tasks). The second output channel responds preferentially to transient activity.
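When calling a checkpoint directly rather than through the CLI, the input conventions above amount to a standardize-and-reshape step. The sketch below uses NumPy only and stops before the model call. The function name `prepare_input` and the small `eps` guard are assumptions, and the CLI may apply additional preprocessing not shown here.

```python
import numpy as np


def prepare_input(spec, eps=1e-8):
    """Standardize a (freq, time) spectrogram and add batch/channel axes."""
    spec = np.asarray(spec, dtype=np.float32)
    if spec.ndim != 2:
        raise ValueError("expected a 2D spectrogram")
    # Zero mean, unit standard deviation over the whole spectrogram.
    standardized = (spec - spec.mean()) / (spec.std() + eps)
    # Row 0 should be the lowest frequency bin (bottom with origin='lower').
    return standardized[np.newaxis, np.newaxis, :, :]  # (B=1, 1, H, W)


spec = np.random.default_rng(0).random((512, 29)).astype(np.float32)
batch = prepare_input(spec)
print(batch.shape)  # (1, 1, 512, 29)
```

The resulting array can be converted to a tensor for the model; the corresponding output would then have shape (1, 2, 512, 29), with coherent scores in channel 0 and transient scores in channel 1.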

Data

Keep signals as 1D numpy float arrays (raw time series) — no need to normalize or preprocess them. The CLI also accepts 2D arrays (precomputed spectrograms) directly. The app scans a signal directory for .npy files (default data/input, configurable in the Analyze tab).

Bringing your own data takes two lines:

import numpy as np

signal = ...  # any 1D float array: tokamak diagnostic, hydrophone, etc.
np.save("shots/myshot.npy", signal)

Then, from the shell:

tokeye run shots/myshot.npy --output-dir results

No data yet? tokeye example writes a synthetic demo signal you can run immediately, and the web app has a matching "Load Example Signal" button.
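As a concrete stand-in for the `signal = ...` placeholder, the sketch below builds a noisy chirp and saves it as a 1D float array. It is not what tokeye example generates; `make_chirp` and its parameters are hypothetical, and the file is written to a temporary directory so the sketch has no side effects on your project.

```python
import tempfile
from pathlib import Path

import numpy as np


def make_chirp(n_samples=65536, f_start=0.01, f_end=0.2, noise_std=0.5, seed=0):
    """Linear chirp plus white noise; frequencies are in cycles per sample."""
    # Instantaneous frequency rises linearly; phase is its running sum.
    freq = np.linspace(f_start, f_end, n_samples)
    phase = 2 * np.pi * np.cumsum(freq)
    noise = np.random.default_rng(seed).normal(0.0, noise_std, n_samples)
    return (np.sin(phase) + noise).astype(np.float32)


out_dir = Path(tempfile.mkdtemp()) / "shots"
out_dir.mkdir(parents=True, exist_ok=True)
signal = make_chirp()
np.save(out_dir / "chirp.npy", signal)
print(out_dir / "chirp.npy")
```

Pointing tokeye run at the printed path should then treat the file as a raw time series, since the array is 1D.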

Development

uv sync --dev
uv run ruff check .
uv run pytest

Citation

If you use this code in your research, please cite:

@article{chen_TokEye_2026,
  title={TokEye: Fast Signal Extraction for Fluctuating Time Series via Offline Self-Supervised Learning From Fusion Diagnostics to Bioacoustics},
  author={Chen, Nathaniel},
  year={2026},
  publisher={ArXiv},
  doi={10.48550/arXiv.2602.20317},
  url={https://www.arxiv.org/abs/2602.20317}
}

Contact

Nathaniel Chen — nathaniel [at] princeton [dot] edu — https://nathanielchen.net
