Automatic classification and localization of fluctuating signals in spectrograms
TokEye
TokEye is an open-source Python application for automatic classification and localization of fluctuating signals. It was designed for plasma physics, but it can be applied to any type of fluctuating signal.
Check out this poster from APS DPP 2025 or this preprint for more information.
Example Demonstration
Expected processing time:
- V100: < 0.5 seconds on any size spectrogram after warmup.
- CPU: ~5-10 seconds.
Quickstart
pip install tokeye # or: uv tool install tokeye
tokeye app # opens web app on http://localhost:7860
- The default model downloads automatically from Hugging Face on first use (~30 MB, cached — no manual setup).
- No data handy? Click "Load Example Signal" in the app, or generate one from the shell with tokeye example.
- pip install requires Python >= 3.13; uvx / uv tool install fetch a compatible Python automatically.
Zero-install trial: uvx tokeye app runs the app without installing anything into your environment.
Batch processing (CLI)
For headless / scripted use (no browser needed), run inference directly:
tokeye run "shots/*.npy" --output-dir results
INPUT arguments can be files, directories (all *.npy files inside are used), or quoted glob patterns. Each input is interpreted by its shape:
- 1D array — a raw time series. TokEye computes its STFT spectrogram using the flags below before running inference.
- 2D array — a precomputed spectrogram, fed to the model directly.
For each input file, tokeye run writes:
- <stem>_mask.npy: float32 array, shape (2, H, W), sigmoid scores per pixel (channel 0 = coherent, channel 1 = transient).
- <stem>_preview.png: a grayscale spectrogram with the mask overlaid (green = coherent, red = transient), unless --no-png is passed.
The process exit code is the number of files that failed.
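A saved mask can be post-processed with NumPy alone. The sketch below writes a random stand-in array with the documented shape and dtype so that it runs without TokEye installed. The file name demo_mask.npy is hypothetical, and the 0.5 cut-off simply mirrors the default value of --threshold; it is not necessarily the right choice for your data.

```python
import os
import tempfile

import numpy as np

# Stand-in for a file written by `tokeye run`: float32 scores in [0, 1), shape (2, H, W).
rng = np.random.default_rng(0)
path = os.path.join(tempfile.mkdtemp(), "demo_mask.npy")  # hypothetical file name
np.save(path, rng.random((2, 64, 128), dtype=np.float32))

mask = np.load(path)
coherent_scores, transient_scores = mask  # channel 0, channel 1

# Binarize each channel; 0.5 mirrors the default preview threshold.
threshold = 0.5
coherent = coherent_scores >= threshold
transient = transient_scores >= threshold

print(f"coherent pixels:  {coherent.mean():.1%}")
print(f"transient pixels: {transient.mean():.1%}")
```

The raw scores are kept in the .npy file, so a different threshold can be applied later without rerunning inference.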
Flags:
| Flag | Default | Description |
|---|---|---|
| --model | big_tf_unet | Registry name or path to a .pt/.pt2 checkpoint. |
| --output-dir | tokeye_output | Directory for masks and previews. |
| --n-fft | 1024 | STFT window size (1D inputs only). |
| --hop | 256 | STFT hop size (1D inputs only). |
| --keep-dc | off | Keep the DC bin (dropped by default). |
| --clip-low / --clip-high | 1.0 / 99.0 | Percentile clip bounds applied to the spectrogram. |
| --threshold | 0.5 | Mask threshold used only for the preview PNG overlay. |
| --no-png | off | Skip preview PNGs; write masks only. |
| --device | auto | cpu, cuda, or auto. |
The released model was trained on spectrograms built with hop=128; for closest match to the training configuration use --hop 128.
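The preprocessing that the STFT and clipping flags control can be approximated in plain NumPy. This is a minimal sketch, not TokEye's actual implementation: the Hann window, the log scaling, and the helper name stft_spectrogram are assumptions made for illustration. Only the parameter meanings (window size, hop, DC-bin removal, percentile clipping) come from the flag descriptions.

```python
import numpy as np


def stft_spectrogram(signal, n_fft=1024, hop=256, keep_dc=False,
                     clip_low=1.0, clip_high=99.0):
    """Illustrative STFT magnitude spectrogram (hypothetical helper, not TokEye's code)."""
    signal = np.asarray(signal, dtype=np.float64)
    if signal.ndim != 1 or signal.size < n_fft:
        raise ValueError("expected a 1D signal with at least n_fft samples")

    # Slice the signal into overlapping frames of length n_fft, hop samples apart.
    n_frames = 1 + (signal.size - n_fft) // hop
    starts = hop * np.arange(n_frames)
    frames = signal[starts[:, None] + np.arange(n_fft)[None, :]]

    # Window each frame (Hann is an assumption) and take the one-sided FFT.
    window = np.hanning(n_fft)
    spec = np.abs(np.fft.rfft(frames * window, axis=1)).T  # (freq, time)

    # Drop the DC bin unless asked to keep it.
    if not keep_dc:
        spec = spec[1:]

    # Log-scale (assumption), then clip to the given percentiles.
    spec = np.log1p(spec)
    lo, hi = np.percentile(spec, [clip_low, clip_high])
    return np.clip(spec, lo, hi).astype(np.float32)


# Example: a 2 s test tone at 1 kHz sampled at 10 kHz, using the training hop of 128.
t = np.arange(20_000) / 10_000.0
spec = stft_spectrogram(np.sin(2 * np.pi * 1_000.0 * t), hop=128)
print(spec.shape)
```

Halving the hop from 256 to 128 roughly doubles the number of time columns, so the output width changes while the number of frequency rows stays fixed by the window size.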
On HPC clusters where compute nodes have no internet access, pre-fetch the weights on the login node, then run the batch job on the compute node:
tokeye download big_tf_unet # on the login node; prints the cached path
tokeye run ... --model big_tf_unet # on the compute node — model is already cached
Web app guide
tokeye app (or python -m tokeye.app) launches a Gradio interface with three tabs:
- Analyze: load a signal, compute its spectrogram, run a model, and visualize the result. Guided for first-time use: the model dropdown defaults to the bundled big_tf_unet model, the STFT transform has working defaults, and "Load Example Signal" generates a synthetic demo signal so a brand-new user needs zero files. "Analyze" runs the whole load-model → infer → visualize pipeline in one click. View modes: Original, Enhanced (percentile-clipped amplitude), Mask (thresholded model output), Amplitude.
- Annotate: manually draw and save mask annotations over a read-only backdrop image.
- Utilities: audio-format conversion and .npy file inspection.
Flags: tokeye app [--port 7860] [--share] [--open] — --share creates a public Gradio link, --open opens a browser tab on launch.
If you're on a remote server (e.g. an HPC login node), forward the port over SSH instead of using --share:
ssh -L 7860:localhost:7860 user@remote
Then open http://localhost:7860 in your local browser.
Verified Datatypes
- DIII-D Fast Magnetics (cite)
- DIII-D CO2 Interferometer (cite)
- DIII-D Electron Cyclotron Emission (cite)
- DIII-D Beam Emission Spectroscopy (cite)
Evaluation
Recall Scores:
- TJII2021: 0.8254
- DCLDE2011 (Delphinus capensis): 0.7708
- DCLDE2011 (Delphinus delphis): 0.7953
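For reference, recall is the fraction of ground-truth positives that a prediction recovers, TP / (TP + FN). The sketch below computes it pixel-wise on boolean masks. Whether the scores listed here were computed per pixel or per event is not stated in this document, so the pixel-wise choice and the function name pixel_recall are assumptions.

```python
import numpy as np


def pixel_recall(pred, truth):
    """Recall = TP / (TP + FN) over boolean masks of equal shape."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    if pred.shape != truth.shape:
        raise ValueError("masks must have the same shape")
    true_pos = np.count_nonzero(pred & truth)
    false_neg = np.count_nonzero(~pred & truth)
    total = true_pos + false_neg
    # Recall is undefined when the ground truth has no positive pixels.
    return true_pos / total if total else float("nan")


truth = np.array([[1, 1, 0, 0],
                  [1, 1, 0, 0]], dtype=bool)
pred = np.array([[1, 0, 0, 1],
                 [1, 1, 0, 0]], dtype=bool)
print(pixel_recall(pred, truth))  # 3 of 4 true pixels recovered -> 0.75
```

Recall ignores false positives (the extra predicted pixel in the example does not lower the score), so it is usually read alongside a precision-type metric.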
With more data come better models. Please contribute to the project!
Installation (from source / development)
uv is the dev tool for this repo:
git clone git@github.com:PlasmaControl/TokEye.git
cd TokEye
uv sync # core deps
uv sync --dev # + pytest, ruff, etc.
uv sync --group train # + training deps (lightning, h5py, etc.)
This creates a .venv/; activate it with source .venv/bin/activate, or prefix commands with uv run.
Models
| Registry name | HF file | Description |
|---|---|---|
| big_tf_unet | big_tf_unet_251210.pt | Transformer U-Net trained on multiscale (multiwindow, multihop) spectrograms. |
Weights are hosted on Hugging Face and download automatically the first time a registry name is used (cached in ~/.cache/huggingface). Override the source repo with the TOKEYE_HF_REPO environment variable.
To use a local checkpoint instead, put .pt/.pt2 files in a model/ directory (picked up by the app's model dropdown) or pass a path directly via --model PATH.
Input should be a tensor of shape (B, 1, H, W), where B, H, and W can vary. Output will be a tensor of shape (B, 2, H, W).
The model performs best when spectrograms are oriented with the lowest frequency bin at the bottom when plotted with matplotlib using origin='lower' (that is, frequency increases with row index). Spectrograms should be standardized (mean = 0, std = 1). If baseline activity is very strong, clipping the input may help, but it is generally not needed.
The first output channel gives preferential measurements of coherent activity (useful for most tasks). The second output channel gives preferential measurements of transient activity.
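The input contract described here can be sketched in NumPy. The helper name to_model_input is hypothetical, and the small epsilon that guards against a zero standard deviation is an added assumption. If you load a checkpoint manually, the resulting array can be converted with torch.from_numpy before it is passed to the model.

```python
import numpy as np


def to_model_input(spec, eps=1e-8):
    """Standardize a (H, W) spectrogram and reshape it to (1, 1, H, W)."""
    spec = np.asarray(spec, dtype=np.float32)
    if spec.ndim != 2:
        raise ValueError("expected a 2D spectrogram of shape (H, W)")
    # Zero mean, unit standard deviation; eps avoids division by zero (assumption).
    spec = (spec - spec.mean()) / (spec.std() + eps)
    # Add batch and channel axes: (H, W) -> (B=1, C=1, H, W).
    return spec[None, None, :, :]


# Row 0 holds the lowest frequency bin, so imshow(..., origin="lower") draws it at the bottom.
spec = np.random.default_rng(0).random((512, 300))
x = to_model_input(spec)
print(x.shape, float(x.mean()), float(x.std()))
```

A model output of shape (1, 2, H, W) would then be indexed as out[0, 0] for the coherent channel and out[0, 1] for the transient channel.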
Data
Keep signals as 1D numpy float arrays (raw time series) — no need to normalize or preprocess them. The CLI also accepts 2D arrays (precomputed spectrograms) directly. The app scans a signal directory for .npy files (default data/input, configurable in the Analyze tab).
Bringing your own data takes two lines:
import numpy as np
signal = ... # any 1D float array: tokamak diagnostic, hydrophone, etc.
np.save("shots/myshot.npy", signal)
tokeye run shots/myshot.npy --output-dir results
No data yet? tokeye example writes a synthetic demo signal you can run immediately, and the web app has a matching "Load Example Signal" button.
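If you prefer to build a test signal yourself, the sketch below generates a noisy linear chirp and saves it as a 1D .npy file. It is not what tokeye example produces; the sampling rate, frequencies, noise level, and output file name are arbitrary choices for illustration.

```python
import os
import tempfile

import numpy as np

fs = 200_000          # sampling rate in Hz (arbitrary)
duration = 0.5        # seconds
t = np.arange(int(fs * duration)) / fs

# Linear chirp from 20 kHz to 60 kHz: the phase is the integral of the instantaneous frequency.
f0, f1 = 20_000.0, 60_000.0
phase = 2 * np.pi * (f0 * t + 0.5 * (f1 - f0) / duration * t**2)

# Add white Gaussian noise so the chirp has a realistic background.
rng = np.random.default_rng(0)
signal = np.sin(phase) + 0.5 * rng.standard_normal(t.size)

# Save as a 1D float array; a temporary directory keeps the example free of side effects.
out_path = os.path.join(tempfile.mkdtemp(), "synthetic_chirp.npy")
np.save(out_path, signal.astype(np.float32))
print(out_path, signal.shape)
```

The saved file can then be passed to tokeye run like any other 1D input.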
Development
uv sync --dev
uv run ruff check .
uv run pytest
Citation
If you use this code in your research, please cite:
@article{chen_TokEye_2026,
title={TokEye: Fast Signal Extraction for Fluctuating Time Series via Offline Self-Supervised Learning From Fusion Diagnostics to Bioacoustics},
author={Chen, Nathaniel},
year={2026},
publisher={ArXiv},
doi={10.48550/arXiv.2602.20317},
url={https://www.arxiv.org/abs/2602.20317}
}
Contact
Nathaniel Chen — nathaniel [at] princeton [dot] edu — https://nathanielchen.net