Automatic classification and localization of fluctuating signals in spectrograms

Project description

TokEye

TokEye is an open-source Python application for automatic classification and localization of fluctuating signals. It is designed for plasma physics but can be applied to any type of fluctuating signal.

See the poster from APS DPP 2025 or the preprint (arXiv:2602.20317) for more information.

Example Demonstration

Expected processing time:

V100: < 0.5 seconds on any size spectrogram after warmup.
CPU: ~5-10 seconds.

Quickstart

pip install tokeye   # or: uv tool install tokeye
tokeye app           # opens web app on http://localhost:7860

The default model downloads automatically from Hugging Face on first use (~30 MB, cached — no manual setup).
No data on hand? Click "Load Example Signal" in the app, or generate one from the shell with tokeye example.
pip install requires Python >= 3.13; uvx/uv tool install fetch a compatible Python automatically.

Zero-install trial: uvx tokeye app runs the app without installing anything into your environment.

Python API

To use TokEye inside your own program, import the TokEye class and call it — no configuration needed:

import numpy as np
from tokeye import TokEye

eye = TokEye()  # loads the default model (auto-downloads on first use)

mask = eye(signal)             # 1D time series → STFT → inference
mask = eye(spectrogram)        # 2D spectrogram → inference directly
coherent, transient = mask     # (2, H, W) sigmoid scores in [0, 1]

Input is auto-detected by shape: a 1D array is treated as a raw time series (TokEye computes the spectrogram), a 2D array as a ready spectrogram. Standardization happens internally — no preprocessing needed.
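Because the returned mask holds per-pixel scores rather than labels, a common next step is to threshold it and read off where the activity sits. The sketch below uses a synthetic array in place of a real TokEye output; the 0.5 cutoff and the helper name `bounding_box` are illustrative choices, not part of the TokEye API.

```python
import numpy as np

def bounding_box(binary):
    """Return (row_min, row_max, col_min, col_max) of True pixels, or None if empty."""
    rows = np.flatnonzero(binary.any(axis=1))
    cols = np.flatnonzero(binary.any(axis=0))
    if rows.size == 0:
        return None
    return int(rows[0]), int(rows[-1]), int(cols[0]), int(cols[-1])

# Stand-in for `mask = eye(signal)`: shape (2, H, W), scores in [0, 1].
mask = np.zeros((2, 64, 128), dtype=np.float32)
mask[0, 10:14, 20:90] = 0.9   # a fake coherent band
mask[1, 5:40, 100:103] = 0.8  # a fake transient burst

coherent, transient = mask
coherent_box = bounding_box(coherent > 0.5)
transient_box = bounding_box(transient > 0.5)
print(coherent_box, transient_box)
```

Rows index frequency bins and columns index time frames, so the box gives the frequency and time extent of each detection in pixel units.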

If your 2D spectrogram is stored in linear scale (raw STFT magnitude/power), pass log=True so TokEye applies log1p first — the model expects log-scaled input:

mask = eye(linear_spectrogram, log=True)      # per call
eye = TokEye(log=True)                        # or for every call

log is off by default and ignored for 1D inputs (the STFT already log-scales). Everything is configurable through the constructor, but the defaults just work:

eye = TokEye(
    model="big_tf_unet",   # registry name or path to a local .pt/.pt2
    device="auto",         # "cpu", "cuda", or "auto"
    n_fft=1024, hop=256,   # STFT settings (1D inputs only)
    clip_dc=True, clip_low=1.0, clip_high=99.0,
    log=False,             # log1p for linear-scale 2D spectrograms
)
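To make the constructor options concrete, here is a NumPy-only sketch of the kind of preprocessing they describe: STFT, log scaling, DC removal, percentile clipping, and standardization. This is not TokEye's implementation; the Hann window, the order of the steps, the lack of padding, and the function name `sketch_preprocess` are all assumptions made for illustration.

```python
import numpy as np

def sketch_preprocess(signal, n_fft=1024, hop=256, clip_dc=True,
                      clip_low=1.0, clip_high=99.0):
    """Illustrative 1D -> standardized log-spectrogram pipeline (assumed steps)."""
    # Frame the signal into overlapping windows of n_fft samples, hop apart.
    frames = np.lib.stride_tricks.sliding_window_view(signal, n_fft)[::hop]
    # Magnitude STFT with a Hann window; transpose to (freq, time).
    spec = np.abs(np.fft.rfft(frames * np.hanning(n_fft), axis=-1)).T
    spec = np.log1p(spec)                      # log scaling
    if clip_dc:
        spec = spec[1:]                        # drop the DC (0 Hz) bin
    lo, hi = np.percentile(spec, [clip_low, clip_high])
    spec = np.clip(spec, lo, hi)               # percentile clipping
    return (spec - spec.mean()) / (spec.std() + 1e-8)  # mean 0, std 1

rng = np.random.default_rng(0)
t = np.arange(16384)
signal = np.sin(2 * np.pi * 0.05 * t) + 0.1 * rng.standard_normal(t.size)
spec = sketch_preprocess(signal)
print(spec.shape)  # (512, 61)
```

With `n_fft=1024` the real FFT yields 513 frequency bins, and dropping the DC bin leaves 512 rows.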

Batch processing (CLI)

For headless / scripted use (no browser needed), run inference directly. For example:

tokeye run "files/*.npy" --output-dir results

INPUT arguments can be files, directories (all *.npy files inside are used), or quoted glob patterns. Each input is interpreted by its shape:

1D array — a raw time series. TokEye computes its STFT spectrogram using the flags below before running inference.
2D array — a precomputed spectrogram, fed to the model directly.
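The input rules above (files, directories, quoted globs) can be pictured with a small standard-library sketch. The helper `expand_inputs` is hypothetical, and it assumes directories are searched non-recursively; the actual CLI may resolve inputs differently.

```python
import glob
import tempfile
from pathlib import Path

def expand_inputs(args):
    """Resolve files, directories, and glob patterns to a sorted list of paths."""
    found = set()
    for arg in args:
        path = Path(arg)
        if path.is_dir():
            found.update(path.glob("*.npy"))               # .npy files directly inside
        elif path.is_file():
            found.add(path)
        else:
            found.update(Path(p) for p in glob.glob(arg))  # treat as a glob pattern
    return sorted(found)

# Demonstrate on a throwaway directory with empty placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("a.npy", "b.npy", "notes.txt"):
        (Path(tmp) / name).touch()
    from_dir = [p.name for p in expand_inputs([tmp])]
    from_glob = [p.name for p in expand_inputs([f"{tmp}/*.npy"])]
    print(from_dir, from_glob)
```

Quoting the pattern on the command line matters because it stops the shell from expanding it first, so the tool receives the pattern itself.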

For each input file, tokeye run writes:

<stem>_mask.npy — float32 array, shape (2, H, W), sigmoid scores per pixel (channel 0 = coherent, channel 1 = transient).
<stem>_preview.png — a grayscale spectrogram with the mask overlaid (green = coherent, red = transient), unless --no-png is passed.

The process exit code is the number of files that failed.
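Since the exit code carries the failure count, a wrapper script can act on it directly. The sketch below uses a stand-in command so that it runs without TokEye installed; `count_failures` is a hypothetical helper name.

```python
import subprocess
import sys

def count_failures(command):
    """Run a command and return its exit code, read here as a failure count."""
    return subprocess.run(command, check=False).returncode

# Stand-in command that exits with status 2, as a run with two failed files would.
# In practice the command would be something like:
#   ["tokeye", "run", "files/*.npy", "--output-dir", "results"]
failures = count_failures([sys.executable, "-c", "import sys; sys.exit(2)"])
if failures:
    print(f"{failures} file(s) failed")
```

Note that exit statuses are limited to the range 0 to 255 on POSIX systems, so a very large failure count cannot be reported exactly this way.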

Flags:

Flag	Default	Description
`--model`	`big_tf_unet`	Registry name or path to a `.pt`/`.pt2` checkpoint.
`--output-dir`	`tokeye_output`	Directory for masks and previews.
`--n-fft`	`1024`	STFT window size (1D inputs only).
`--hop`	`256`	STFT hop size (1D inputs only).
`--keep-dc`	off	Keep the DC bin (dropped by default).
`--clip-low` / `--clip-high`	`1.0` / `99.0`	Percentile clip bounds applied to the spectrogram.
`--log`	off	Apply `log1p` to 2D spectrogram inputs stored in linear scale (1D signals are always log-scaled during the STFT).
`--threshold`	`0.5`	Mask threshold used only for the preview PNG overlay.
`--no-png`	off	Skip preview PNGs; write masks only.
`--device`	`auto`	`cpu`, `cuda`, or `auto`.
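The preview overlay (green for coherent, red for transient, cut at `--threshold`) can be approximated in a few lines of NumPy. This is a sketch of the idea only: the alpha blending, the min-max rescaling, and the function name `overlay_preview` are assumptions, not the code TokEye uses to render its PNGs.

```python
import numpy as np

def overlay_preview(spec, mask, threshold=0.5, alpha=0.5):
    """Blend green (coherent) and red (transient) onto a grayscale spectrogram."""
    # Rescale the spectrogram to [0, 1] and replicate it into three RGB channels.
    gray = (spec - spec.min()) / (np.ptp(spec) + 1e-8)
    rgb = np.repeat(gray[..., None], 3, axis=-1)
    green = np.array([0.0, 1.0, 0.0])
    red = np.array([1.0, 0.0, 0.0])
    for channel, color in ((0, green), (1, red)):
        hit = mask[channel] > threshold
        rgb[hit] = (1 - alpha) * rgb[hit] + alpha * color
    return rgb

spec = np.linspace(0.0, 1.0, 12).reshape(3, 4)
mask = np.zeros((2, 3, 4))
mask[0, 0, 0] = 0.9   # one coherent pixel
mask[1, 2, 3] = 0.7   # one transient pixel
rgb = overlay_preview(spec, mask)
print(rgb.shape)  # (3, 4, 3)
```

The threshold affects only which pixels are coloured; the saved `_mask.npy` file keeps the raw scores.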

The released model was trained on spectrograms built with hop=128; for closest match to the training configuration use --hop 128.
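The hop size sets the time resolution of the spectrogram, so halving it roughly doubles the number of time frames. The arithmetic below assumes no padding at the signal edges, which may not match TokEye's STFT exactly; `n_frames` is a hypothetical helper.

```python
def n_frames(n_samples, n_fft=1024, hop=256):
    """Number of full STFT windows that fit in the signal (no padding assumed)."""
    if n_samples < n_fft:
        return 0
    return 1 + (n_samples - n_fft) // hop

n = 100_000
frames_default = n_frames(n, hop=256)   # 387
frames_training = n_frames(n, hop=128)  # 774
print(frames_default, frames_training)
```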

On HPC clusters where compute nodes have no internet access, pre-fetch the weights on the login node, then run the batch job on the compute node:

tokeye download big_tf_unet   # on the login node; prints the cached path
tokeye run ... --model big_tf_unet   # on the compute node — model is already cached

Web app guide

tokeye app (or python -m tokeye.app) launches a Gradio interface with three tabs:

Analyze — load a signal, compute its spectrogram, run a model, and visualize the result. Guided for first-time use: the model dropdown defaults to the bundled big_tf_unet model, the STFT transform has working defaults, and "Load Example Signal" generates a synthetic demo signal so a brand-new user needs zero files. "Analyze" runs the whole load-model → infer → visualize pipeline in one click. View modes: Original, Enhanced (percentile-clipped amplitude), Mask (thresholded model output), Amplitude.
Annotate — manually draw and save mask annotations over a read-only backdrop image.
Utilities — audio-format conversion and .npy file inspection.

Flags: tokeye app [--port 7860] [--share] [--open] — --share creates a public Gradio link, --open opens a browser tab on launch.

If you're on a remote server (e.g. an HPC login node), forward the port over SSH instead of using --share:

ssh -L 7860:localhost:7860 user@remote

Then open http://localhost:7860 in your local browser.

Verified Datatypes

DIII-D Fast Magnetics (cite)
DIII-D CO2 Interferometer (cite)
DIII-D Electron Cyclotron Emission (cite)
DIII-D Beam Emission Spectroscopy (cite)

Evaluation

Recall Scores:

TJII2021: 0.8254
DCLDE2011 (Delphinus capensis): 0.7708
DCLDE2011 (Delphinus delphis): 0.7953
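For reference, recall is the fraction of true positives that the model recovers, TP / (TP + FN). The sketch below computes it pixel by pixel against a binary ground-truth mask. Whether the scores above were computed per pixel or per event is not stated here, so treat the pixel-wise definition and the 0.5 threshold as assumptions.

```python
import numpy as np

def pixel_recall(pred_scores, truth, threshold=0.5):
    """Recall = TP / (TP + FN), computed over pixels of a binary ground-truth mask."""
    pred = pred_scores > threshold
    truth = truth.astype(bool)
    true_pos = np.count_nonzero(pred & truth)
    false_neg = np.count_nonzero(~pred & truth)
    if true_pos + false_neg == 0:
        return float("nan")  # no positive pixels in the ground truth
    return true_pos / (true_pos + false_neg)

truth = np.array([[1, 1, 0, 0],
                  [1, 1, 0, 0]])
scores = np.array([[0.9, 0.2, 0.8, 0.1],
                   [0.7, 0.6, 0.1, 0.1]])
recall = pixel_recall(scores, truth)
print(recall)  # 0.75
```

Recall ignores false positives (the 0.8 score outside the labelled region does not lower it), so it is usually read alongside precision.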

With more data come better models. Please contribute to the project!

Installation (from source / development)

uv is the dev tool for this repo:

git clone git@github.com:PlasmaControl/TokEye.git
cd TokEye
uv sync             # core deps
uv sync --dev       # + pytest, ruff, etc.
uv sync --group train  # + training deps (lightning, h5py, etc.)

This creates a .venv/; activate it with source .venv/bin/activate, or prefix commands with uv run.

Models

Registry name	HF file	Description
`big_tf_unet`	`big_tf_unet_251210.pt`	Transformer U-Net trained on multiscale (multiwindow, multihop) spectrograms.

Weights are hosted on Hugging Face and download automatically the first time a registry name is used (cached in ~/.cache/huggingface). Override the source repo with the TOKEYE_HF_REPO environment variable.

To use a local checkpoint instead, put .pt/.pt2 files in a model/ directory (picked up by the app's model dropdown) or pass a path directly via --model PATH.

Input should be a tensor of shape (B, 1, H, W), where B, H, and W can vary. Output will be a tensor of shape (B, 2, H, W).

The model performs best when row 0 of the spectrogram is the lowest frequency bin, so that the lowest frequency appears at the bottom when plotted with matplotlib using origin='lower'. Spectrograms should be standardized (mean = 0, std = 1); the TokEye class does this internally. If baseline activity is very strong, clipping the input may help, but this is generally not needed.

The first output channel responds preferentially to coherent activity (useful for most tasks). The second output channel responds preferentially to transient activity.
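When feeding the model directly, a 2D spectrogram needs standardizing and two extra axes. The NumPy sketch below shows only the shaping step; `to_model_input` is a hypothetical helper, and the small epsilon guarding against division by zero is an added assumption.

```python
import numpy as np

def to_model_input(spec):
    """Standardize a (H, W) spectrogram and add batch and channel axes -> (1, 1, H, W)."""
    spec = np.asarray(spec, dtype=np.float32)
    spec = (spec - spec.mean()) / (spec.std() + 1e-8)
    return spec[None, None, :, :]

# Row 0 holds the lowest frequency bin, so imshow(..., origin="lower") puts it at the bottom.
spec = np.random.default_rng(0).random((513, 200))
batch = to_model_input(spec)
print(batch.shape)  # (1, 1, 513, 200)
```

The resulting array can be converted with `torch.from_numpy(batch)` before being passed to a PyTorch checkpoint.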

Data

Keep signals as 1D numpy float arrays (raw time series) — no need to normalize or preprocess them. The CLI also accepts 2D arrays (precomputed spectrograms) directly. The app scans a signal directory for .npy files (default data/input, configurable in the Analyze tab).

Bringing your own data takes two lines:

import numpy as np

signal = ...  # any 1D float array: tokamak diagnostic, hydrophone, etc.
np.save("shots/myshot.npy", signal)

tokeye run shots/myshot.npy --output-dir results

No data yet? tokeye example writes a synthetic demo signal you can run immediately, and the web app has a matching "Load Example Signal" button.
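If you would rather build a test signal by hand, a noisy chirp is a simple choice because it draws a clear diagonal line in a spectrogram. The sketch below is not what tokeye example generates; the function name `make_chirp`, its parameters, and the temporary output directory are all illustrative.

```python
import tempfile
from pathlib import Path

import numpy as np

def make_chirp(n_samples=65536, f_start=0.01, f_end=0.2, noise=0.5, seed=0):
    """Linear chirp (frequencies in cycles per sample) buried in Gaussian noise."""
    rng = np.random.default_rng(seed)
    t = np.arange(n_samples, dtype=np.float64)
    # Instantaneous frequency sweeps linearly from f_start to f_end.
    phase = 2 * np.pi * (f_start * t + (f_end - f_start) * t**2 / (2 * n_samples))
    return (np.sin(phase) + noise * rng.standard_normal(n_samples)).astype(np.float32)

signal = make_chirp()
out_dir = Path(tempfile.mkdtemp())   # swap for your own data directory
out_path = out_dir / "chirp.npy"
np.save(out_path, signal)
print(out_path, signal.shape, signal.dtype)
```

The saved file is a 1D float array, so it can be passed to tokeye run like any other raw time series.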

Development

uv sync --dev
uv run ruff check .
uv run pytest

Citation

If you use this code in your research, please cite:

@article{chen_TokEye_2026,
  title={TokEye: Fast Signal Extraction for Fluctuating Time Series via Offline Self-Supervised Learning From Fusion Diagnostics to Bioacoustics},
  author={Chen, Nathaniel},
  year={2026},
  publisher={ArXiv},
  doi={10.48550/arXiv.2602.20317},
  url={https://www.arxiv.org/abs/2602.20317}
}

Contact

Nathaniel Chen — nathaniel [at] princeton [dot] edu — https://nathanielchen.net

Project details

Release history

0.11.0 (this version): Jul 5, 2026
0.10.0: Jul 5, 2026
0.9.5: Jun 22, 2026
0.9.0: Dec 26, 2025

Download files

Source Distribution

tokeye-0.11.0.tar.gz (3.2 MB)

Uploaded Jul 5, 2026 Source

Built Distribution

tokeye-0.11.0-py3-none-any.whl (3.2 MB)

Uploaded Jul 5, 2026 Python 3

File details

Details for the file tokeye-0.11.0.tar.gz.

File metadata

Download URL: tokeye-0.11.0.tar.gz
Upload date: Jul 5, 2026
Size: 3.2 MB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for tokeye-0.11.0.tar.gz
Algorithm	Hash digest
SHA256	`60799cb54449e2270508469ca698e50657b72945feb7595eae38a3437334ed5f`
MD5	`c3931f9af1b61147b6ecddd45374ca97`
BLAKE2b-256	`624d5a2b24907791b9f3a18733f30fc648330a2ca80825d01f7f53d02dd10c4a`

Provenance

The following attestation bundles were made for tokeye-0.11.0.tar.gz:

Publisher: python-publish.yml on PlasmaControl/tokeye

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: tokeye-0.11.0.tar.gz
- Subject digest: 60799cb54449e2270508469ca698e50657b72945feb7595eae38a3437334ed5f
- Sigstore transparency entry: 2074022062
- Sigstore integration time: Jul 5, 2026
Source repository:
- Permalink: PlasmaControl/tokeye@204bac291aed9ec89166c08707160cf3af16a3ea
- Branch / Tag: refs/tags/v0.11.0
- Owner: https://github.com/PlasmaControl
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: python-publish.yml@204bac291aed9ec89166c08707160cf3af16a3ea
- Trigger Event: release

File details

Details for the file tokeye-0.11.0-py3-none-any.whl.

File metadata

Download URL: tokeye-0.11.0-py3-none-any.whl
Upload date: Jul 5, 2026
Size: 3.2 MB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for tokeye-0.11.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`f010e431c481040a76ed6c4f4f56f857b451e43d0b3e82385db7c821bb91f426`
MD5	`c6edeca4f58bc70af8cf45632547a554`
BLAKE2b-256	`4c98f868d569ab6bdf294cf1af29aae2def44985efb82e6875e455904527be38`

Provenance

The following attestation bundles were made for tokeye-0.11.0-py3-none-any.whl:

Publisher: python-publish.yml on PlasmaControl/tokeye

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: tokeye-0.11.0-py3-none-any.whl
- Subject digest: f010e431c481040a76ed6c4f4f56f857b451e43d0b3e82385db7c821bb91f426
- Sigstore transparency entry: 2074022094
- Sigstore integration time: Jul 5, 2026
Source repository:
- Permalink: PlasmaControl/tokeye@204bac291aed9ec89166c08707160cf3af16a3ea
- Branch / Tag: refs/tags/v0.11.0
- Owner: https://github.com/PlasmaControl
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: python-publish.yml@204bac291aed9ec89166c08707160cf3af16a3ea
- Trigger Event: release
