Detect and redact PHI/PII from audio/video recordings, built on nophi

nophi-av

Detect and redact PHI/PII in audio/video recordings, built on top of nophi.

nophi-av is an implementation of the MedVidDeID pipeline with nophi as the text PHI-detection engine (in place of philter-ucsf). It exposes the same CLI shape as nophi — nophi-av scan … instead of nophi scan ….

Pipeline

For each input recording:

1. Extract audio: for video, the audio track is demuxed to 16 kHz mono WAV (audio inputs are used directly).
2. Transcribe: WhisperX produces a transcript with word-level timestamps (Whisper + forced alignment). Speaker diarization is optional and is enabled with --hf-token.
3. Detect PHI: nophi's text engine (build_analyzer / scan_text) runs over the transcript, so every nophi recognizer (names, drugs, addresses, dates, …) applies automatically.
4. Map spans → time: each detected character span is traced back to the words it covers, yielding a media time interval (see nophi_av/detect.py).
5. Scrub audio: each PHI interval is replaced with a beep (default) or silence (--redact-mode silence), preserving duration so video stays in sync.
6. Mask faces (video): every face found by the detector is masked according to --mask-mode (box, a solid fill, is the default; the alternatives are pixelate, blur, or none to skip), then the scrubbed audio is remuxed onto the masked video. The detector is swappable via --face-model: YOLO by default, SCRFD as an alternative (see Face detection models).
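
The span-to-time step can be illustrated with a minimal sketch. It is not the code in nophi_av/detect.py; the Word class, build_transcript, and span_to_interval are hypothetical names, and the sketch assumes the transcript is the aligned words joined by single spaces.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    start: float  # seconds
    end: float    # seconds


def build_transcript(words):
    """Join words with single spaces and record each word's (start, end) character offsets."""
    parts, offsets, pos = [], [], 0
    for w in words:
        if parts:
            pos += 1  # account for the joining space
        offsets.append((pos, pos + len(w.text)))
        parts.append(w.text)
        pos += len(w.text)
    return " ".join(parts), offsets


def span_to_interval(span, words, offsets):
    """Return the (start, end) time covering every word that overlaps the character span."""
    s, e = span
    hits = [w for w, (ws, we) in zip(words, offsets) if ws < e and we > s]
    if not hits:
        return None
    return min(w.start for w in hits), max(w.end for w in hits)


words = [
    Word("My", 17.0, 17.2), Word("name", 17.2, 17.5), Word("is", 17.5, 17.7),
    Word("James", 17.7, 18.1), Word("Carter", 18.1, 18.6),
]
text, offsets = build_transcript(words)
start = text.index("James Carter")
interval = span_to_interval((start, start + len("James Carter")), words, offsets)
```

Here interval covers both words of the detected name, so a span that crosses word boundaries still maps to one contiguous time window.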

Outputs per file: the redacted media (<name>_redacted.<ext>), a redacted transcript (<name>_transcript.txt), and a styled Excel findings report with entity type, replacement, time interval, and speaker.

Only the audio/transcript path uses nophi. Face masking is a separate visual step that doesn't touch nophi — it masks every detected face (detection, not recognition), the conservative choice for PHI removal.
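
As a rough illustration of what the box and pixelate modes do to a detected face region, here is a sketch using NumPy arrays as frames. The function mask_region and its block parameter are hypothetical and not part of nophi-av; blur is left out because it would typically rely on an image filter such as OpenCV's GaussianBlur.

```python
import numpy as np


def mask_region(frame, box, mode="box", block=8):
    """Return a copy of an H x W x C frame with box = (x1, y1, x2, y2) masked."""
    if mode == "none":
        return frame.copy()
    x1, y1, x2, y2 = box
    out = frame.copy()
    roi = out[y1:y2, x1:x2]  # a view into out, so edits below land in out
    if mode == "box":
        roi[:] = 0  # solid black fill
    elif mode == "pixelate":
        h, w = roi.shape[:2]
        for y in range(0, h, block):
            for x in range(0, w, block):
                tile = roi[y:y + block, x:x + block]
                # Replace every pixel in the tile with the tile's mean colour.
                mean = tile.reshape(-1, tile.shape[-1]).mean(axis=0)
                tile[:] = mean.astype(frame.dtype)
    else:
        raise ValueError(f"unsupported mode: {mode}")
    return out


frame = np.full((16, 16, 3), 200, dtype=np.uint8)
frame[4, 4] = 0  # one dark pixel inside the face region
boxed = mask_region(frame, (2, 2, 10, 10), mode="box")
pixelated = mask_region(frame, (0, 0, 8, 8), mode="pixelate", block=8)
```

Because every detected box is masked regardless of identity, the same function would be applied to each box returned for a frame.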

Supported formats

Audio: .mp3 .wav .m4a .flac .aac .ogg
Video: .mp4 .mov .avi .mkv .webm
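
For scripting around the CLI, the extension lists above and the <name>_redacted.<ext> output naming can be captured in a small helper. This is an illustrative sketch only; media_kind and redacted_name are hypothetical helpers, not functions exported by nophi-av.

```python
from pathlib import Path

AUDIO_EXTS = {".mp3", ".wav", ".m4a", ".flac", ".aac", ".ogg"}
VIDEO_EXTS = {".mp4", ".mov", ".avi", ".mkv", ".webm"}


def media_kind(path):
    """Classify a path as 'audio', 'video', or None (unsupported) by its extension."""
    ext = Path(path).suffix.lower()
    if ext in AUDIO_EXTS:
        return "audio"
    if ext in VIDEO_EXTS:
        return "video"
    return None


def redacted_name(path):
    """Build the <name>_redacted.<ext> file name for an input path."""
    p = Path(path)
    return f"{p.stem}_redacted{p.suffix}"
```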

Install

A single install includes everything — audio scrubbing, video face masking, and speaker diarization:

pip install -e packages/nophi     # the text engine it depends on
pip install -e packages/nophi-av  # everything: whisperx, torch, pydub, ultralytics, opencv, pyannote

To use the SCRFD face detector (--face-model scrfd), install the optional extra, which adds insightface + onnxruntime:

pip install -e 'packages/nophi-av[scrfd]'

This is a heavy install — torch, ultralytics, and opencv are all pulled in. ffmpeg must also be on your PATH (e.g. brew install ffmpeg).

The first scan downloads model weights (Whisper, the alignment model, nophi's spaCy/scispaCy models, and the YOLO face model). Warm every cache up front with:

nophi-av download-models --whisper-model small

Requirements & caveats

ffmpeg must be on PATH (e.g. brew install ffmpeg). WhisperX uses it to decode audio, and pydub uses it to encode .mp3/.m4a output. .wav in/out avoids any external ffmpeg dependency.
Face model — see Face detection models below for the available detectors and how --face-model resolves them. Use --mask-mode none to skip face masking entirely.

Face detection models

Face masking is modular: the pipeline only consumes per-frame bounding boxes, so the detector behind --face-model is swappable. Run nophi-av list-models to see the catalog, or pick from:

`--face-model`	Backend	Notes
`yolov12s-face.pt` (default)	YOLO (Ultralytics)	Community YOLOv12-s face weight — balanced.
`yolov8n-face.pt`	YOLO	Smallest/fastest, lower recall.
`yolov11m-face.pt`	YOLO	Larger, higher recall.
`scrfd`	SCRFD (insightface)	SCRFD-10G — strong on small/tilted/profile faces.
`scrfd-s`	SCRFD	Lighter SCRFD-500M variant.

How a spec is resolved:

YOLO — the default backend. Community face weights (yolov{8,10,11,12}{n,s,m,l}-face.pt) aren't official Ultralytics assets, so nophi-av fetches them from the yolo-face release and caches them in Ultralytics' weights dir on first use. You can also pass a local checkpoint path or an official Ultralytics name (e.g. yolov8n.pt, which detects people, not faces). Requires ultralytics>=8.3 for the YOLOv12 architecture (included).
SCRFD — selected by --face-model scrfd (or scrfd-s). insightface downloads and caches the model pack on first use. Needs the optional extra: pip install 'nophi-av[scrfd]'.

For PHI de-identification, recall matters more than precision (a missed face is a leak; an over-masked box is harmless), so SCRFD is a good upgrade when small or non-frontal faces are being missed.

Adding your own backend: subclass FaceDetector in nophi_av/detectors.py, register it in _BACKENDS, and add a ModelInfo entry to FACE_MODELS so it appears in list-models. Nothing in video.py needs to change — it only calls detector.detect(frame, conf).
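
The registration steps might look roughly like the sketch below. The FaceDetector, ModelInfo, _BACKENDS, and FACE_MODELS definitions here are self-contained stand-ins so the example runs on its own; their real shapes in nophi_av/detectors.py (constructor arguments, ModelInfo fields, the box format returned by detect) are assumptions and may differ. FullFrameDetector is a made-up toy backend.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


# Stand-ins for the names in nophi_av/detectors.py (assumed shapes, not the real code).
class FaceDetector(ABC):
    @abstractmethod
    def detect(self, frame, conf):
        """Return a list of (x1, y1, x2, y2) boxes for one frame."""


@dataclass
class ModelInfo:
    name: str
    backend: str
    notes: str


_BACKENDS = {}
FACE_MODELS = {}


# A hypothetical custom backend: treats the whole frame as a single face.
class FullFrameDetector(FaceDetector):
    def detect(self, frame, conf):
        height, width = len(frame), len(frame[0])
        return [(0, 0, width, height)]


# Register the backend and list it in the catalog.
_BACKENDS["full-frame"] = FullFrameDetector
FACE_MODELS["full-frame"] = ModelInfo(
    name="full-frame", backend="full-frame", notes="Masks the entire frame."
)

detector = _BACKENDS["full-frame"]()
boxes = detector.detect([[0] * 4 for _ in range(3)], conf=0.5)
```

The point of the design is that the masking code depends only on detect(frame, conf) returning boxes, so a new backend needs no changes elsewhere.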

Usage

# Audio: bleep spoken PHI, write redacted audio + transcript + report
nophi-av scan interview.mp3 -o ./cleaned

# Audio, mute (instead of beep) PHI, larger Whisper model for accuracy
nophi-av scan interview.wav --redact-mode silence --whisper-model medium -o ./cleaned

# A whole folder; detect only names & dates; report without writing media
nophi-av scan ./recordings -e PERSON,DATE_TIME --dry-run

# Map known participants to stable IDs instead of <PERSON>
nophi-av scan interview.mp3 -m participants.csv -o ./cleaned

# Video: black-box faces + bleep spoken PHI (box is the default)
nophi-av scan visit.mp4 -o ./cleaned   # uses default yolov12s-face (auto-downloaded)

# Video, pixelate faces instead of a solid box
nophi-av scan visit.mp4 --mask-mode pixelate -o ./cleaned

# Video, use the SCRFD detector instead of YOLO (needs the [scrfd] extra)
nophi-av scan visit.mp4 --face-model scrfd -o ./cleaned

# Video, audio scrub + remux only (skip face masking)
nophi-av scan visit.mp4 --mask-mode none -o ./cleaned

# See the available face-detection models
nophi-av list-models

Bare invocation works too: nophi-av interview.mp3 is shorthand for nophi-av scan interview.mp3.

Example

Scanning a ~49s clinical-intake recording with --whisper-model base on CPU:

Entity type    Count   Examples (with timestamps)
-----------    -----   --------------------------
PERSON           2     James Carter (00:17.7), Emily Rivera (00:24.7)
DATE_TIME        1     October 24th, 1999 (00:33.8)
LOCATION/ORG     1     455 University Avenue (00:41.8)

→ cleaned/interview_redacted.mp3 (PHI windows beeped, duration preserved), cleaned/interview_transcript.txt ("My name is Dr. <PERSON> … I live at <ORGANIZATION>, apartment 3B."), and phi_av_report.xlsx.
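
A summary like the table above could be produced from a list of findings with a few lines of Python. This is a sketch, not nophi-av's reporting code: fmt_ts, summarize, and the (entity_type, text, start_seconds) tuple layout are assumptions made for illustration.

```python
from collections import defaultdict


def fmt_ts(seconds):
    """Format seconds as MM:SS.t, rounding to tenths before splitting into fields."""
    tenths = round(seconds * 10)
    minutes, rem = divmod(tenths, 600)
    return f"{minutes:02d}:{rem // 10:02d}.{rem % 10}"


def summarize(findings):
    """Group (entity_type, text, start_seconds) findings into {type: (count, examples)}."""
    grouped = defaultdict(list)
    for etype, text, start in findings:
        grouped[etype].append(f"{text} ({fmt_ts(start)})")
    return {etype: (len(items), ", ".join(items)) for etype, items in grouped.items()}


summary = summarize([
    ("PERSON", "James Carter", 17.7),
    ("PERSON", "Emily Rivera", 24.7),
    ("DATE_TIME", "October 24th, 1999", 33.8),
])
```

Rounding to tenths before splitting into minutes and seconds avoids outputs such as 00:60.0 for values just under a minute boundary.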

Key options

Option	Description
`--redact-mode beep|silence`	How to remove PHI from audio (default `beep`)
`--beep-hz`	Beep tone frequency in Hz (default 1000)
`--mask-mode box|pixelate|blur|none`	How to mask faces in video (default `box`; `none` skips)
`--face-model`	Face detector: a YOLO weight name/path or `scrfd` (see `list-models`)
`--whisper-model`	WhisperX size: `tiny/base/small/medium/large-v3`
`--device auto|cpu|cuda`	Compute device for ASR / face masking
`--hf-token`	Enable speaker diarization (needs a HuggingFace token)
`--padding`	Seconds added around each bleeped interval (default 0.15)
`--entities`, `--mappings`, `--exclude`, `--dry-run`	Same semantics as `nophi`
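
To make the interaction of --padding, --redact-mode, and --beep-hz concrete, here is a pure-Python sketch over a list of float samples. It is an illustration under stated assumptions, not the pydub-based implementation: pad_and_merge, scrub, and the amplitude parameter are hypothetical, and merging overlapping padded intervals is one plausible way to handle neighbouring findings.

```python
import math


def pad_and_merge(intervals, padding=0.15, duration=None):
    """Widen each (start, end) by padding seconds, clamp to the clip, merge overlaps."""
    merged = []
    for start, end in sorted(intervals):
        s = max(0.0, start - padding)
        e = end + padding if duration is None else min(duration, end + padding)
        if merged and s <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], e))
        else:
            merged.append((s, e))
    return merged


def scrub(samples, sample_rate, intervals, mode="beep", beep_hz=1000, amplitude=0.3):
    """Overwrite samples inside each interval with a sine tone or silence; length is unchanged."""
    if mode not in ("beep", "silence"):
        raise ValueError(f"unsupported mode: {mode}")
    out = list(samples)
    for start, end in intervals:
        lo = max(0, int(start * sample_rate))
        hi = min(len(out), int(end * sample_rate))
        for i in range(lo, hi):
            if mode == "beep":
                out[i] = amplitude * math.sin(2 * math.pi * beep_hz * i / sample_rate)
            else:
                out[i] = 0.0
    return out


windows = pad_and_merge([(1.0, 1.5), (1.75, 2.0)], padding=0.25)
muted = scrub([1.0] * 16000, 16000, [(0.25, 0.5)], mode="silence")
beeped = scrub([1.0] * 16000, 16000, [(0.25, 0.5)], mode="beep")
```

Overwriting samples in place, rather than cutting them out, is what keeps the output the same length as the input so that video stays in sync.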

Development

The transcript↔time mapping core (nophi_av/detect.py) is ML-free and unit tested without torch/whisperx:

pytest packages/nophi-av/tests

Release history

0.1.1

Jun 30, 2026

This version

0.1.0 yanked

Jun 29, 2026

Reason this release was yanked:

wrong dependency versions

