Detect and redact PHI/PII from audio/video recordings, built on nophi
nophi-av
Detect and redact PHI/PII in audio/video recordings, built on top of
nophi.
nophi-av is an implementation of the MedVidDeID
pipeline with nophi as the text PHI-detection engine (in place of philter-ucsf).
It exposes the same CLI shape as nophi — nophi-av scan … instead of
nophi scan ….
Pipeline
For each input recording:
- Extract audio: for video, the audio track is demuxed to 16 kHz mono WAV (audio inputs are used directly).
- Transcribe: WhisperX produces a transcript with word-level timestamps (Whisper + forced alignment).
- Detect PHI: nophi's text engine (build_analyzer/scan_text) runs over the transcript, so every nophi recognizer (names, drugs, addresses, dates, …) applies automatically.
- Map spans → time: each detected character span is traced back to the words it covers, yielding a media time interval (see nophi_av/detect.py).
- Scrub audio: each PHI interval is replaced with a beep (default) or silence (--redact-mode), preserving duration so video stays in sync.
- Mask faces (video): every face found by the detector is masked per --mask-mode (box, a solid fill, by default; pixelate; blur; or none to skip), then the scrubbed audio is remuxed back onto the masked video. The detector is swappable via --face-model: YOLO by default, SCRFD as an alternative (see Face detection models).
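The "Map spans → time" step can be illustrated with a small, ML-free sketch. This is not the code in nophi_av/detect.py; the Word type, build_transcript, and span_to_interval are hypothetical names, and it assumes the transcript is the aligned words joined by single spaces.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """One aligned word: text plus start/end time in seconds (hypothetical type)."""
    text: str
    start: float
    end: float


def build_transcript(words):
    """Join words with single spaces and record each word's (char_start, char_end)."""
    offsets = []
    pos = 0
    for i, w in enumerate(words):
        if i > 0:
            pos += 1  # account for the joining space
        offsets.append((pos, pos + len(w.text)))
        pos += len(w.text)
    return " ".join(w.text for w in words), offsets


def span_to_interval(span_start, span_end, words, offsets):
    """Return (start_s, end_s) covering every word that overlaps [span_start, span_end)."""
    hit = [w for w, (a, b) in zip(words, offsets) if a < span_end and b > span_start]
    if not hit:
        return None  # the span does not touch any word
    return hit[0].start, hit[-1].end
```

Because a span is widened to whole words, a detection that starts mid-word still redacts the full spoken word, which errs on the side of over-redaction.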
Outputs per file: the redacted media (<name>_redacted.<ext>), a redacted
transcript (<name>_transcript.txt), and a styled Excel findings report with
entity type, replacement, and time interval.
Only the audio/transcript path uses nophi. Face masking is a separate visual step that doesn't touch nophi; it masks every detected face (detection, not recognition), the conservative choice for PHI removal.
Supported formats
- Audio: .mp3 .wav .m4a .flac .aac .ogg
- Video: .mp4 .mov .avi .mkv .webm
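As a rough illustration of how an input could be routed to the audio-only or video path, the sketch below classifies a file by its extension. The function name media_kind and the case-insensitive matching are assumptions, not nophi-av's actual behaviour.

```python
from pathlib import Path

# Extension sets copied from the list above.
AUDIO_EXTS = {".mp3", ".wav", ".m4a", ".flac", ".aac", ".ogg"}
VIDEO_EXTS = {".mp4", ".mov", ".avi", ".mkv", ".webm"}


def media_kind(path):
    """Return "audio", "video", or None for an unsupported extension."""
    ext = Path(path).suffix.lower()  # assumption: extensions match case-insensitively
    if ext in AUDIO_EXTS:
        return "audio"
    if ext in VIDEO_EXTS:
        return "video"
    return None
```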
Install
A single install includes everything — audio scrubbing and video face masking:
pip install -e packages/nophi # the text engine it depends on
pip install -e packages/nophi-av # everything: whisperx, torch, pydub, ultralytics, opencv, pyannote
To use the SCRFD face detector (--face-model scrfd), install the optional
extra, which adds insightface + onnxruntime:
pip install -e 'packages/nophi-av[scrfd]'
This is a heavy install: torch, ultralytics, and opencv are all pulled in. ffmpeg must also be on your PATH (e.g. brew install ffmpeg).
The first scan downloads model weights (Whisper, the alignment model, nophi's spaCy/scispaCy models, and the YOLO face model). Warm every cache up front with:
nophi-av download-models --whisper-model small
Requirements & caveats
- ffmpeg must be on PATH (e.g. brew install ffmpeg). WhisperX uses it to decode audio, and pydub uses it to encode .mp3/.m4a output. .wav in/out avoids any external ffmpeg dependency.
- Face model: see Face detection models below for the available detectors and how --face-model resolves them. Use --mask-mode none to skip face masking entirely.
Face detection models
Face masking is modular: the pipeline only consumes per-frame bounding boxes,
so the detector behind --face-model is swappable. Run nophi-av list-models
to see the catalog, or pick from:
| --face-model | Backend | Notes |
|---|---|---|
| yolov12s-face.pt (default) | YOLO (Ultralytics) | Community YOLOv12-s face weight, balanced. |
| yolov8n-face.pt | YOLO | Smallest/fastest, lower recall. |
| yolov11m-face.pt | YOLO | Larger, higher recall. |
| scrfd | SCRFD (insightface) | SCRFD-10G, strong on small/tilted/profile faces. |
| scrfd-s | SCRFD | Lighter SCRFD-500M variant. |
How a spec is resolved:
- YOLO: the default backend. Community face weights (yolov{8,10,11,12}{n,s,m,l}-face.pt) aren't official Ultralytics assets, so nophi-av fetches them from the yolo-face release and caches them in Ultralytics' weights dir on first use. You can also pass a local checkpoint path or an official Ultralytics name (e.g. yolov8n.pt, which detects people, not faces). Requires ultralytics>=8.3 for the YOLOv12 architecture (included).
- SCRFD: selected by --face-model scrfd (or scrfd-s). insightface downloads and caches the model pack on first use. Needs the optional extra: pip install 'nophi-av[scrfd]'.
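The resolution rules above can be summarised as a small dispatch function. This is a sketch of the described behaviour only: resolve_face_model and the returned labels are hypothetical, and the real resolver may order or name things differently.

```python
import re

# Matches the community face-weight pattern yolov{8,10,11,12}{n,s,m,l}-face.pt.
_COMMUNITY_FACE = re.compile(r"^yolov(8|10|11|12)[nsml]-face\.pt$")


def resolve_face_model(spec="yolov12s-face.pt"):
    """Map a --face-model value to a (backend, source) pair (labels are illustrative)."""
    if spec in ("scrfd", "scrfd-s"):
        # Needs the optional [scrfd] extra; insightface fetches the model pack.
        return ("scrfd", "insightface model pack")
    if _COMMUNITY_FACE.match(spec):
        # Not an official Ultralytics asset, so it is fetched and cached separately.
        return ("yolo", "community face weight")
    # Anything else is treated as a local checkpoint path or an official Ultralytics name.
    return ("yolo", "local path or official name")
```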
For PHI de-identification, recall matters more than precision (a missed face is a leak; an over-masked box is harmless), so SCRFD is a good upgrade when small or non-frontal faces are being missed.
Adding your own backend: subclass FaceDetector in
nophi_av/detectors.py, register it in _BACKENDS, and
add a ModelInfo entry to FACE_MODELS so it appears in list-models. Nothing
in video.py needs to change — it only calls detector.detect(frame, conf).
Usage
# Audio: bleep spoken PHI, write redacted audio + transcript + report
nophi-av scan interview.mp3 -o ./cleaned
# Audio, mute (instead of beep) PHI, larger Whisper model for accuracy
nophi-av scan interview.wav --redact-mode silence --whisper-model medium -o ./cleaned
# A whole folder; detect only names & dates; report without writing media
nophi-av scan ./recordings -e PERSON,DATE_TIME --dry-run
# Map known participants to stable IDs instead of <PERSON>
nophi-av scan interview.mp3 -m participants.csv -o ./cleaned
# Video: black-box faces + bleep spoken PHI (box is the default)
nophi-av scan visit.mp4 -o ./cleaned # uses default yolov12s-face (auto-downloaded)
# Video, pixelate faces instead of a solid box
nophi-av scan visit.mp4 --mask-mode pixelate -o ./cleaned
# Video, use the SCRFD detector instead of YOLO (needs the [scrfd] extra)
nophi-av scan visit.mp4 --face-model scrfd -o ./cleaned
# Video, audio scrub + remux only (skip face masking)
nophi-av scan visit.mp4 --mask-mode none -o ./cleaned
# See the available face-detection models
nophi-av list-models
Bare invocation works too: nophi-av interview.mp3 is shorthand for
nophi-av scan interview.mp3.
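One common way to support such a shorthand is to rewrite the argument list before parsing. The sketch below assumes that approach; normalize_argv is a hypothetical helper, and the subcommand set is taken from the commands mentioned in this README.

```python
# Subcommands mentioned in this README; the real CLI may define others.
SUBCOMMANDS = {"scan", "download-models", "list-models"}


def normalize_argv(argv):
    """Insert "scan" when the first argument is neither a subcommand nor an option."""
    if argv and argv[0] not in SUBCOMMANDS and not argv[0].startswith("-"):
        return ["scan", *argv]
    return list(argv)
```

Leaving option-like first arguments untouched keeps calls such as --help working without an implied subcommand.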
Example
Scanning a ~49s clinical-intake recording with --whisper-model base on CPU:
| Entity type | Count | Examples (with timestamps) |
|---|---|---|
| PERSON | 2 | James Carter (00:17.7), Emily Rivera (00:24.7) |
| DATE_TIME | 1 | October 24th, 1999 (00:33.8) |
| LOCATION/ORG | 1 | 455 University Avenue (00:41.8) |
→ cleaned/interview_redacted.mp3 (PHI windows beeped, duration preserved),
cleaned/interview_transcript.txt ("My name is Dr. <PERSON> … I live at
<ORGANIZATION>, apartment 3B."), and phi_av_report.xlsx.
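The "duration preserved" property comes from overwriting samples in place rather than cutting them out. A minimal sketch of that idea on a plain list of float samples follows; nophi-av itself works through pydub, and the scrub function, its parameters, and the amplitude value are assumptions made for illustration.

```python
import math


def scrub(samples, sample_rate, intervals, mode="beep", beep_hz=1000.0, amplitude=0.3):
    """Return a copy of samples with each (start_s, end_s) interval overwritten.

    The output has the same length as the input, so video stays in sync.
    """
    out = list(samples)
    for start_s, end_s in intervals:
        lo = max(0, int(start_s * sample_rate))
        hi = min(len(out), int(end_s * sample_rate))
        for i in range(lo, hi):
            if mode == "silence":
                out[i] = 0.0
            else:
                # Sine tone at beep_hz, phase-locked to the sample index.
                out[i] = amplitude * math.sin(2 * math.pi * beep_hz * i / sample_rate)
    return out
```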
Key options
| Option | Description |
|---|---|
| --redact-mode beep\|silence | How to remove PHI from audio (default beep) |
| --beep-hz | Beep tone frequency in Hz (default 1000) |
| --mask-mode box\|pixelate\|blur\|none | How to mask faces in video (default box; none skips) |
| --face-model | Face detector: a YOLO weight name/path or scrfd (see list-models) |
| --whisper-model | WhisperX size: tiny/base/small/medium/large-v3 |
| --device auto\|cpu\|cuda | Compute device for ASR / face masking |
| --padding | Seconds added around each bleeped interval (default 0.15) |
| --entities, --mappings, --exclude, --dry-run | Same semantics as nophi |
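To show what --padding does to the redaction windows, the sketch below widens each interval on both sides and clamps it to the recording. Merging windows that overlap after padding is an assumption about sensible behaviour, not something this README states, and pad_and_merge is a hypothetical name.

```python
def pad_and_merge(intervals, padding=0.15, duration=None):
    """Widen each (start_s, end_s) by padding seconds, clamp, and merge overlaps."""
    merged = []
    for start, end in sorted(intervals):
        start = max(0.0, start - padding)
        end = end + padding
        if duration is not None:
            end = min(duration, end)
        if merged and start <= merged[-1][1]:
            # Overlaps or touches the previous window: extend it instead of adding a new one.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged
```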
Development
The transcript↔time mapping core (nophi_av/detect.py) is ML-free and unit
tested without torch/whisperx:
pytest packages/nophi-av/tests