Detect and redact PHI/PII from audio/video recordings, built on nophi

nophi-av

Detect and redact PHI/PII in audio/video recordings, built on top of nophi.

nophi-av is an implementation of the MedVidDeID pipeline that uses nophi as the text PHI-detection engine (in place of philter-ucsf). It exposes the same CLI shape as nophi: nophi-av scan … instead of nophi scan ….

Pipeline

For each input recording:

  1. Extract audio — for video, the audio track is demuxed to 16 kHz mono WAV (audio inputs are used directly).
  2. Transcribe — WhisperX produces a transcript with word-level timestamps (Whisper + forced alignment).
  3. Detect PHI — nophi's text engine (build_analyzer / scan_text) runs over the transcript, so every nophi recognizer (names, drugs, addresses, dates, …) applies automatically.
  4. Map spans → time — each detected character span is traced back to the words it covers, yielding a media time interval (see nophi_av/detect.py).
  5. Scrub audio — each PHI interval is replaced with a beep (default) or silence (--redact-mode), preserving duration so video stays in sync.
  6. Mask faces (video) — every face found by the detector is masked per --mask-mode (box solid fill by default, pixelate, blur, or none to skip), then the scrubbed audio is remuxed back onto the masked video. The detector is swappable via --face-model — YOLO by default, SCRFD as an alternative (see Face detection models).
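
Step 4 is the piece that ties text detection back to media time, so a minimal sketch may help. It assumes the transcript is the aligned words joined by single spaces; the names Word, build_transcript, and span_to_interval are hypothetical and are not taken from nophi_av/detect.py.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Word:
    text: str
    start: float  # seconds
    end: float    # seconds


def build_transcript(words: List[Word]) -> Tuple[str, List[Tuple[int, int]]]:
    """Join words with single spaces and record each word's character range."""
    parts: List[str] = []
    offsets: List[Tuple[int, int]] = []
    pos = 0
    for w in words:
        if parts:
            pos += 1  # account for the joining space
        offsets.append((pos, pos + len(w.text)))
        parts.append(w.text)
        pos += len(w.text)
    return " ".join(parts), offsets


def span_to_interval(
    span: Tuple[int, int],
    words: List[Word],
    offsets: List[Tuple[int, int]],
) -> Optional[Tuple[float, float]]:
    """Return (start, end) in seconds covering every word the span overlaps."""
    s, e = span
    hits = [w for w, (ws, we) in zip(words, offsets) if ws < e and we > s]
    if not hits:
        return None
    return hits[0].start, hits[-1].end


# Illustrative data only; the timestamps are made up.
words = [
    Word("My", 0.0, 0.2), Word("name", 0.2, 0.5), Word("is", 0.5, 0.6),
    Word("James", 17.7, 18.1), Word("Carter", 18.1, 18.6),
]
text, offsets = build_transcript(words)
start = text.index("James Carter")
interval = span_to_interval((start, start + len("James Carter")), words, offsets)
```

Here a detected span covering "James Carter" resolves to the start of "James" and the end of "Carter". A span that only partly overlaps a word still pulls in the whole word, which errs on the side of over-redaction.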

Outputs per file: the redacted media (<name>_redacted.<ext>), a redacted transcript (<name>_transcript.txt), and a styled Excel findings report with entity type, replacement, and time interval.

Only the audio/transcript path uses nophi. Face masking is a separate visual step that doesn't touch nophi — it masks every detected face (detection, not recognition), the conservative choice for PHI removal.

Supported formats

  • Audio: .mp3 .wav .m4a .flac .aac .ogg
  • Video: .mp4 .mov .avi .mkv .webm
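
As a rough illustration of how an input might be routed to the audio or video path, the sketch below classifies a file by extension using the lists above. The function name media_kind and the case-insensitive matching are assumptions, not documented nophi-av behaviour.

```python
from pathlib import Path
from typing import Optional

AUDIO_EXTS = {".mp3", ".wav", ".m4a", ".flac", ".aac", ".ogg"}
VIDEO_EXTS = {".mp4", ".mov", ".avi", ".mkv", ".webm"}


def media_kind(path: str) -> Optional[str]:
    """Return "audio", "video", or None for an unsupported extension."""
    ext = Path(path).suffix.lower()  # assumption: extensions match case-insensitively
    if ext in AUDIO_EXTS:
        return "audio"
    if ext in VIDEO_EXTS:
        return "video"
    return None
```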

Install

A single install includes everything — audio scrubbing and video face masking:

pip install -e packages/nophi     # the text engine it depends on
pip install -e packages/nophi-av  # everything: whisperx, torch, pydub, ultralytics, opencv, pyannote

To use the SCRFD face detector (--face-model scrfd), install the optional extra, which adds insightface + onnxruntime:

pip install -e 'packages/nophi-av[scrfd]'

This is a heavy install — torch, ultralytics, and opencv are all pulled in. ffmpeg must also be on your PATH (e.g. brew install ffmpeg).

The first scan downloads model weights (Whisper, the alignment model, nophi's spaCy/scispaCy models, and the YOLO face model). Warm every cache up front with:

nophi-av download-models --whisper-model small

Requirements & caveats

  • ffmpeg must be on PATH (e.g. brew install ffmpeg). WhisperX uses it to decode audio, and pydub uses it to encode .mp3/.m4a output. .wav in/out avoids any external ffmpeg dependency.
  • Face model — see Face detection models below for the available detectors and how --face-model resolves them. Use --mask-mode none to skip face masking entirely.
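
To check the ffmpeg requirement before a long run, something like the following can be used. It also builds (without running) an ffmpeg command that demuxes a video's audio track to 16 kHz mono WAV, as in step 1 of the pipeline. The helper names are hypothetical and the exact command nophi-av issues is not shown in this README; the flags themselves are standard ffmpeg options.

```python
import shutil
from typing import List


def ffmpeg_available() -> bool:
    """True if an ffmpeg executable can be found on PATH."""
    return shutil.which("ffmpeg") is not None


def demux_command(src: str, dst_wav: str) -> List[str]:
    """Build an ffmpeg argv that extracts audio as 16 kHz mono WAV."""
    return [
        "ffmpeg",
        "-y",            # overwrite the output file if it exists
        "-i", src,       # input recording
        "-vn",           # drop the video stream
        "-ac", "1",      # one audio channel (mono)
        "-ar", "16000",  # 16 kHz sample rate
        dst_wav,
    ]
```

The returned list can be passed to subprocess.run(..., check=True) once ffmpeg_available() has returned True.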

Face detection models

Face masking is modular: the pipeline only consumes per-frame bounding boxes, so the detector behind --face-model is swappable. Run nophi-av list-models to see the catalog, or pick from:

--face-model                 Backend               Notes
------------                 -------               -----
yolov12s-face.pt (default)   YOLO (Ultralytics)    Community YOLOv12-s face weight — balanced.
yolov8n-face.pt              YOLO                  Smallest/fastest, lower recall.
yolov11m-face.pt             YOLO                  Larger, higher recall.
scrfd                        SCRFD (insightface)   SCRFD-10G — strong on small/tilted/profile faces.
scrfd-s                      SCRFD                 Lighter SCRFD-500M variant.

How a spec is resolved:

  • YOLO — the default backend. Community face weights (yolov{8,10,11,12}{n,s,m,l}-face.pt) aren't official Ultralytics assets, so nophi-av fetches them from the yolo-face release and caches them in Ultralytics' weights dir on first use. You can also pass a local checkpoint path or an official Ultralytics name (e.g. yolov8n.pt, which detects people, not faces). Requires ultralytics>=8.3 for the YOLOv12 architecture (included).
  • SCRFD — selected by --face-model scrfd (or scrfd-s). insightface downloads and caches the model pack on first use. Needs the optional extra: pip install 'nophi-av[scrfd]'.
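
The resolution rules above can be sketched as a small classifier. This is an illustration under stated assumptions, not the project's resolver: the function name, the returned labels, and the choice to check for a local file before matching the community pattern are all hypothetical.

```python
import os
import re
from typing import Callable

SCRFD_SPECS = {"scrfd", "scrfd-s"}
# Community face weights: yolov{8,10,11,12}{n,s,m,l}-face.pt
COMMUNITY_FACE_RE = re.compile(r"^yolov(8|10|11|12)[nsml]-face\.pt$")


def classify_spec(spec: str, is_file: Callable[[str], bool] = os.path.isfile) -> str:
    """Classify a --face-model value into one of four illustrative labels."""
    if spec in SCRFD_SPECS:
        return "scrfd"            # handled by insightface (needs the [scrfd] extra)
    if is_file(spec):
        return "yolo-local"       # a checkpoint already on disk
    if COMMUNITY_FACE_RE.match(os.path.basename(spec)):
        return "yolo-community"   # fetched from the yolo-face release on first use
    return "yolo-official"        # left to Ultralytics to resolve, e.g. yolov8n.pt
```

The is_file parameter exists only so the function can be exercised without touching the filesystem.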

For PHI de-identification, recall matters more than precision (a missed face is a leak; an over-masked box is harmless), so SCRFD is a good upgrade when small or non-frontal faces are being missed.

Adding your own backend: subclass FaceDetector in nophi_av/detectors.py, register it in _BACKENDS, and add a ModelInfo entry to FACE_MODELS so it appears in list-models. Nothing in video.py needs to change — it only calls detector.detect(frame, conf).

Usage

# Audio: bleep spoken PHI, write redacted audio + transcript + report
nophi-av scan interview.mp3 -o ./cleaned

# Audio, mute (instead of beep) PHI, larger Whisper model for accuracy
nophi-av scan interview.wav --redact-mode silence --whisper-model medium -o ./cleaned

# A whole folder; detect only names & dates; report without writing media
nophi-av scan ./recordings -e PERSON,DATE_TIME --dry-run

# Map known participants to stable IDs instead of <PERSON>
nophi-av scan interview.mp3 -m participants.csv -o ./cleaned

# Video: black-box faces + bleep spoken PHI (box is the default)
nophi-av scan visit.mp4 -o ./cleaned   # uses default yolov12s-face (auto-downloaded)

# Video, pixelate faces instead of a solid box
nophi-av scan visit.mp4 --mask-mode pixelate -o ./cleaned

# Video, use the SCRFD detector instead of YOLO (needs the [scrfd] extra)
nophi-av scan visit.mp4 --face-model scrfd -o ./cleaned

# Video, audio scrub + remux only (skip face masking)
nophi-av scan visit.mp4 --mask-mode none -o ./cleaned

# See the available face-detection models
nophi-av list-models

Bare invocation works too: nophi-av interview.mp3 is shorthand for nophi-av scan interview.mp3.
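
One way such a shorthand can be implemented is to rewrite the argument list before parsing. The sketch below is a guess at the general technique, not nophi-av's actual entry point; the subcommand set lists only the three commands mentioned in this README.

```python
from typing import List

# Subcommands named in this README; the real CLI may define others.
SUBCOMMANDS = {"scan", "download-models", "list-models"}


def normalize_argv(argv: List[str]) -> List[str]:
    """Prepend "scan" when the first argument is neither a subcommand nor an option."""
    if argv and argv[0] not in SUBCOMMANDS and not argv[0].startswith("-"):
        return ["scan", *argv]
    return list(argv)
```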

Example

Scanning a ~49s clinical-intake recording with --whisper-model base on CPU:

Entity type    Count   Examples (with timestamps)
-----------    -----   --------------------------
PERSON           2     James Carter (00:17.7), Emily Rivera (00:24.7)
DATE_TIME        1     October 24th, 1999 (00:33.8)
LOCATION/ORG     1     455 University Avenue (00:41.8)

Outputs: cleaned/interview_redacted.mp3 (PHI windows beeped, duration preserved), cleaned/interview_transcript.txt ("My name is Dr. <PERSON> … I live at <ORGANIZATION>, apartment 3B."), and phi_av_report.xlsx.
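
The redacted transcript can be thought of as the original text with each detected span swapped for a placeholder. The sketch below shows that idea under the assumption that findings are non-overlapping (start, end, entity_type) character spans; the redact function and the mappings dictionary (a stand-in for whatever a participants file provides) are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Finding = Tuple[int, int, str]  # (start, end, entity_type), end exclusive


def redact(
    text: str,
    findings: List[Finding],
    mappings: Optional[Dict[str, str]] = None,
) -> str:
    """Replace each span with <ENTITY_TYPE>, or with a mapped stable ID if one exists."""
    mappings = mappings or {}
    out = text
    # Work right to left so earlier offsets stay valid after each replacement.
    for start, end, label in sorted(findings, reverse=True):
        surface = text[start:end]
        replacement = mappings.get(surface, f"<{label}>")
        out = out[:start] + replacement + out[end:]
    return out


sentence = "My name is Dr. James Carter."
begin = sentence.index("James Carter")
finding = (begin, begin + len("James Carter"), "PERSON")
redacted = redact(sentence, [finding])
```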

Key options

Option                                         Description
------                                         -----------
--redact-mode beep|silence                     How to remove PHI from audio (default beep)
--beep-hz                                      Beep tone frequency (default 1000)
--mask-mode box|pixelate|blur|none             How to mask faces in video (default box; none skips)
--face-model                                   Face detector: a YOLO weight name/path or scrfd (see list-models)
--whisper-model                                WhisperX size: tiny/base/small/medium/large-v3
--device auto|cpu|cuda                         Compute device for ASR / face masking
--padding                                      Seconds added around each bleeped interval (default 0.15)
--entities, --mappings, --exclude, --dry-run   Same semantics as nophi
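
To make --padding, --beep-hz, and --redact-mode more concrete, here is a rough standard-library sketch of how padded intervals might be merged and then overwritten in place so that duration is preserved. It assumes mono float samples in [-1, 1]; the function names, the tone amplitude, and the merging step are illustrative assumptions, not the package's pydub-based implementation.

```python
import math
from typing import List, Tuple

Interval = Tuple[float, float]  # (start, end) in seconds


def pad_and_merge(intervals: List[Interval], padding: float, duration: float) -> List[Interval]:
    """Widen each interval by `padding`, clamp to the clip, and merge overlaps."""
    padded = sorted(
        (max(0.0, s - padding), min(duration, e + padding)) for s, e in intervals
    )
    merged: List[Interval] = []
    for s, e in padded:
        if merged and s <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], e))
        else:
            merged.append((s, e))
    return merged


def scrub(
    samples: List[float],
    rate: int,
    intervals: List[Interval],
    beep_hz: float = 1000.0,
    silence: bool = False,
    amplitude: float = 0.3,  # assumed tone level
) -> List[float]:
    """Overwrite each interval with a sine tone (or zeros); length is unchanged."""
    out = list(samples)
    for s, e in intervals:
        first = max(0, int(round(s * rate)))
        last = min(len(out), int(round(e * rate)))
        for i in range(first, last):
            out[i] = 0.0 if silence else amplitude * math.sin(2 * math.pi * beep_hz * i / rate)
    return out
```

Overwriting samples instead of cutting them is what keeps the audio the same length, so a remuxed video stays in sync.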

Development

The transcript↔time mapping core (nophi_av/detect.py) is ML-free and unit tested without torch/whisperx:

pytest packages/nophi-av/tests
