Detect and redact PHI/PII from audio/video recordings, built on nophi

Reason this release was yanked:

wrong dependency versions

nophi-av

Detect and redact PHI/PII in audio/video recordings, built on top of nophi.

nophi-av is an implementation of the MedVidDeID pipeline with nophi as the text PHI-detection engine (in place of philter-ucsf). It exposes the same CLI shape as nophi: nophi-av scan … instead of nophi scan ….

Pipeline

For each input recording:

  1. Extract audio: for video, the audio track is demuxed to 16 kHz mono WAV (audio inputs are used directly).
  2. Transcribe: WhisperX produces a transcript with word-level timestamps (Whisper + forced alignment). Speaker diarization is optional and is enabled with --hf-token.
  3. Detect PHI: nophi's text engine (build_analyzer / scan_text) runs over the transcript, so every nophi recognizer (names, drugs, addresses, dates, …) applies automatically.
  4. Map spans → time: each detected character span is traced back to the words it covers, yielding a media time interval (see nophi_av/detect.py).
  5. Scrub audio: each PHI interval is replaced with a beep (default) or silence (--redact-mode silence), preserving duration so video stays in sync.
  6. Mask faces (video): every face found by the detector is masked per --mask-mode (box, a solid fill, by default; pixelate; blur; or none to skip), then the scrubbed audio is remuxed onto the masked video. The detector is swappable via --face-model: YOLO by default, SCRFD as an alternative (see Face detection models).
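
Step 4 is the piece that ties text detection back to media time. The sketch below illustrates the idea only and is not the code in nophi_av/detect.py: the Word class, build_transcript, and span_to_interval are hypothetical names, and it assumes the transcript is the words joined by single spaces.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    start: float  # seconds
    end: float    # seconds


def build_transcript(words):
    """Join words with single spaces and record each word's character range."""
    pieces, offsets, pos = [], [], 0
    for w in words:
        offsets.append((pos, pos + len(w.text)))  # [start, end) in the joined text
        pieces.append(w.text)
        pos += len(w.text) + 1  # +1 for the joining space
    return " ".join(pieces), offsets


def span_to_interval(words, offsets, char_start, char_end):
    """Return (start, end) in seconds covering every word the span overlaps."""
    hits = [w for w, (a, b) in zip(words, offsets)
            if a < char_end and char_start < b]
    if not hits:
        return None
    return min(w.start for w in hits), max(w.end for w in hits)


words = [Word("My", 0.0, 0.2), Word("name", 0.2, 0.5), Word("is", 0.5, 0.6),
         Word("James", 0.7, 1.1), Word("Carter", 1.1, 1.6)]
text, offsets = build_transcript(words)
start = text.index("James Carter")
interval = span_to_interval(words, offsets, start, start + len("James Carter"))
print(interval)  # (0.7, 1.6)
```

Using an overlap test rather than exact boundary matches means a span that starts or ends mid-word still covers the whole word, which errs on the side of redacting more.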

Outputs per file: the redacted media (<name>_redacted.<ext>), a redacted transcript (<name>_transcript.txt), and a styled Excel findings report with entity type, replacement, time interval, and speaker.

Only the audio/transcript path uses nophi. Face masking is a separate visual step that doesn't touch nophi — it masks every detected face (detection, not recognition), the conservative choice for PHI removal.
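
To make the box and pixelate modes concrete, here is a minimal pure-Python sketch on a grayscale frame stored as a list of rows. It is an illustration under stated assumptions, not the pipeline's implementation (which presumably works on OpenCV arrays): mask_box and mask_pixelate are hypothetical names, and boxes are assumed to be (x1, y1, x2, y2) with exclusive upper bounds.

```python
def _clip(box, frame):
    """Clamp an (x1, y1, x2, y2) box to the frame bounds."""
    x1, y1, x2, y2 = box
    height, width = len(frame), len(frame[0])
    return max(x1, 0), max(y1, 0), min(x2, width), min(y2, height)


def mask_box(frame, box, fill=0):
    """Solid-fill the boxed region in place."""
    x1, y1, x2, y2 = _clip(box, frame)
    for y in range(y1, y2):
        for x in range(x1, x2):
            frame[y][x] = fill
    return frame


def mask_pixelate(frame, box, block=2):
    """Replace each block x block tile inside the box with its mean value."""
    x1, y1, x2, y2 = _clip(box, frame)
    for ty in range(y1, y2, block):
        for tx in range(x1, x2, block):
            cells = [(y, x)
                     for y in range(ty, min(ty + block, y2))
                     for x in range(tx, min(tx + block, x2))]
            mean = sum(frame[y][x] for y, x in cells) // len(cells)
            for y, x in cells:
                frame[y][x] = mean
    return frame


frame = [[255] * 4 for _ in range(4)]
mask_box(frame, (1, 1, 3, 3))          # black out the 2x2 centre
tile = mask_pixelate([[0, 10], [20, 30]], (0, 0, 2, 2), block=2)
```

A solid box destroys all pixel information inside the region, while pixelation keeps coarse structure; that difference is one reason box is a conservative default.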

Supported formats

  • Audio: .mp3 .wav .m4a .flac .aac .ogg
  • Video: .mp4 .mov .avi .mkv .webm
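
As a rough illustration of how an input could be routed by extension and how the per-file output names described above are formed, consider the sketch below. The helper names media_kind and output_paths are hypothetical, and case-insensitive extension matching is an assumption rather than documented behaviour.

```python
from pathlib import Path

AUDIO_EXTS = {".mp3", ".wav", ".m4a", ".flac", ".aac", ".ogg"}
VIDEO_EXTS = {".mp4", ".mov", ".avi", ".mkv", ".webm"}


def media_kind(path):
    """Return "audio", "video", or None for an unsupported extension."""
    ext = Path(path).suffix.lower()  # assumption: matching ignores case
    if ext in AUDIO_EXTS:
        return "audio"
    if ext in VIDEO_EXTS:
        return "video"
    return None


def output_paths(src, out_dir):
    """Build the <name>_redacted.<ext> and <name>_transcript.txt paths."""
    src, out_dir = Path(src), Path(out_dir)
    return {
        "media": out_dir / f"{src.stem}_redacted{src.suffix}",
        "transcript": out_dir / f"{src.stem}_transcript.txt",
    }


paths = output_paths("interview.mp3", "cleaned")
```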

Install

A single install includes everything — audio scrubbing, video face masking, and speaker diarization:

pip install -e packages/nophi     # the text engine it depends on
pip install -e packages/nophi-av  # everything: whisperx, torch, pydub, ultralytics, opencv, pyannote

To use the SCRFD face detector (--face-model scrfd), install the optional extra, which adds insightface + onnxruntime:

pip install -e 'packages/nophi-av[scrfd]'

This is a heavy install — torch, ultralytics, and opencv are all pulled in. ffmpeg must also be on your PATH (e.g. brew install ffmpeg).

The first scan downloads model weights (Whisper, the alignment model, nophi's spaCy/scispaCy models, and the YOLO face model). Warm every cache up front with:

nophi-av download-models --whisper-model small

Requirements & caveats

  • ffmpeg must be on PATH (e.g. brew install ffmpeg). WhisperX uses it to decode audio, and pydub uses it to encode .mp3/.m4a output. .wav in/out avoids any external ffmpeg dependency.
  • Face model — see Face detection models below for the available detectors and how --face-model resolves them. Use --mask-mode none to skip face masking entirely.
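
If you drive nophi-av from your own scripts, a preflight check for ffmpeg can turn a confusing mid-run failure into an early, readable error. This is a sketch using only the standard library; find_ffmpeg and require_ffmpeg are hypothetical helpers, not part of nophi-av.

```python
import shutil


def find_ffmpeg():
    """Return the path of the ffmpeg executable on PATH, or None if missing."""
    return shutil.which("ffmpeg")


def require_ffmpeg():
    """Raise a readable error before any heavy work starts."""
    path = find_ffmpeg()
    if path is None:
        raise RuntimeError(
            "ffmpeg not found on PATH; install it first (e.g. brew install ffmpeg)"
        )
    return path
```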

Face detection models

Face masking is modular: the pipeline only consumes per-frame bounding boxes, so the detector behind --face-model is swappable. Run nophi-av list-models to see the catalog, or pick from:

--face-model                 Backend               Notes
------------                 -------               -----
yolov12s-face.pt (default)   YOLO (Ultralytics)    Community YOLOv12-s face weight; balanced.
yolov8n-face.pt              YOLO                  Smallest/fastest, lower recall.
yolov11m-face.pt             YOLO                  Larger, higher recall.
scrfd                        SCRFD (insightface)   SCRFD-10G; strong on small/tilted/profile faces.
scrfd-s                      SCRFD                 Lighter SCRFD-500M variant.

How a spec is resolved:

  • YOLO — the default backend. Community face weights (yolov{8,10,11,12}{n,s,m,l}-face.pt) aren't official Ultralytics assets, so nophi-av fetches them from the yolo-face release and caches them in Ultralytics' weights dir on first use. You can also pass a local checkpoint path or an official Ultralytics name (e.g. yolov8n.pt, which detects people, not faces). Requires ultralytics>=8.3 for the YOLOv12 architecture (included).
  • SCRFD — selected by --face-model scrfd (or scrfd-s). insightface downloads and caches the model pack on first use. Needs the optional extra: pip install 'nophi-av[scrfd]'.
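
The resolution rules above can be sketched as a small classifier. This is an assumption-laden illustration: classify_face_model and its return labels are hypothetical, and the order of the checks (SCRFD names, then community weight names, then local paths, then official names) is a guess rather than the documented behaviour.

```python
import re
from pathlib import Path

# Matches the community weight pattern yolov{8,10,11,12}{n,s,m,l}-face.pt
COMMUNITY_FACE = re.compile(r"^yolov(8|10|11|12)[nsml]-face\.pt$")


def classify_face_model(spec):
    """Map a --face-model value to a coarse backend category."""
    if spec in ("scrfd", "scrfd-s"):
        return "scrfd"            # needs the [scrfd] extra
    if COMMUNITY_FACE.match(spec):
        return "yolo-community"   # fetched from the yolo-face release, then cached
    if Path(spec).exists():
        return "yolo-local"       # a checkpoint already on disk
    return "yolo-official"        # handed to Ultralytics by name
```

Note that an official name such as yolov8n.pt falls into the last category and, as stated above, detects people rather than faces.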

For PHI de-identification, recall matters more than precision (a missed face is a leak; an over-masked box is harmless), so SCRFD is a good upgrade when small or non-frontal faces are being missed.

Adding your own backend: subclass FaceDetector in nophi_av/detectors.py, register it in _BACKENDS, and add a ModelInfo entry to FACE_MODELS so it appears in list-models. Nothing in video.py needs to change — it only calls detector.detect(frame, conf).
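
The shape of such a backend might look like the sketch below. Everything here is a stand-in: the real FaceDetector base class, the _BACKENDS registry, and the box format may differ, and FixedBoxDetector is a toy detector invented for illustration. Only the detect(frame, conf) call is taken from the description above.

```python
from abc import ABC, abstractmethod


class FaceDetector(ABC):  # stand-in for the base class in nophi_av/detectors.py
    @abstractmethod
    def detect(self, frame, conf):
        """Return (x1, y1, x2, y2) boxes whose score is at least conf."""


class FixedBoxDetector(FaceDetector):  # hypothetical toy backend
    """Ignores the frame and returns preset boxes, filtered by confidence."""

    def __init__(self, boxes_with_scores):
        self._boxes = boxes_with_scores

    def detect(self, frame, conf):
        return [box for box, score in self._boxes if score >= conf]


# Stand-in registry; the real _BACKENDS may have a different shape.
_BACKENDS = {"fixed": FixedBoxDetector}

detector = _BACKENDS["fixed"]([((0, 0, 10, 10), 0.9), ((5, 5, 8, 8), 0.2)])
boxes = detector.detect(frame=None, conf=0.5)
```

Because the masking code only needs a list of boxes per frame, a backend like this can be exercised in tests without loading any model weights.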

Usage

# Audio: bleep spoken PHI, write redacted audio + transcript + report
nophi-av scan interview.mp3 -o ./cleaned

# Audio, mute (instead of beep) PHI, larger Whisper model for accuracy
nophi-av scan interview.wav --redact-mode silence --whisper-model medium -o ./cleaned

# A whole folder; detect only names & dates; report without writing media
nophi-av scan ./recordings -e PERSON,DATE_TIME --dry-run

# Map known participants to stable IDs instead of <PERSON>
nophi-av scan interview.mp3 -m participants.csv -o ./cleaned

# Video: black-box faces + bleep spoken PHI (box is the default)
nophi-av scan visit.mp4 -o ./cleaned   # uses default yolov12s-face (auto-downloaded)

# Video, pixelate faces instead of a solid box
nophi-av scan visit.mp4 --mask-mode pixelate -o ./cleaned

# Video, use the SCRFD detector instead of YOLO (needs the [scrfd] extra)
nophi-av scan visit.mp4 --face-model scrfd -o ./cleaned

# Video, audio scrub + remux only (skip face masking)
nophi-av scan visit.mp4 --mask-mode none -o ./cleaned

# See the available face-detection models
nophi-av list-models

Bare invocation works too: nophi-av interview.mp3 is shorthand for nophi-av scan interview.mp3.
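
One way such a shorthand can be implemented is to rewrite the argument list before parsing. The sketch below is a guess at the technique, not nophi-av's actual code; normalize_argv is a hypothetical name, and the subcommand set is taken from the commands shown in this README.

```python
SUBCOMMANDS = {"scan", "download-models", "list-models"}


def normalize_argv(argv):
    """Prepend "scan" when the first argument is neither a subcommand nor a flag."""
    if argv and argv[0] not in SUBCOMMANDS and not argv[0].startswith("-"):
        return ["scan", *argv]
    return list(argv)
```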

Example

Scanning a ~49s clinical-intake recording with --whisper-model base on CPU:

Entity type    Count   Examples (with timestamps)
-----------    -----   --------------------------
PERSON           2     James Carter (00:17.7), Emily Rivera (00:24.7)
DATE_TIME        1     October 24th, 1999 (00:33.8)
LOCATION/ORG     1     455 University Avenue (00:41.8)

The outputs are cleaned/interview_redacted.mp3 (PHI windows beeped, duration preserved), cleaned/interview_transcript.txt ("My name is Dr. <PERSON> … I live at <ORGANIZATION>, apartment 3B."), and phi_av_report.xlsx.

Key options

Option                                          Description
------                                          -----------
--redact-mode beep|silence                      How to remove PHI from audio (default beep)
--beep-hz                                       Beep tone frequency in Hz (default 1000)
--mask-mode box|pixelate|blur|none              How to mask faces in video (default box; none skips)
--face-model                                    Face detector: a YOLO weight name/path or scrfd (see list-models)
--whisper-model                                 WhisperX size: tiny/base/small/medium/large-v3
--device auto|cpu|cuda                          Compute device for ASR / face masking
--hf-token                                      Enable speaker diarization (needs a HuggingFace token)
--padding                                       Seconds added around each bleeped interval (default 0.15)
--entities, --mappings, --exclude, --dry-run    Same semantics as nophi
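
To show how --padding, --beep-hz, and --redact-mode could interact, here is a minimal sketch on raw float samples. It is not the pydub-based implementation: pad_and_merge and scrub are hypothetical names, padding is assumed to apply on both sides of each interval, and the beep amplitude of 0.3 is an arbitrary choice.

```python
import math


def pad_and_merge(intervals, padding, duration):
    """Widen each (start, end) by padding seconds, clamp, and merge overlaps."""
    padded = sorted((max(0.0, s - padding), min(duration, e + padding))
                    for s, e in intervals)
    merged = []
    for s, e in padded:
        if merged and s <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], e))
        else:
            merged.append((s, e))
    return merged


def scrub(samples, rate, intervals, mode="beep", beep_hz=1000, amplitude=0.3):
    """Overwrite each interval with a sine tone or silence; length is unchanged."""
    out = list(samples)
    for s, e in intervals:
        lo, hi = int(s * rate), min(int(e * rate), len(out))
        for i in range(lo, hi):
            if mode == "beep":
                out[i] = amplitude * math.sin(2 * math.pi * beep_hz * i / rate)
            else:
                out[i] = 0.0
    return out


rate = 16000
audio = [1.0] * rate  # one second of placeholder samples
windows = pad_and_merge([(1.0, 1.5), (1.6, 2.0)], padding=0.15, duration=10.0)
silenced = scrub(audio, rate, [(0.25, 0.5)], mode="silence")
beeped = scrub(audio, rate, [(0.25, 0.5)], mode="beep")
```

Overwriting samples in place, rather than cutting them out, is what keeps the audio the same length so that video stays in sync after remuxing.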

Development

The transcript↔time mapping core (nophi_av/detect.py) is ML-free and unit tested without torch/whisperx:

pytest packages/nophi-av/tests
