Local web review interface for nophi / nophi-av PHI redaction
nophi-ui
A local web review interface for the nophi (documents) and
nophi-av (audio/video) PII/PHI redaction engines.
It lets you run detection locally from the browser, remove false-positive detections (and, for audio, add a missed segment by hand) before redaction is applied, and view the redacted result.
Run
nophi-ui # opens http://127.0.0.1:8000
nophi-ui --port 9000 --no-open
You select a server-side input directory and output directory (raw paths), preview the files that will be processed, then start detection.
Prerequisites
- Python 3.10+ (3.12 recommended). 3.12 is what the app is tested against; very new releases (e.g. 3.14) may not work.
- `pip install nophi-ui` pulls in everything else automatically: the document and audio/video engines (`nophi`, `nophi-av`), FastAPI, and the ML stack. No separate installs are needed and no API keys are required.
- Models are downloaded on first use and are cached.
To fetch them ahead of time instead of on the first run:
nophi download-models # document NLP models
nophi-av download-models # audio/video models
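The version guidance above can be checked before installing. This is a minimal sketch, not part of nophi-ui: `python_status` is a hypothetical helper, and treating every release newer than 3.12 as "untested" is an assumption drawn from the note that 3.12 is the tested version.

```python
import sys

MIN_VERSION = (3, 10)     # lowest version listed under Prerequisites
TESTED_VERSION = (3, 12)  # version the app is stated to be tested against


def python_status(version=None):
    """Classify a (major, minor) version against the Prerequisites guidance."""
    major_minor = tuple((version or sys.version_info)[:2])
    if major_minor < MIN_VERSION:
        return "unsupported"
    if major_minor > TESTED_VERSION:
        return "untested"  # assumption: newer than the tested release
    return "ok"


if __name__ == "__main__":
    print(python_status())
```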
Usage
The interface is a single page with two phases.
1. Setup. Type or Browse… to an input and output folder, pick the options below, then Preview files to confirm what will be processed and Start detection to run:
- Audio redaction: how PHI is removed from audio, either `beep` (overlay a tone) or `silence` (mute the span).
- Whisper model: the speech-to-text model used to transcribe audio/video:
  - `tiny`: fastest, least accurate
  - `base`: middle ground
  - `small` (default): most accurate of the three, slowest
Documents don't use Whisper — their detection runs automatically with spaCy + biomedical NER, nothing to choose.
2. Review & apply. When detection finishes, open each file to see its detections, remove any false positives (and, for audio, add missed segments), then Apply to write the redacted result. Apply is re-runnable — toggle detections and re-apply until you're satisfied; nothing is final until you stop the server. See What it does below for the per-format specifics. Redacted files and an Excel report are written to your output folder.
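To illustrate why Apply can be re-run safely, the sketch below always redacts from the untouched original text using whichever detections are still checked. It is an assumption-laden illustration, not nophi's implementation: the `Detection` class, the `apply_redactions` function, and the `<ENTITY>` placeholder style for plain text are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    start: int            # character offset where the span begins
    end: int              # character offset just past the span
    entity: str           # entity type, e.g. "PERSON"
    enabled: bool = True  # False means "unchecked as a false positive"


def apply_redactions(original: str, detections: list[Detection]) -> str:
    """Redact the enabled spans; the original string is never modified."""
    out = []
    cursor = 0
    for det in sorted(detections, key=lambda d: d.start):
        if not det.enabled or det.start < cursor:
            continue  # skip unchecked detections and overlapping spans
        out.append(original[cursor:det.start])
        out.append(f"<{det.entity}>")  # hypothetical placeholder format
        cursor = det.end
    out.append(original[cursor:])
    return "".join(out)


text = "Jane Doe visited Acme LLC."
dets = [Detection(0, 8, "PERSON"), Detection(17, 25, "ORGANIZATION")]
first = apply_redactions(text, dets)   # both detections applied
dets[1].enabled = False                # reviewer unchecks a false positive
second = apply_redactions(text, dets)  # re-apply from the original
```

Because each pass starts from `text` rather than from the previous output, unchecking a detection restores the original words on the next apply.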
What it does
- Documents (`.txt .csv .docx .xlsx .pdf`): detect → review the detection list → uncheck false positives → apply. PDF previews inline; docx/xlsx are download-only.
- Audio: detect → review (play the original clip per detection) → uncheck false positives and/or add missed `start`/`end` segments → apply (re-scrubs from the original; no re-transcription).
- Video: view-only. Redacted in one shot; detections shown for reference. Video redaction is currently still in development.
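The audio step (re-scrubbing `start`/`end` segments from the original) can be pictured with a small sketch over raw samples. This is not nophi-av's code: `scrub`, its parameters, and the 1 kHz tone are hypothetical, and the beep here replaces the samples rather than mixing with them, which is an assumption about what "overlay" means.

```python
import math


def scrub(samples, sample_rate, segments, mode="beep", tone_hz=1000.0):
    """Return a copy of samples with each (start_s, end_s) segment redacted."""
    if mode not in ("beep", "silence"):
        raise ValueError(f"unknown mode: {mode}")
    out = list(samples)  # copy, so the original stays available for re-scrubbing
    for start_s, end_s in segments:
        first = max(0, int(start_s * sample_rate))
        last = min(len(out), int(end_s * sample_rate))
        for i in range(first, last):
            if mode == "silence":
                out[i] = 0.0  # mute the span
            else:
                # replace the span with a fixed-frequency sine tone
                out[i] = 0.5 * math.sin(2 * math.pi * tone_hz * i / sample_rate)
    return out


original = [0.25] * 8000  # one second of placeholder audio at 8 kHz
muted = scrub(original, 8000, [(0.25, 0.5)], mode="silence")
```

Adding a missed segment by hand amounts to appending another `(start_s, end_s)` pair and calling `scrub` on the original again; no transcription step is involved.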
PDF redaction-box labels
In redacted PDFs, each box is stamped with a short code instead of the full
entity name (full names like <ORGANIZATION> don't fit short spans such as
"LLC"), so every entity type is reduced to a 2-letter code rendered as <XX>:
| Code | Entity type | Code | Entity type |
|---|---|---|---|
| PR | PERSON | SS | US_SSN |
| OR | ORGANIZATION | BK | US_BANK_NUMBER |
| LO | LOCATION | DL | US_DRIVER_LICENSE |
| DT | DATE_TIME | PP | US_PASSPORT |
| PH | PHONE_NUMBER | IT | US_ITIN |
| EM | EMAIL_ADDRESS | ML | MEDICAL_LICENSE |
| CC | CREDIT_CARD | IB | IBAN_CODE |
| IP | IP_ADDRESS | NR | NRP |
| UR | URL | | |
The review table in the output report always shows the full entity type; the abbreviations appear only inside the PDF boxes.
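For scripts that post-process redacted PDFs, the table above can be expressed as a lookup. The mapping values come straight from the table; the names `PDF_CODES` and `pdf_label` are hypothetical and are not exported by nophi as far as this README states.

```python
# Entity type -> 2-letter code, copied from the table above.
PDF_CODES = {
    "PERSON": "PR",
    "ORGANIZATION": "OR",
    "LOCATION": "LO",
    "DATE_TIME": "DT",
    "PHONE_NUMBER": "PH",
    "EMAIL_ADDRESS": "EM",
    "CREDIT_CARD": "CC",
    "IP_ADDRESS": "IP",
    "URL": "UR",
    "US_SSN": "SS",
    "US_BANK_NUMBER": "BK",
    "US_DRIVER_LICENSE": "DL",
    "US_PASSPORT": "PP",
    "US_ITIN": "IT",
    "MEDICAL_LICENSE": "ML",
    "IBAN_CODE": "IB",
    "NRP": "NR",
}


def pdf_label(entity_type: str) -> str:
    """Render the short box label for an entity type, e.g. '<OR>'."""
    # An unlisted entity type raises KeyError; no fallback is assumed here.
    return f"<{PDF_CODES[entity_type]}>"


print(pdf_label("ORGANIZATION"))
```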
Security
This tool serves PII/PHI, so by design it:
- binds locally to `127.0.0.1` only (refuses other hosts),
- locks CORS to its own origin,
- requires a per-launch token on every API call,
- serves files by opaque job/file id (never a client-supplied path),
- marks PHI responses `Cache-Control: no-store` and serves only the clipped segment for audio review.
State is held in memory for the process lifetime; closing the server clears it (a restart means re-running detection).
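A few of these guarantees can be illustrated with the standard library alone. The sketch below is not nophi-ui's code: the function names, the token length, and the way the token is compared are assumptions chosen to show the general pattern of a loopback-only bind check, a per-launch token, and a no-store header.

```python
import hmac
import secrets

ALLOWED_HOST = "127.0.0.1"                    # loopback only
PHI_HEADERS = {"Cache-Control": "no-store"}   # header attached to PHI responses


def new_launch_token() -> str:
    """Generate a fresh random token once per server launch."""
    return secrets.token_urlsafe(32)


def check_host(host: str) -> str:
    """Refuse to bind to anything other than the loopback address."""
    if host != ALLOWED_HOST:
        raise ValueError(f"refusing to bind to {host!r}; only {ALLOWED_HOST} is allowed")
    return host


def is_authorized(presented, expected: str) -> bool:
    """Compare the presented token in constant time; a missing token fails."""
    if presented is None:
        return False
    return hmac.compare_digest(presented.encode(), expected.encode())


token = new_launch_token()
host = check_host("127.0.0.1")
```

Because the token is regenerated on every launch and held only in memory, it stops being valid as soon as the server process exits.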