Scan, redact, and manage PII in your documents before they get uploaded to a Retrieval Augmented Generation (RAG) system.
Open-source PII Detection for Retrieval Systems.
Overview
DataFog scans files for PII and redacts it before the files are uploaded to a RAG system.
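As a rough illustration of the scan-then-redact idea, here is a minimal sketch using regular expressions from the Python standard library. It does not use the DataFog API and is not DataFog's implementation: the pattern set, the labels, and the function names `scan` and `redact` are all assumptions made for this example, and real PII detection needs far broader coverage than three patterns.

```python
import re

# Hypothetical, illustrative patterns only (not DataFog's detectors).
PII_PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "US_SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "US_PHONE": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}


def scan(text):
    """Return (label, matched_text) pairs for every pattern hit in text."""
    findings = []
    for label, pattern in PII_PATTERNS.items():
        for match in pattern.finditer(text):
            findings.append((label, match.group()))
    return findings


def redact(text):
    """Replace every pattern hit with a bracketed label such as [EMAIL]."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


sample = "Contact jane@example.com or 555-123-4567, SSN 123-45-6789."
print(scan(sample))
print(redact(sample))
```

In a pipeline like the one described above, a step of this kind would run on each document's text before the document is handed to the retrieval system's ingestion code.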
How it works
Installation
DataFog can be installed via pip:
pip install datafog # python client
Dev Notes
- Clone the repo
- Run 'poetry install' to install dependencies (working inside 'poetry shell' is recommended so the installed dependencies stay available)
- Justfile commands:
  - 'just format' to apply formatting.
  - 'just lint' to check formatting and style.
  - 'just tag' to tag your project on git.
  - 'just upload' to publish to PyPI.
Testing
To run the DataFog unit tests, check out this repository and run:
tox
License
This software is published under the MIT license.
File details
Details for the file datafog-2.0.1.tar.gz.
File metadata
- Download URL: datafog-2.0.1.tar.gz
- Upload date:
- Size: 4.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.0.0 CPython/3.10.1
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 1e37b9b1c16863019d5c89068de552b3344fea8a0b1f0977b1b995fba1338e65 |
| MD5 | de4e35b0baf5dc662ef9540678290a51 |
| BLAKE2b-256 | 057b8eaf1dde5664384e083a5ae0500d9899ed7e32c9dbc5dbad7b1a7c668107 |
File details
Details for the file datafog-2.0.1-py3-none-any.whl.
File metadata
- Download URL: datafog-2.0.1-py3-none-any.whl
- Upload date:
- Size: 5.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.0.0 CPython/3.10.1
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 450faa7ceec746c231973bb1e421ed509c2fd339c7cebe012c1fca0112dc44b8 |
| MD5 | 33246ded4cbddda62787ff469b16ea7a |
| BLAKE2b-256 | f6c93e5050e703ab837d046deea3a2d21233b987c9ec760e7c22eed250795b41 |
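The SHA256 digests listed above can be used to check a downloaded file before installing it. The following is a minimal sketch using Python's standard `hashlib` module. The helper names `sha256_of_file` and `matches_published_hash` are hypothetical, and the local file path is an assumption about where the download was saved.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # iter() with a sentinel stops once read() returns empty bytes.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Return True if the file's SHA256 digest equals the published value."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, assuming the source distribution was saved as datafog-2.0.1.tar.gz in the current directory, calling `matches_published_hash("datafog-2.0.1.tar.gz", "1e37b9b1c16863019d5c89068de552b3344fea8a0b1f0977b1b995fba1338e65")` should return True for an intact download.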