A library for scanning Personally Identifiable Information (PII).
Project description
PII Scanner
A Python library for text processing using spaCy and custom regex pattern matching. It can process a variety of text data formats, including lists, plain text, PDFs, JSON, CSV, and XLSX files.
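To give a rough idea of what regex-based PII matching looks like, here is a minimal sketch. It is not the library's implementation: the `PATTERNS` table and the `find_pii` helper are hypothetical names invented for this illustration, and the two patterns are deliberately simplistic.

```python
import re

# Hypothetical patterns for illustration only; these are NOT the patterns
# shipped with pii_scanner.
PATTERNS = {
    "PHONE_NUMBER": re.compile(r"\+91\d{10}"),
    "EMAIL_ADDRESS": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
}


def find_pii(text):
    """Return one dict per regex match, with its type and character span."""
    matches = []
    for entity_type, pattern in PATTERNS.items():
        for m in pattern.finditer(text):
            matches.append(
                {"type": entity_type, "start": m.start(), "end": m.end()}
            )
    return matches


print(find_pii("+919140562125"))
# [{'type': 'PHONE_NUMBER', 'start': 0, 'end': 13}]
```

Names and nationalities cannot be matched reliably this way, which is presumably where the spaCy side of the library comes in.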
Installation
pip install pii_scanner
Usage
import asyncio
import time

from pii_scanner.scanner import PIIScanner
from pii_scanner.constants.patterns_countries import Regions


async def run_scan():
    # Start the timer
    start_time = time.time()
    pii_scanner = PIIScanner()

    # Scan an in-memory list of strings
    data = ['Ankit Gupta', '+919140562125', 'Indian']
    results_list_data = await pii_scanner.scan(data=data, sample_size=0.005, region=Regions.IN)

    # To scan a file instead, pass file_path:
    # file_path = 'dummy-pii/test.json'
    # file_path = 'dummy-pii/test.xlsx'
    # results_file_data = await pii_scanner.scan(file_path=file_path, sample_size=0.005, region=Regions.IN)

    print("Results:", results_list_data)
    print(f"Elapsed: {time.time() - start_time:.2f} s")


# Run the asynchronous scan
asyncio.run(run_scan())
Output
[
    {
        "text": "Ankit Gupta",
        "entity_detected": [
            {"type": "PERSON", "start": 0, "end": 11, "score": 0.85}
        ]
    },
    {
        "text": "+919140562125",
        "entity_detected": [
            {"type": "PHONE_NUMBER", "start": 0, "end": 13, "score": 0.85}
        ]
    },
    {
        "text": "Indian",
        "entity_detected": [
            {"type": "NATIONALITY", "start": 0, "end": 6, "score": 0.9}
        ]
    }
]
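Once you have results in this shape, a little post-processing is often useful. The sketch below assumes the scan result is a Python list of dicts structured like the output above. The helper names `count_entity_types` and `texts_above` are made up for this example and are not part of the library.

```python
from collections import Counter

# Sample data copied from the output format shown above.
results = [
    {"text": "Ankit Gupta",
     "entity_detected": [{"type": "PERSON", "start": 0, "end": 11, "score": 0.85}]},
    {"text": "+919140562125",
     "entity_detected": [{"type": "PHONE_NUMBER", "start": 0, "end": 13, "score": 0.85}]},
    {"text": "Indian",
     "entity_detected": [{"type": "NATIONALITY", "start": 0, "end": 6, "score": 0.9}]},
]


def count_entity_types(results):
    """Count how many times each entity type was detected."""
    counts = Counter()
    for item in results:
        for entity in item.get("entity_detected", []):
            counts[entity["type"]] += 1
    return counts


def texts_above(results, threshold):
    """Return the texts with at least one entity scoring >= threshold."""
    return [
        item["text"]
        for item in results
        if any(e["score"] >= threshold for e in item.get("entity_detected", []))
    ]


print(count_entity_types(results))
print(texts_above(results, 0.9))  # ['Indian']
```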
Download files
Source Distribution
pii_scanner-0.1.22.tar.gz (162.3 kB)
Built Distribution
pii_Scanner-0.1.22-py3-none-any.whl (173.2 kB)
File details
Details for the file pii_scanner-0.1.22.tar.gz.
File metadata
- Download URL: pii_scanner-0.1.22.tar.gz
- Upload date:
- Size: 162.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.0.1 CPython/3.10.16
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 113e569b023b4f2e3131aa17978433b180a533eba853e9fdddafa10d91f05800 |
| MD5 | 288928ca5183918527686cf2206da1ad |
| BLAKE2b-256 | 3acd0ed1bcd54483940057e28e0c696aa2e6f8012225902a0ceded45077ab4d2 |
File details
Details for the file pii_Scanner-0.1.22-py3-none-any.whl.
File metadata
- Download URL: pii_Scanner-0.1.22-py3-none-any.whl
- Upload date:
- Size: 173.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.0.1 CPython/3.10.16
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 7d2f441e9b4c1c0eb0d083cf04e539db5bd7d09255eb2d68e5ea0c83dc1258b9 |
| MD5 | 73e76f4544dc2a25fdc82b7c60b3bea6 |
| BLAKE2b-256 | e303b0bf4dc3f66b9f59f1d63a86940fdbefa89671f80b9f5c328cab764e7d60 |
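If you download either file by hand, you can check it against the SHA256 digests listed above. The following is a minimal sketch using only the standard library. The function names `sha256_of` and `verify_download` are illustrative, and the script assumes the files were saved in the current directory under their published names.

```python
import hashlib
import os

# SHA256 digests copied from the "File hashes" tables above.
EXPECTED_SHA256 = {
    "pii_scanner-0.1.22.tar.gz":
        "113e569b023b4f2e3131aa17978433b180a533eba853e9fdddafa10d91f05800",
    "pii_Scanner-0.1.22-py3-none-any.whl":
        "7d2f441e9b4c1c0eb0d083cf04e539db5bd7d09255eb2d68e5ea0c83dc1258b9",
}


def sha256_of(path, chunk_size=65536):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_hex):
    """Return True if the file's SHA256 digest matches the expected value."""
    return sha256_of(path) == expected_hex.lower()


if __name__ == "__main__":
    for name, expected in EXPECTED_SHA256.items():
        if os.path.exists(name):
            status = "OK" if verify_download(name, expected) else "MISMATCH"
        else:
            status = "not found"
        print(f"{name}: {status}")
```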