A library for scanning Personally Identifiable Information (PII).

Project description

PII Scanner

A Python library designed for text processing using spaCy and custom regex pattern matching. It can process a variety of input formats, including lists, plain text, PDFs, JSON, CSV, and XLSX files.
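
To give a rough idea of what regex-based detection looks like, here is a minimal sketch. It is not the library's implementation: the pattern `PHONE_IN` and the helper `find_phone_numbers` are hypothetical names made up for this illustration, and the pattern only covers numbers written as `+91` followed by ten digits.

```python
import re

# Hypothetical pattern for illustration only; not taken from pii_scanner.
PHONE_IN = re.compile(r"\+91\d{10}")


def find_phone_numbers(text):
    """Return the character span of each +91 phone number found in text."""
    return [
        {"type": "PHONE_NUMBER", "start": m.start(), "end": m.end()}
        for m in PHONE_IN.finditer(text)
    ]


print(find_phone_numbers("Call +919140562125 now"))
# [{'type': 'PHONE_NUMBER', 'start': 5, 'end': 18}]
```

A real scanner would typically combine many such patterns per region with a named-entity model for things like person names, which plain regexes cannot capture reliably.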

Installation

pip install pii_scanner

Usage

import asyncio
import time
from pii_scanner.scanner import PIIScanner
from pii_scanner.constants.patterns_countries import Regions

async def run_scan():
    # Start the timer
    start_time = time.time()

    pii_scanner = PIIScanner()
    # file_path = 'dummy-pii/test.json' 
    file_path = 'dummy-pii/test.xlsx' 

    data = ['Ankit Gupta', '+919140562125', 'Indian']
    results_list_data = await pii_scanner.scan(data=data, sample_size=0.005, region=Regions.IN)
    # results_file_data = await pii_scanner.scan(file_path=file_path, sample_size=0.005, region=Regions.IN)

    print("Results:", results_list_data)
    print(f"Elapsed: {time.time() - start_time:.2f} s")

# Run the asynchronous scan
asyncio.run(run_scan())

Output

[
    {
        "text": "Ankit Gupta",
        "entity_detected": [
            {"type": "PERSON", "start": 0, "end": 11, "score": 0.85}
        ]
    },
    {
        "text": "+919140562125",
        "entity_detected": [
            {"type": "PHONE_NUMBER", "start": 0, "end": 13, "score": 0.85}
        ]
    },
    {
        "text": "Indian",
        "entity_detected": [
            {"type": "NATIONALITY", "start": 0, "end": 6, "score": 0.9}
        ]
    }
]
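
Once you have results in this shape, a common next step is to group the detected values by entity type. The sketch below assumes the result is a list of dicts with the `text` and `entity_detected` keys shown above. The helper `group_by_entity_type` and its `min_score` parameter are hypothetical and not part of pii_scanner.

```python
from collections import defaultdict

# Sample data mirroring the shape of the output shown above.
results = [
    {"text": "Ankit Gupta",
     "entity_detected": [{"type": "PERSON", "start": 0, "end": 11, "score": 0.85}]},
    {"text": "+919140562125",
     "entity_detected": [{"type": "PHONE_NUMBER", "start": 0, "end": 13, "score": 0.85}]},
    {"text": "Indian",
     "entity_detected": [{"type": "NATIONALITY", "start": 0, "end": 6, "score": 0.9}]},
]


def group_by_entity_type(results, min_score=0.0):
    """Map each entity type to the matched substrings scoring at least min_score."""
    grouped = defaultdict(list)
    for item in results:
        for entity in item["entity_detected"]:
            if entity["score"] >= min_score:
                # Slice the matched span out of the original text.
                grouped[entity["type"]].append(item["text"][entity["start"]:entity["end"]])
    return dict(grouped)


print(group_by_entity_type(results, min_score=0.9))
# {'NATIONALITY': ['Indian']}
```

Filtering on `score` lets you trade recall for precision. The threshold of 0.9 here is arbitrary.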

Download files

Source Distribution

pii_scanner-0.1.16.tar.gz (160.8 kB)

Uploaded Source

Built Distribution

pii_Scanner-0.1.16-py3-none-any.whl (171.3 kB)

Uploaded Python 3

File details

Details for the file pii_scanner-0.1.16.tar.gz.

File metadata

  • Download URL: pii_scanner-0.1.16.tar.gz
  • Upload date:
  • Size: 160.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.0.1 CPython/3.10.16

File hashes

Hashes for pii_scanner-0.1.16.tar.gz
Algorithm Hash digest
SHA256 0485cbd656529319548d7bcaf5b0aa0bcd5235f00ebac8d8b9c69b1c68a501d8
MD5 d0efb29d99b9dc29af99bb4eb0862418
BLAKE2b-256 0193373148623dc17aac5258fc1f111c3eb8a6414558d58f8d64101398530158

File details

Details for the file pii_Scanner-0.1.16-py3-none-any.whl.

File metadata

  • Download URL: pii_Scanner-0.1.16-py3-none-any.whl
  • Upload date:
  • Size: 171.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.0.1 CPython/3.10.16

File hashes

Hashes for pii_Scanner-0.1.16-py3-none-any.whl
Algorithm Hash digest
SHA256 fe2bd870f025fb0518e86ebac46af0d66cb8a3aa93af931818d482da4270d4f8
MD5 41525ac9d16871a1a3a8e658e7e3de77
BLAKE2b-256 564eaef5d4c6bc87d93b6b27c3b2e909c5443ce1d17ca22b8a5485ce30570221
