A tool to anonymize data before sending to LLM providers, then re-identify the data after receiving the response.

Project description

PHI Anonymizer

PHI (Protected Health Information) Anonymizer helps you anonymize sensitive information in text data. It uses natural language processing to identify sensitive information such as names, addresses, and medical terms, and redacts it so the text can be sent to an LLM provider and the response re-identified afterward.
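The anonymize-then-restore idea can be sketched in a few lines. This is a minimal illustration, not the package's implementation: the regular expressions, the bracketed placeholder format, and the names `pseudonymize` and `restore` are all assumptions made for this example. The package itself appears to substitute realistic-looking values rather than bracketed tags, judging by the sample output on this page.

```python
import re

# Hypothetical patterns for illustration only; the package's own detection
# logic is not described on this page and may work differently.
PATTERNS = {
    "PHONE": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def pseudonymize(text):
    """Replace each match with a numbered placeholder and record the mapping."""
    mapping = {}
    for label, pattern in PATTERNS.items():
        def substitute(match, label=label):
            placeholder = f"[{label}_{len(mapping) + 1}]"
            mapping[placeholder] = match.group(0)
            return placeholder
        text = pattern.sub(substitute, text)
    return text, mapping


def restore(text, mapping):
    """Put the original values back wherever a placeholder appears."""
    for placeholder, original in mapping.items():
        text = text.replace(placeholder, original)
    return text


safe, mapping = pseudonymize("Call 626-433-7890, SSN 213-45-6919.")
print(safe)
print(restore(safe, mapping))
```

The mapping is what makes the process reversible: as long as the LLM repeats a placeholder verbatim in its reply, the original value can be put back.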

Usage

Install dependencies

pip install -r requirements.txt

Example usage

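# As written, this import only resolves when the script sits in the same directory as the package's __init__.py.
from __init__ import anonymize_text, deanonymize_text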

original_text = "Hi! My name is Bobby Smith, I was born on 01/04/1970 in New York City. My phone number is 626-433-7890 and my email is bobby.smith@gmail.com. You probably need my social security number too, it's 213-45-6919. So what have you learned about me?"
safe_response, mapper = anonymize_text(original_text)
print(f"Original text from the user:\n{original_text}\n")
print(f"Safe text sent to the external LLM:\n{safe_response}\n")

# Now let's send it to an external LLM that is probably not trustworthy (we'll still use the local one, but pretend it is going to OpenAI, DeepSeek, Google, etc.)
response = mapper.llm.create_chat_completion(
    messages=[
        {
            "role": "system",
            "content": "The assistant is an expert at assisting the user and is getting to know them better. The assistant is very friendly and helpful.",
        },
        {
            "role": "user",
            "content": safe_response,
        },
    ],
    temperature=0.4,
    max_tokens=2048,
)

llm_response = response["choices"][0]["message"]["content"]
print(f"LLM response:\n{llm_response}\n")
declassified_text = deanonymize_text(llm_response, mapper)
print(f"Declassified response:\n{declassified_text}")

To use an external provider, replace the mapper.llm.create_chat_completion call with a call to the OpenAI API or another LLM provider. Keep the anonymize_text call before the request and the deanonymize_text call after the response.
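One way to keep the provider swappable is to pass the LLM call in as a function. The sketch below is an assumption about how you might structure this, not part of the package: `ask_safely` and `send_to_llm` are hypothetical names, and the three `fake_*` functions are stand-ins so the example runs without the package or a network connection.

```python
def ask_safely(user_text, anonymize, deanonymize, send_to_llm):
    """Anonymize, call the provider, then restore the original values.

    anonymize must return (safe_text, mapper), matching the shape of
    anonymize_text in the example above; send_to_llm is any callable that
    takes the safe prompt and returns the provider's reply as a string.
    """
    safe_text, mapper = anonymize(user_text)
    llm_reply = send_to_llm(safe_text)
    return deanonymize(llm_reply, mapper)


# Stand-ins for illustration only.
def fake_anonymize(text):
    return text.replace("Bobby Smith", "John"), {"John": "Bobby Smith"}


def fake_deanonymize(text, mapper):
    for substitute, original in mapper.items():
        text = text.replace(substitute, original)
    return text


def fake_llm(prompt):
    return "Hello John!" if "John" in prompt else "Hello!"


reply = ask_safely("My name is Bobby Smith.", fake_anonymize, fake_deanonymize, fake_llm)
print(reply)
```

In real use you would pass anonymize_text and deanonymize_text in place of the stand-ins, and wrap your provider's client call in a small function for send_to_llm.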

Example output

Original text from the user:
Hi! My name is Bobby Smith, I was born on 01/04/1970 in New York City. My phone number is 626-433-7890 and my email is bobby.smith@gmail.com. You probably need my social security number too, it's 213-45-6919.

Safe text sent to the external LLM:
Hi! My name is John, I was born on 01/04/1970 in New York City. My phone number is 378-602-8003 and my email is bobby.smith@gmail.com. You probably need my social security number too, it's 123456789.

LLM response:
Hello John! Thank you for sharing your information. It's great that you've provided your contact details. If you have any questions or need assistance with something related to your information, feel free to ask. I'm here to help.

Declassified response:
Hello Bobby Smith! Thank you for sharing your information. It's great that you've provided your contact details. If you have any questions or need assistance with something related to your information, feel free to ask. I'm here to help.
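In the sample output above, the birth date and the email address appear unchanged in the text sent to the LLM. A simple check run before sending can catch values that slipped through. The helper below is a hypothetical sketch, not something the package provides; `find_leaks` and the list of known sensitive values are assumptions made for this example.

```python
def find_leaks(safe_text, sensitive_values):
    """Return the sensitive values that still appear verbatim in safe_text."""
    return [value for value in sensitive_values if value in safe_text]


# Values taken from the original text in the example above.
sensitive = [
    "Bobby Smith",
    "01/04/1970",
    "626-433-7890",
    "bobby.smith@gmail.com",
    "213-45-6919",
]

# The "safe text" from the example output above.
safe_text = (
    "Hi! My name is John, I was born on 01/04/1970 in New York City. "
    "My phone number is 378-602-8003 and my email is bobby.smith@gmail.com. "
    "You probably need my social security number too, it's 123456789."
)

leaks = find_leaks(safe_text, sensitive)
print(leaks)
```

This only detects exact matches for values you already know about, so it is a sanity check rather than a guarantee.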

Download files

Source Distribution

phianonymizer-0.0.2.tar.gz (8.1 kB)

Uploaded Source

Built Distribution

phianonymizer-0.0.2-py3-none-any.whl (7.0 kB)

Uploaded Python 3

File details

Details for the file phianonymizer-0.0.2.tar.gz.

File metadata

  • Download URL: phianonymizer-0.0.2.tar.gz
  • Upload date:
  • Size: 8.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.0.0 CPython/3.12.9

File hashes

Hashes for phianonymizer-0.0.2.tar.gz
Algorithm Hash digest
SHA256 a099156c818e6542290ec7942e8ac44ce79d12995dcb5467fa6ae089f4c1f39a
MD5 b828afd6b69d2674d93c8c2b8bbe20bd
BLAKE2b-256 781def0f43be0e26be8ab7b4d5756f2c2256c8701c25874c233fe387c89aecab
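A downloaded file can be checked against the published SHA256 digest with Python's standard hashlib module. This is a minimal sketch; the function names `sha256_of_file` and `matches_published_hash` are made up for this example.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare a file's digest with a published one, ignoring case and whitespace."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, assuming the source distribution was saved to the current directory, call matches_published_hash("phianonymizer-0.0.2.tar.gz", ...) with the SHA256 value listed above as the second argument.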

File details

Details for the file phianonymizer-0.0.2-py3-none-any.whl.

File metadata

  • Download URL: phianonymizer-0.0.2-py3-none-any.whl
  • Upload date:
  • Size: 7.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.0.0 CPython/3.12.9

File hashes

Hashes for phianonymizer-0.0.2-py3-none-any.whl
Algorithm Hash digest
SHA256 d5fce0163504ba66ea1167d58ff8894d1b88c5c85bde488ff484a61eed354669
MD5 b3ddfb67e9d2598f9a64cd6659647a9c
BLAKE2b-256 241994c1239eb955bbb4f4d7b37b1d6bc94983ec042bd34a3c36d31b148328eb
