Shields your confidential data from third-party LLM providers

llmshield

Overview

llmshield is a lightweight, dependency-free Python library for high-performance cloaking and uncloaking of sensitive information in prompts sent to, and responses received from, Large Language Models (LLMs). It provides entity detection and protection for applications where data privacy and security are paramount.

The aim is to be extremely accurate, using a combination of list-based, rule-based, pattern-based, and probabilistic approaches.
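To make the pattern-based part of this concrete, here is a minimal sketch of cloaking and uncloaking for email addresses only. It is an illustration, not llmshield's actual implementation: the names `EMAIL_PATTERN`, `cloak_emails` and `uncloak` are hypothetical, and the regular expression is deliberately simplified.

```python
import re

# Simplified email pattern, for illustration only.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def cloak_emails(prompt: str) -> tuple[str, dict[str, str]]:
    """Replace each distinct email address with a numbered placeholder."""
    entity_map: dict[str, str] = {}  # placeholder -> original value
    seen: dict[str, str] = {}        # original value -> placeholder

    def substitute(match: re.Match) -> str:
        value = match.group(0)
        if value not in seen:
            placeholder = f"<EMAIL_{len(seen)}>"
            seen[value] = placeholder
            entity_map[placeholder] = value
        # Repeated addresses reuse the same placeholder.
        return seen[value]

    return EMAIL_PATTERN.sub(substitute, prompt), entity_map


def uncloak(text: str, entity_map: dict[str, str]) -> str:
    """Restore the original values behind each placeholder."""
    for placeholder, value in entity_map.items():
        text = text.replace(placeholder, value)
    return text
```

A real detector would combine several such patterns with lists, rules and probabilistic checks, as described above.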

Key Features

  • 🔒 Secure Entity Detection: Identifies and protects sensitive information including:
    • Proper nouns (Persons, Places, Organisations, Concepts)
    • Locators (Email addresses, URLs)
    • Numbers (Phone numbers, Credit card numbers)

Additional PII types are in development.

  • 🚀 High Performance: Optimised for minimal latency in LLM interactions
  • 🔌 Zero Dependencies: Pure Python implementation with no external requirements
  • 🛡️ End-to-End Protection: Cloaks and uncloaks both prompts and responses
  • 🎯 Flexible Integration: Works directly with your existing LLM function.

Installation

pip install llmshield

Quick Start

from llmshield import LLMShield

# Basic usage - Manual LLM integration
shield = LLMShield()

# Cloak sensitive information
cloaked_prompt, entity_map = shield.cloak("Hi, I'm John Doe (john.doe@example.com)")
print(cloaked_prompt)  # "Hi, I'm <PERSON_0> (<EMAIL_0>)"

# Send to your LLM (your_llm_function is a placeholder for your own API call)
llm_response = your_llm_function(cloaked_prompt)

# Uncloak the response
original_response = shield.uncloak(llm_response, entity_map)

# Direct LLM integration
def my_llm_function(prompt: str) -> str:
    # Placeholder: replace the next line with your LLM API call
    response = f"(LLM reply to: {prompt})"
    return response

shield = LLMShield(llm_func=my_llm_function)
response = shield.ask(prompt="Hi, I'm John Doe (john.doe@example.com)")

Configuration

Delimiters

You can customise the delimiters used to wrap protected entities:

shield = LLMShield(
    start_delimiter='[[',  # Default: '<'
    end_delimiter=']]'     # Default: '>'
)

Choose delimiters that your LLM provider's models handle well. Different providers may perform better with different delimiter styles.
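As a rough illustration of what the delimiters change, the sketch below builds a placeholder from an entity type, an index and a pair of delimiters. The helper `make_placeholder` is hypothetical and not part of llmshield's API; the `<TYPE_N>` shape is taken from the Quick Start output.

```python
def make_placeholder(
    entity_type: str,
    index: int,
    start_delimiter: str = "<",
    end_delimiter: str = ">",
) -> str:
    """Wrap an entity label such as EMAIL_0 in the chosen delimiters."""
    return f"{start_delimiter}{entity_type}_{index}{end_delimiter}"


# Default delimiters, as in the Quick Start output.
default_style = make_placeholder("EMAIL", 0)

# Custom delimiters, matching the configuration shown above.
custom_style = make_placeholder("EMAIL", 0, start_delimiter="[[", end_delimiter="]]")
```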

LLM Function Integration

Provide your LLM function during initialisation for streamlined usage:

shield = LLMShield(llm_func=your_llm_function)
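Conceptually, the integrated flow is cloak, call the LLM, then uncloak. The sketch below shows that sequence with plain callables so it runs without llmshield installed. `ask_with_cloaking`, `toy_cloak`, `toy_uncloak` and `fake_llm` are hypothetical stand-ins, and the library's own `ask` method may work differently internally.

```python
from typing import Callable


def ask_with_cloaking(
    prompt: str,
    cloak: Callable[[str], tuple[str, dict[str, str]]],
    uncloak: Callable[[str, dict[str, str]], str],
    llm_func: Callable[[str], str],
) -> str:
    """Cloak the prompt, call the LLM, then restore entities in the reply."""
    cloaked_prompt, entity_map = cloak(prompt)
    llm_response = llm_func(cloaked_prompt)  # The LLM only sees placeholders.
    return uncloak(llm_response, entity_map)


def toy_cloak(prompt: str) -> tuple[str, dict[str, str]]:
    # Stand-in for real entity detection: hides one hard-coded name.
    return prompt.replace("John Doe", "<PERSON_0>"), {"<PERSON_0>": "John Doe"}


def toy_uncloak(text: str, entity_map: dict[str, str]) -> str:
    for placeholder, value in entity_map.items():
        text = text.replace(placeholder, value)
    return text


def fake_llm(prompt: str) -> str:
    # Stand-in for a provider call: echoes the prompt it received.
    return f"Nice to meet you! You wrote: {prompt}"


reply = ask_with_cloaking("Hi, I'm John Doe", toy_cloak, toy_uncloak, fake_llm)
```

Passing the function once at construction time, as in the snippet above, saves repeating this sequence at every call site.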

Best Practices

  1. Consistent Delimiters: Use the same delimiters across your entire application
  2. Error Handling: Always handle potential ValueError exceptions
  3. Entity Mapping: Store entity maps securely if needed for later uncloaking
  4. Input Validation: Ensure prompts are well-formed and grammatically correct
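
For the error-handling point, one possible pattern is to wrap the cloaking call and refuse to continue when it fails, so that an uncloaked prompt is never sent to the provider. This is a sketch only: `cloak_or_none` is a hypothetical helper, and the exact conditions under which llmshield raises `ValueError` are not described here.

```python
import logging
from typing import Callable, Optional

logger = logging.getLogger(__name__)

CloakResult = tuple[str, dict[str, str]]


def cloak_or_none(
    cloak: Callable[[str], CloakResult], prompt: str
) -> Optional[CloakResult]:
    """Return the cloaked prompt and entity map, or None if cloaking fails."""
    try:
        return cloak(prompt)
    except ValueError as exc:
        # Do not fall back to the raw prompt: that would leak the data.
        logger.warning("Could not cloak prompt: %s", exc)
        return None
```

With the library, `cloak` would be the `shield.cloak` method from the Quick Start. The caller should skip the LLM request whenever the result is `None`.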

Requirements

  • Python 3.10+
  • No additional dependencies
  • Officially supports English and Spanish texts only.
  • May work with other languages with lower accuracy and potential PII leakage.

Support

Contributing

Contributions are welcome! Please follow these guidelines:

  1. Recommended IDE Development Packages:

    • Black
    • Isort
    • Markdownlint
  2. Getting Started:

    a. Ensure you have Python 3.10+ installed on your system

    b. Create a virtual environment with Python 3.10+

    python -m venv venv
    source venv/bin/activate
    

    c. Install the package in development mode with all development dependencies:

    make dev-dependencies
    
  3. Code Quality and Formatting Guidelines:

    • Follow black and isort rules
    • Add tests for new features
    • Do not break existing tests unless the change is justified
    • Maintain zero (non-development) dependencies (non-negotiable)
    • Use British English in all naming and documentation
  4. Testing:

    make tests
    
    • Run coverage:
    make coverage
    
  5. Documentation:

    • Update docstrings
    • Keep README.md current
    • Add examples for new features
  6. Build and Publish:

    make build  # Build the package
    python -m twine upload dist/*
    

Note: You will need to have a PyPI account and be authenticated.

License

GNU AGPLv3 License - See LICENSE.txt file for details

Notable Uses

llmshield is currently used by:
