A Python package to watermark text.

CanaryTrap

CanaryTrap is a Python package that applies the "canary trap" technique to text by watermarking it with Unicode spaces. By replacing regular spaces with Unicode spaces according to a binary pattern, you can embed a hidden signature in your text.
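To make the idea concrete, here is a minimal sketch of space-based encoding. It is not CanaryTrap's implementation: the helper name `encode_spaces` and the marker character (THIN SPACE, U+2009) are assumptions made for illustration, since this README does not say which Unicode character the package uses.

```python
# Hypothetical marker: THIN SPACE (U+2009). The character CanaryTrap
# actually uses is not stated in this README.
MARKER = "\u2009"


def encode_spaces(text: str, bits: str, marker: str = MARKER) -> str:
    """Replace the i-th regular space with `marker` when bits[i] == '1'."""
    out = []
    space_index = 0  # position of the current space within the bit pattern
    for ch in text:
        if ch == " ":
            if space_index < len(bits) and bits[space_index] == "1":
                ch = marker
            space_index += 1
        out.append(ch)
    return "".join(out)


text = "Hello world this is a test"  # contains 5 spaces
watermarked = encode_spaces(text, "10110")
print(repr(watermarked))  # repr() makes the substituted spaces visible
```

Each space carries one bit, so a text with n spaces can hold an n-bit signature in this scheme.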

Installation

You can install CanaryTrap using pip:

pip install canarytrap

How to use

First, import the package:

import canarytrap as canary

To watermark your text, use the unicode_space_encode function. You can either provide a binary string or let the function generate a random one:

text = "Hello world this is a test"
watermarked_text, binary_str = canary.unicode_space_encode(text)
print(watermarked_text)  # Watermarked text
print(binary_str)  # Binary string used for watermarking

You can convert the watermarked text back to binary using the unicode_space_to_binary_str function:

binary_str_back = canary.unicode_space_to_binary_str(watermarked_text)
print(binary_str_back)  # Should be the same as binary_str

To check if a text matches a given binary string, use the unicode_space_match function:

match = canary.unicode_space_match(watermarked_text, binary_str)
print(match)  # Should be 1.0 if text is not modified

Example Scenario

Suppose someone is leaking private documents and you want to find out who it is. You are about to send an email to several employees. Let's use CanaryTrap to identify the leaker.

First, let's encode our email:

import canarytrap as canary

staff_email = "Hello staff, this is a test email. Please do not share this email with anyone else."

watermarked_email, binary_str = canary.unicode_space_encode(staff_email)

with open("email.txt", "w", encoding="utf-8") as f:
    f.write(watermarked_email)

with open("binary.txt", "w", encoding="utf-8") as f:
    f.write(binary_str)

The Unicode characters are invisible when the text is displayed. Copy the text below into a code editor to see them:

Hello‎ staff,‎ this is a test email.‎ Please do‎ not share‎ this‎ email‎ with anyone‎ else.

You can then check some text against the watermark:

import canarytrap as canary

watermarked_text = "Hello‎ staff,‎ this is a test email.‎ Please do‎ not share‎ this‎ email‎ with anyone‎ else."

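# Load the binary string that was saved when the email was encoded
with open("binary.txt", encoding="utf-8") as f:
    binary_str = f.read()

match = canary.unicode_space_match(watermarked_text, binary_str)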
print(match)  # Should be 1.0 if text is not modified

Note

If the watermarked text is edited (spaces are added, removed, or replaced), unicode_space_match may return a value below 1.0. A value below 1.0 means the text was likely edited after being watermarked. The value estimates how much of the original watermark remains in the text.
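The following sketch shows one plausible way such a score could be computed and how a single edit lowers it. It is an illustration under assumptions, not the package's scoring: `decode_spaces`, `match_fraction`, and the THIN SPACE marker (U+2009) are hypothetical.

```python
MARKER = "\u2009"  # hypothetical marker character (THIN SPACE)


def decode_spaces(text: str, marker: str = MARKER) -> str:
    """Read one bit per space: '1' for the marker, '0' for a regular space."""
    return "".join("1" if ch == marker else "0" for ch in text if ch in (" ", marker))


def match_fraction(text: str, bits: str, marker: str = MARKER) -> float:
    """Fraction of bit positions where the recovered pattern equals `bits`."""
    found = decode_spaces(text, marker)
    total = max(len(found), len(bits))
    if total == 0:
        return 1.0  # nothing to compare, treat as a trivial match
    same = sum(a == b for a, b in zip(found, bits))
    return same / total


original = "Hello\u2009world this\u2009is\u2009a test"  # encodes 10110
edited = original.replace(MARKER, " ", 1)  # first marker normalized to a space

print(match_fraction(original, "10110"))  # 1.0
print(match_fraction(edited, "10110"))    # 0.8
```

Dividing by the longer of the two patterns means that added or removed spaces also reduce the score, not only replaced ones.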

Download files

Source Distribution

canarytrap-1.0.2.tar.gz (4.4 kB)

Built Distribution

canarytrap-1.0.2-py3-none-any.whl (5.2 kB, Python 3)

File details

Details for the file canarytrap-1.0.2.tar.gz.

File metadata

  • Download URL: canarytrap-1.0.2.tar.gz
  • Upload date:
  • Size: 4.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.10.9

File hashes

Hashes for canarytrap-1.0.2.tar.gz
Algorithm Hash digest
SHA256 f57d4dc54a48469fd730573daaa00b1cee65e67ea866eb0e568ef629209a857b
MD5 e174d8c2a7733c6aeefc95cb16dcba2c
BLAKE2b-256 9c5465115452aad26c68efe9160607fe72819bc84ab09e60b2dadc55c31b17b4

File details

Details for the file canarytrap-1.0.2-py3-none-any.whl.

File metadata

  • Download URL: canarytrap-1.0.2-py3-none-any.whl
  • Upload date:
  • Size: 5.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.10.9

File hashes

Hashes for canarytrap-1.0.2-py3-none-any.whl
Algorithm Hash digest
SHA256 8f226aee3c2eb5962d47ac96a66e30ff57e7d9dd1f297dc3f11ead18ebac1761
MD5 9e65dfaf0c0baaf601df8a10b6a511fb
BLAKE2b-256 740bdc02f18499a34b952e945e20218ecc76d84641699e9ab46f014453c640bf
