Modify large text file line by line and encrypt with GnuPG.

Project description

gpgodzilla - Large File Encryption with Line Modification

gpgodzilla lets developers and data scientists encrypt and decrypt large, structured files while modifying each line in memory with a custom function. It is designed for de-tokenizing and encrypting sensitive data such as account numbers and social insurance numbers, where raw data is not allowed to reside on the system's local storage.

Use Case

Input files must be structured line by line: each line contains portions that need to be modified in the same way before encryption or after decryption.
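As an illustration of what such a per-line function might look like, the sketch below rewrites one field of a comma-separated line and leaves the rest untouched. The field layout, the `modify_field` helper, and the `tok_` prefix are all hypothetical and not part of gpgodzilla; only the shape (a function that takes a line and returns a line) follows the description above.

```python
def modify_field(line, index, fn, sep=","):
    """Apply fn to one field of a delimited line, preserving the line ending."""
    body = line.rstrip("\r\n")
    ending = line[len(body):]
    fields = body.split(sep)
    fields[index] = fn(fields[index])
    return sep.join(fields) + ending


def strip_token_prefix(value):
    # Hypothetical "de-tokenization": drop a made-up "tok_" prefix if present.
    return value[len("tok_"):] if value.startswith("tok_") else value


def detokenize_line(line):
    # Takes a line and returns the modified line, like the callbacks in this README.
    return modify_field(line, 1, strip_token_prefix)
```

A real de-tokenization function would call whatever token vault or lookup the system uses; the point here is only that every line is changed in the same position and in the same way.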

A primary example is transferring and processing PANs (primary account numbers). Given a file of tokenized customer PANs, the requirement is to de-tokenize the PANs and send them, encrypted, to the receiver.

However, because PANs are highly sensitive, the de-tokenized/raw PANs cannot touch the local storage of the system. Hence, de-tokenization and encryption need to happen in system memory.
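The idea of keeping the raw data in memory can be sketched as follows: read the tokenized file one line at a time, transform each line, and write it straight into the standard input of a `gpg` process. This is not gpgodzilla's actual implementation; `transform_lines` and `encrypt_stream` are hypothetical names, and the sketch assumes the recipient's public key is already in the keyring and trusted, since `--batch` will not prompt.

```python
import subprocess


def transform_lines(lines, fn):
    """Lazily apply fn to each line, so only one line is held at a time."""
    for line in lines:
        yield fn(line)


def encrypt_stream(recipient, in_path, out_path, fn, gpg="gpg"):
    """Pipe transformed lines into gpg's stdin; this code writes no plaintext file."""
    cmd = [gpg, "--batch", "--yes", "--encrypt",
           "--recipient", recipient, "--output", out_path]
    with open(in_path, "r", encoding="utf-8") as src:
        proc = subprocess.Popen(cmd, stdin=subprocess.PIPE)
        try:
            for line in transform_lines(src, fn):
                proc.stdin.write(line.encode("utf-8"))
        finally:
            proc.stdin.close()
        if proc.wait() != 0:
            raise RuntimeError("gpg exited with status %d" % proc.returncode)
```

Because the transformed lines only ever exist in the Python process and the pipe, the only file written by this sketch is the encrypted output.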

Requirements

GnuPG 2 must be installed on the system.
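One way to check this requirement before running anything is to look for the GnuPG executable on the `PATH` and read its version banner. The helper names below are hypothetical, and the executable may be called `gpg2` or `gpg` depending on the platform, so both are tried.

```python
import shutil
import subprocess


def find_gpg(candidates=("gpg2", "gpg")):
    """Return the path of the first GnuPG executable found on PATH, or None."""
    for name in candidates:
        path = shutil.which(name)
        if path:
            return path
    return None


def gpg_version_line(path):
    """Return the first line printed by `gpg --version`."""
    out = subprocess.run([path, "--version"], capture_output=True,
                         text=True, check=True).stdout
    return out.splitlines()[0] if out else ""
```

If `find_gpg()` returns `None`, GnuPG is not installed or not on the `PATH`; otherwise `gpg_version_line(find_gpg())` shows which major version is in use.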

Quick Start

Install via pip:

pip install gpgodzilla

Basic Example

With the file to manipulate on local storage, define the path to that input file (which must already exist) and the path for the processed output file.

Define the recipient of the encrypted file and the function applied to each line (e.g. a de-tokenization function that returns the de-tokenized line).

from getpass import getpass

from gpgodzilla import encrypt_large_file, decrypt_large_file

def tokenize_foo(line):
    # Example tokenization:
    # replace each "foo" with "bar" before encryption
    return line.replace('foo', 'bar')

def detokenize_bar(line):
    # Example de-tokenization:
    # replace each "bar" with "foo" after decryption
    return line.replace('bar', 'foo')

# The following code demonstrates a simple use case

file_to_encrypt = 'test.txt'  # File to manipulate and encrypt
recipient = 'john.doe@test.com'  # Email of the recipient; their GnuPG key must exist on the system running the code
output_file_encrypt = 'test.pgp'   # Path of the encrypted & manipulated file
output_file_decrypt = 'original_test.txt' # Path of the decrypted file
encrypt_large_file(recipient, file_to_encrypt, output_file_encrypt, tokenize_foo)

# PASSPHRASE was not defined in the original example. It is assumed here to be
# the passphrase of the recipient's private key, prompted for rather than hard-coded.
PASSPHRASE = getpass('GnuPG passphrase: ')

# Then, to decrypt the file and de-tokenize each 'bar' back to 'foo':
decrypt_large_file(output_file_encrypt, output_file_decrypt, detokenize_bar, PASSPHRASE)
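Note that the two example functions are only exact inverses when the original text contains no "bar" of its own. The sketch below, which needs neither GnuPG nor gpgodzilla, repeats the two functions and adds a hypothetical `round_trips` helper to check a line before relying on the pair.

```python
def tokenize_foo(line):
    # Same example tokenization as above: "foo" becomes "bar".
    return line.replace("foo", "bar")


def detokenize_bar(line):
    # Same example de-tokenization as above: "bar" becomes "foo".
    return line.replace("bar", "foo")


def round_trips(line):
    """True if de-tokenizing the tokenized line gives back the original."""
    return detokenize_bar(tokenize_foo(line)) == line


print(round_trips("foo and baz\n"))    # True
print(round_trips("a bar of foo\n"))   # False: the original "bar" becomes "foo"
```

A real tokenization scheme would use tokens that cannot collide with the raw data, which is why this simple replace is only suitable as a demonstration.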

Download files

Source Distribution

gpgodzilla-0.0.1.tar.gz (4.2 kB)

Uploaded Source

Built Distribution

gpgodzilla-0.0.1-py3-none-any.whl (4.3 kB)

Uploaded Python 3

File details

Details for the file gpgodzilla-0.0.1.tar.gz.

File metadata

  • Download URL: gpgodzilla-0.0.1.tar.gz
  • Upload date:
  • Size: 4.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.23.0 setuptools/46.1.3 requests-toolbelt/0.9.1 tqdm/4.46.0 CPython/3.8.2

File hashes

Hashes for gpgodzilla-0.0.1.tar.gz
Algorithm Hash digest
SHA256 0e747cbdf02a99aba0ad6c6f644543bc4c6a170e2f71556b6e15b27b259660df
MD5 d29481b2cbd2803c8e815c1470d1c422
BLAKE2b-256 23664c1eedda9036c440bb58109a68e343bd8baac53a51e6d3c6a4ea2842cc28

File details

Details for the file gpgodzilla-0.0.1-py3-none-any.whl.

File metadata

  • Download URL: gpgodzilla-0.0.1-py3-none-any.whl
  • Upload date:
  • Size: 4.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.23.0 setuptools/46.1.3 requests-toolbelt/0.9.1 tqdm/4.46.0 CPython/3.8.2

File hashes

Hashes for gpgodzilla-0.0.1-py3-none-any.whl
Algorithm Hash digest
SHA256 77fc0dbf8804c515a6b78ec16b44a22c568ed4b59aa026b4616ec809fb440f10
MD5 3868677799aabfaa95364e4daec184af
BLAKE2b-256 ce78f41d0a33c45735aac7f8bbae06bc7e63ffeff5695a81d072ed4b101a62b9
