Data Protecting Package

BC-EnDeCoder

BC-EnDeCoder is a Python library for encoding and decoding data that is sent to Large Language Models (LLMs). It protects sensitive information by replacing the real values with dummy values before the data is passed to the LLM, and it restores the original values once the LLM's response has been received.

Features

  • Secure Encoding and Decoding: Protect sensitive data by replacing it with dummy values before it reaches an LLM, then decode the response back to the original values.

  • Easy Integration: Simple and easy-to-use functions for encoding and decoding data, making it convenient to integrate into your projects.

  • Customizable Encoding Parameters: Fine-tune the encoding process with customizable parameters to suit your specific use case.

Installation

To install BC-EnDeCoder, you can use the following pip command:

pip install bc-en-de-coder 

How it Works

BC-EnDeCoder facilitates a secure interaction with LLMs through a three-step process:

  • Encoding with a Dummy Value: Sensitive values are replaced with fake values, so the real data is not exposed when the text is transmitted to an LLM.

  • Interaction with LLM: The encoded data is then passed to the LLM for analysis or processing.

  • Decoding the Response: Upon receiving the LLM's response, BC-EnDeCoder decodes it, revealing the original information without compromising its security.
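The three steps above can be sketched in plain Python. This is only an illustration of the general dummy-value technique, not BC-EnDeCoder's actual implementation: the names encode_numbers, decode_numbers, and fake_llm are hypothetical, and the choice of random 10-digit tokens is an assumption based on the sample output shown in this README.

```python
import re
import secrets


def _new_token(length=10):
    """Return a random string of decimal digits (hypothetical token format)."""
    return "".join(secrets.choice("0123456789") for _ in range(length))


def encode_numbers(text):
    """Replace every run of digits in text with a random token.

    Returns the encoded text and a mapping of token -> original value.
    """
    token_to_original = {}
    original_to_token = {}

    def substitute(match):
        original = match.group(0)
        if original not in original_to_token:
            token = _new_token()
            # Avoid reusing a token or colliding with digits already in the text.
            while token in token_to_original or token in text:
                token = _new_token()
            token_to_original[token] = original
            original_to_token[original] = token
        return original_to_token[original]

    return re.sub(r"\d+", substitute, text), token_to_original


def decode_numbers(text, token_to_original):
    """Swap every known token in text back to its original value."""
    return re.sub(
        r"\d+",
        lambda m: token_to_original.get(m.group(0), m.group(0)),
        text,
    )


def fake_llm(prompt):
    """Hypothetical stand-in for an LLM call; it just echoes the prompt."""
    return "Summary: " + prompt


text = "This is a dummy text with value 200,100,150,250."
encoded, mapping = encode_numbers(text)    # step 1: encode
response = fake_llm(encoded)               # step 2: interact with the LLM
restored = decode_numbers(response, mapping)  # step 3: decode the response
print(encoded)
print(restored)
```

Decoding only works if the LLM reproduces the tokens unchanged in its response, and the mapping must be kept on the caller's side because it is the only link back to the real values.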

Encoding and Decoding Values in a String

Encode and decode values in a string using the encode_str() and decode_str() methods.

from bc_endecoder.replacing import BaseCoder

bc = BaseCoder()

text = '''
        This is a dummy text with value 200,100,150,250.
        We need to protect these values.
        '''

# encode_str takes one parameter, the text, and returns the encoded text and the encodings
encoded_text, encodings = bc.encode_str(text)
print("Encoded Text : \n", encoded_text)

# encoded_text can be passed to GPT; once the response comes back, decode it with decode_str()

# decode_str takes two parameters, the encoded text and the encodings, and returns the original text
original_text = bc.decode_str(encoded_text, encodings)
print("\nOriginal Text : \n", original_text)

Output

Encoded Text : 
 
        This is a dummy text with value 4858416350,7636580946,0858875814,8301435677.
        We need to protect these values.
        

Original Text : 
 
        This is a dummy text with value 200,100,150,250.
        We need to protect these values.

Encoding and Decoding Values in a DataFrame

Encode and decode values in a DataFrame using the encode_df() and decode_df() methods.

from bc_endecoder.replacing import BaseCoder
import pandas as pd
import numpy as np

bc = BaseCoder()

data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eva'],
    'Age': [25, 30, 22, 35, 28],
    'City': ['New York', 'San Francisco', 'Los Angeles', 'Chicago', 'Miami'],
    'Salary': [60000, 80000, 55000, 90000, 70000]
}
df = pd.DataFrame(data)


# encode_df takes one parameter, the DataFrame, and returns the encoded DataFrame and the encodings
encoded_df, encoding = bc.encode_df(df)
print("Encoded_df : \n", encoded_df)

# encoded_df can be passed to GPT; once the response comes back, decode it with decode_df()

# decode_df takes two parameters, the encoded DataFrame and the encodings, and returns the original DataFrame
original_df = bc.decode_df(encoded_df, encoding)
print("\nOriginal_df : \n", original_df)

Output

Encoded_df : 
       Name         Age           City      Salary
0    Alice  8623770624       New York  0197705789
1      Bob  5223314994  San Francisco  9743912420
2  Charlie  1795473060    Los Angeles  8982145407
3    David  6439787181        Chicago  6618233087
4      Eva  4699492207          Miami  6680877680

Original_df : 
       Name  Age           City  Salary
0    Alice   25       New York   60000
1      Bob   30  San Francisco   80000
2  Charlie   22    Los Angeles   55000
3    David   35        Chicago   90000
4      Eva   28          Miami   70000

Encoding and Decoding Values with a Ratio in a DataFrame, JSON, or String

Encode and decode values with a ratio in a DataFrame, JSON, or string using the encode_in_ratio() and decode_in_ratio() methods.

from bc_endecoder.replacing import BaseCoder
import pandas as pd
import numpy as np

bc = BaseCoder()

json_data = {
  "key1": 10,
  "key2": 20,
  "key3": "Hello",
  "key4": 3.14,
  "key5": [1, 2, 3],
  "key6": {"nested_key": "nested_value"},
  "key8": "2022-01-01",
  "key9": None,
  "key10": {"sub_key1": 5, "sub_key2": "world"},
  "key11": [4.5, 6.7, 8.9],
  "key12": False,
  "key13": "42",
  "key14": ["apple", "banana", "cherry"],
  "key15": {"nested_key2": [1, 2, 3]},
  "key16": 7.77,
  "key17": "test",
  "key18": {"sub_key3": "value3", "sub_key4": 10},
  "key19": [True, False],
  "key20": 12345
}

ratio = 56  # ratio used to encode the numeric values; it can be any number except 0 and 1

# encode_in_ratio takes two parameters, the data and the ratio, and returns the encoded data
encoded_data = bc.encode_in_ratio(json_data, ratio)
print("Encoded data : \n", encoded_data)

# encoded_data can be passed to GPT; once the response comes back, decode it with decode_in_ratio()

# decode_in_ratio takes two parameters, the encoded data and the same ratio, and returns the original data
original_data = bc.decode_in_ratio(encoded_data, ratio)
print("Original data : \n", original_data)

Output

Encoded data : 
 {'key1': 560, 'key2': 1120, 'key3': 'Hello', 'key4': 175.84, 'key5': [56, 112, 168], 'key6': {'nested_key': 'nested_value'}, 'key8': '2022-01-01', 'key9': None, 'key10': {'sub_key1': 280, 'sub_key2': 'world'}, 'key11': [252.0, 375.2, 498.40000000000003], 'key12': 0, 'key13': '42', 'key14': ['apple', 'banana', 'cherry'], 'key15': {'nested_key2': [56, 112, 168]}, 'key16': 435.12, 'key17': 'test', 'key18': {'sub_key3': 'value3', 'sub_key4': 560}, 'key19': [56, 0], 'key20': 691320}

Original data : 
 {'key1': 10.0, 'key2': 20.0, 'key3': 'Hello', 'key4': 3.14, 'key5': [1.0, 2.0, 3.0], 'key6': {'nested_key': 'nested_value'}, 'key8': '2022-01-01', 'key9': None, 'key10': {'sub_key1': 5.0, 'sub_key2': 'world'}, 'key11': [4.5, 6.7, 8.9], 'key12': 0.0, 'key13': '42', 'key14': ['apple', 'banana', 'cherry'], 'key15': {'nested_key2': [1.0, 2.0, 3.0]}, 'key16': 7.7700000000000005, 'key17': 'test', 'key18': {'sub_key3': 'value3', 'sub_key4': 10.0}, 'key19': [1.0, 0.0], 'key20': 12345.0}
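The output above is consistent with every numeric value being multiplied by the ratio on encoding and divided by it on decoding, while strings and None are left alone. The sketch below reproduces that behaviour for nested dicts and lists. It is an assumption inferred from the sample output, not the library's source, and the names map_numbers, encode_in_ratio_sketch, and decode_in_ratio_sketch are hypothetical.

```python
def map_numbers(data, fn):
    """Apply fn to every numeric leaf of a nested dict/list structure."""
    if isinstance(data, dict):
        return {key: map_numbers(value, fn) for key, value in data.items()}
    if isinstance(data, list):
        return [map_numbers(value, fn) for value in data]
    # bool is a subclass of int in Python, so True/False are scaled as 1/0.
    if isinstance(data, (int, float)):
        return fn(data)
    return data  # strings, None, and other types pass through unchanged


def encode_in_ratio_sketch(data, ratio):
    """Multiply every numeric value by ratio (assumed encoding rule)."""
    if ratio in (0, 1):
        # 0 cannot be divided back out, and 1 would leave the data unchanged.
        raise ValueError("ratio must not be 0 or 1")
    return map_numbers(data, lambda x: x * ratio)


def decode_in_ratio_sketch(data, ratio):
    """Divide every numeric value by ratio to undo the encoding."""
    return map_numbers(data, lambda x: x / ratio)


sample = {"key1": 10, "key5": [1, 2, 3], "key13": "42", "key9": None}
encoded = encode_in_ratio_sketch(sample, 56)
decoded = decode_in_ratio_sketch(encoded, 56)
print(encoded)
print(decoded)
```

Under this reading, two details of the output follow directly: decoded integers come back as floats because Python's / operator always returns a float, and a value such as 7.77 can return as 7.7700000000000005 because of floating-point rounding. Booleans also lose their type, since they are treated as the numbers 1 and 0.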

