AI tools for ALL

🦾 Sujet AI / E2D

⚡ Entity Encoder Decoder (E2D): A Privacy-Enhanced Framework for LLMs and RAG Agents ⚡

Quick Install

With pip:

pip install sujet-ai
python -m spacy download en_core_web_lg # For French, replace "en_core_web_lg" with "fr_core_news_lg"

What is E2D?

With the implementation of the EU AI Act, privacy has become a central concern in the development and deployment of AI systems. E2D (Entity Encoder Decoder) is a privacy-enhanced framework for Large Language Models (LLMs) and Retrieval-Augmented Generation (RAG) agents that addresses this concern. It uses indexed named-entity recognition (indexed-NER) to replace sensitive entities with indexed placeholders such as PERSON_0 or ORG_0 before the text reaches the model, and restores the original values afterwards. This helps protect sensitive information during both the training and inference phases of AI model operations. The approach is especially useful with closed-source LLMs, where it adds a layer of protection against unauthorized access to, or misuse of, sensitive data.

Key Features

  • Privacy Preservation: E2D ensures that sensitive entities are encoded securely, reducing the risk of data breaches.
  • Compatibility: The framework can be integrated with various LLMs and RAG agents, making it versatile and adaptable.
  • Ease of Use: Designed with usability in mind, E2D can be easily incorporated into existing AI workflows.
  • Regulatory Compliance: By adopting E2D, organizations can better align with the privacy requirements of the EU AI Act and other regulations.

Components

  1. Entity Encoder: This component encodes sensitive entities using indexed-NER, ensuring that the original data cannot be easily reconstructed without authorization.
  2. Entity Decoder: The decoder component allows for the secure retrieval of the original entities when necessary, ensuring that only authorized users can access the sensitive information.
  3. Integration Modules: E2D provides modules for seamless integration with popular LLM frameworks and RAG agents.
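
The encoder/decoder pairing described above can be illustrated with a minimal sketch. It is not the sujet_ai implementation: the class name ToyEntityCodec and its methods are hypothetical, and the entity list is hard-coded where a real system would obtain it from an NER model. It only shows the general idea of indexed placeholders backed by a lookup table.

```python
import re


class ToyEntityCodec:
    """Illustrative stand-in for an indexed-NER encoder/decoder (hypothetical, not sujet_ai)."""

    # Placeholders look like LABEL_index, e.g. PERSON_0 or WORK_OF_ART_2.
    PLACEHOLDER = re.compile(r"\b[A-Z][A-Z_]*_\d+\b")

    def __init__(self):
        self.placeholder_to_entity = {}  # e.g. "PERSON_0" -> "Steve Jobs"
        self.entity_to_placeholder = {}  # reverse lookup so repeats reuse one placeholder
        self.counters = {}               # next free index per label

    def encode(self, text, entities):
        """Replace each (surface_text, label) pair with an indexed placeholder.

        `entities` is supplied by the caller here; a real system would get it from NER.
        Plain str.replace is used, so overlapping surface strings are not handled.
        """
        for surface, label in entities:
            if surface not in self.entity_to_placeholder:
                index = self.counters.get(label, 0)
                self.counters[label] = index + 1
                placeholder = f"{label}_{index}"
                self.entity_to_placeholder[surface] = placeholder
                self.placeholder_to_entity[placeholder] = surface
            text = text.replace(surface, self.entity_to_placeholder[surface])
        return text

    def decode(self, text):
        """Restore original entities; placeholders the codec never issued are left as-is."""
        return self.PLACEHOLDER.sub(
            lambda m: self.placeholder_to_entity.get(m.group(0), m.group(0)), text
        )


codec = ToyEntityCodec()
text = "Apple Inc. was founded by Steve Jobs and Steve Wozniak on April 1, 1976."
entities = [
    ("Apple Inc.", "ORG"),
    ("Steve Jobs", "PERSON"),
    ("Steve Wozniak", "PERSON"),
    ("April 1, 1976", "DATE"),
]
encoded = codec.encode(text, entities)
decoded = codec.decode(encoded)
print(encoded)  # ORG_0 was founded by PERSON_0 and PERSON_1 on DATE_0.
print(decoded)  # same as the original text
```

Because the mapping lives only in the codec instance, decoding requires access to that instance; anyone who sees only the encoded text gets placeholders, not the original values.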

Examples

Below are examples of how to use the E2D framework with different applications.

Example 1: Process of E2D

from sujet_ai import EntityEncoderDecoder

e2d = EntityEncoderDecoder(model="en_core_web_lg")  # For French, replace "en_core_web_lg" with "fr_core_news_lg"

text = "Apple Inc. was founded by Steve Jobs and Steve Wozniak on April 1, 1976."

# Encode the entities in the text using E2D
encoded_text = e2d.encode_entities(text)

print("The encoded text is: ", encoded_text)
# Output: "ORG_0 was founded by PERSON_0 and PERSON_1 on DATE_0."

# Decode the entities in the encoded text using E2D
decoded_text = e2d.decode_entities(encoded_text)

print("The decoded text is: ", decoded_text)
# Output: "Apple Inc. was founded by Steve Jobs and Steve Wozniak on April 1, 1976."

Example 2: Integration of E2D with LangChain

import os
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain
from langchain.llms import OpenAI
from sujet_ai import EntityEncoderDecoder

# Set the OpenAI API key
os.environ["OPENAI_API_KEY"] = "Insert Your API Key Here..."


# Define the question to be answered
question = (
    "What were the total operating expenses for Universal Holdings Ltd. in 2023 and how did they compare "
    "to the previous year?"
)

# We assume the context has already been retrieved.
# Open the example text file, which contains "fake" sensitive information, and read its contents as the context
with open("example.txt", "r") as file:
    context = file.read()

# Create an instance of the EntityEncoderDecoder class
e2d = EntityEncoderDecoder(model="en_core_web_lg")  # For French, replace "en_core_web_lg" with "fr_core_news_lg"

# Encode the entities in the context and question using the EntityEncoderDecoder instance
encoded_context = e2d.encode_entities(context)
encoded_question = e2d.encode_entities(question)

# Define the prompt template to be used by the language model
prompt = PromptTemplate(
    template="Answer the question using the given context.\nQuestion: {question}\nContext: {context}\nAnswer:",
    input_variables=["question", "context"],
)

# Create an instance of the OpenAI language model
llm = OpenAI()

# Create an instance of the LLMChain class using the language model and prompt template
chain = LLMChain(llm=llm, prompt=prompt)

# Use the LLMChain instance to generate a response to the encoded question and context
encoded_response = chain.run({"question": encoded_question, "context": encoded_context})

# Decode the entities in the encoded response using the EntityEncoderDecoder instance
decoded_response = e2d.decode_entities(encoded_response)

# For comparison only: generate a response to the original (unencoded) question and context.
# Note that this call sends the sensitive text to the model as-is.
response = chain.run({"question": question, "context": context})

# Print the decoded response and the original response
print("The E2D response is: ", decoded_response)
print("The regular response is: ", response)
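
Before an encoded prompt is sent to a hosted model, it may be worth checking that no known sensitive string survived encoding, and that the model's reply does not contain placeholders that were never issued. The sketch below is one possible way to do this with the standard library only. The helper names find_leaks and unknown_placeholders are hypothetical and are not part of sujet_ai, and the LABEL_index placeholder shape is assumed from the output shown in Example 1.

```python
import re

# Assumed placeholder shape (LABEL_index), based on the Example 1 output.
PLACEHOLDER = re.compile(r"\b[A-Z][A-Z_]*_\d+\b")


def find_leaks(encoded_text, sensitive_terms):
    """Return the sensitive terms that still appear verbatim (case-insensitive)."""
    lowered = encoded_text.lower()
    return [term for term in sensitive_terms if term.lower() in lowered]


def unknown_placeholders(response, known_placeholders):
    """Return placeholders found in a response that the encoder never issued."""
    return sorted(set(PLACEHOLDER.findall(response)) - set(known_placeholders))


sensitive = ["Universal Holdings Ltd.", "2023"]

clean_prompt = "What were the total operating expenses for ORG_0 in DATE_0?"
leaky_prompt = "What did Universal Holdings Ltd. spend in DATE_0?"

print(find_leaks(clean_prompt, sensitive))  # []
print(find_leaks(leaky_prompt, sensitive))  # ['Universal Holdings Ltd.']
print(unknown_placeholders("ORG_0 spent more than ORG_3 in DATE_0.", ["ORG_0", "DATE_0"]))  # ['ORG_3']
```

A non-empty result from either helper suggests the prompt or the response should be inspected before it is used, since NER models can miss entities and language models can invent placeholder-like tokens.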

Bibliography

To cite the E2D framework, please use the following BibTeX reference:

@software{e2d2024,
  title={Entity Encoder Decoder: A Privacy-Enhanced Framework for LLMs and RAG Agents},
  author={{Sujet AI} and Rahimi, Hamed},
  year={2024},
  url={https://github.com/sujet-ai/E2D-Privacy-Enhanced-RAG}
}
