A Python Machine Learning Security Toolbox for Adversarial Attacks.
Project description
Plexiglass
Wondering if your AI model is safe enough to use? Plexiglass is your sparring partner to bolster your model's defenses!
What is Plexiglass?
Plexiglass is a Python toolbox which supports pentesting against adversarial attacks in machine learning. It has two modules: LLMs and DNNs. For LLMs, plexiglass uses litellm under the hood.
We are working tirelessly to include more frameworks and attack/ defense mechanisms for testing. Please read our docs for the latest updates.
[!IMPORTANT] We are looking for contributors! Fork the repo to get started. Contribution guide is coming soon.
[!NOTE] Plexiglass is open-source: Please leave a star to support the project! ⭐
What is Adversarial Machine Learning?
Adversarial machine learning involves manipulating input data to deceive machine learning models. In deep neural networks (DNNs) and large language models (LLMs), attacks include adding subtly modified inputs that cause incorrect model predictions or responses. These attacks exploit model vulnerabilities, testing their robustness and security.
Installation
The first experimental release is version 0.0.1
.
To download the package from PyPi:
pip install --upgrade plexiglass
Getting Started
LLM Module: Simple Usage
We support a variety of LLMs using litellm
. Alternatively, you can also test your own huggingface
models.
from plexiglass.LLM.evaluate import evaluate
from plexiglass.model import Model
import os
os.environ["OPENAI_API_KEY"] = "YOUR_OPENAI_API_KEY"
model = Model("openai", "gpt-3.5-turbo")
evaluate(model, metrics=["toxicity"], attacks=["prompt_injection", "gcg"])
DNN Module: Simple Usage
Here is an example on how to test a model's robustness on FGSM attacks:
from plexiglass.DNN.evaluate import evaluate
from plexiglass.model import Model
import os
# load your pytorch model here
model = Model("torch", your-model)
evaluate(model, metrics=["accuracy"], attacks=["fgsm"])
Feature Request
To request new features, please submit an issue
Local Development
To get started
make develop
this will clean, build, and install the package locally for development purpose.
Contributors
Made with contrib.rocks.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Hashes for plexiglass-0.0.1-py3-none-any.whl
Algorithm | Hash digest | |
---|---|---|
SHA256 | f8b709b49cd808f497a0ce3f169f4f73b782a64f0d5ce398de87a2755b9b61a6 |
|
MD5 | b5a3b66f5b19f6b9cf19810b6d1d76e2 |
|
BLAKE2b-256 | b3dcdb772e022409f01056f3c86a48faf2817aed8869de50f1f493b2993c4ca4 |