Skip to main content

A Python Machine Learning Security Toolbox for Adversarial Attacks.

Project description


Plexiglass

Wondering if your AI model is safe enough to use? Plexiglass is your sparring partner to bolster your model's defenses!

PyPI version license MIT

What is Plexiglass?

Plexiglass is a Python toolbox which supports pentesting against adversarial attacks in machine learning. It has two modules: LLMs and DNNs. For LLMs, plexiglass uses litellm under the hood.

We are working tirelessly to include more frameworks and attack/ defense mechanisms for testing. Please read our docs for the latest updates.

[!IMPORTANT] We are looking for contributors! Fork the repo to get started. Contribution guide is coming soon.

[!NOTE] Plexiglass is open-source: Please leave a star to support the project! ⭐

What is Adversarial Machine Learning?

Adversarial machine learning involves manipulating input data to deceive machine learning models. In deep neural networks (DNNs) and large language models (LLMs), attacks include adding subtly modified inputs that cause incorrect model predictions or responses. These attacks exploit model vulnerabilities, testing their robustness and security.

Installation

The first experimental release is version 0.0.1.

To download the package from PyPi:

pip install --upgrade plexiglass

Getting Started

LLM Module: Simple Usage

We support a variety of LLMs using litellm. Alternatively, you can also test your own huggingface models.

from plexiglass.LLM.evaluate import evaluate
from plexiglass.model import Model

import os

os.environ["OPENAI_API_KEY"] = "YOUR_OPENAI_API_KEY"
model = Model("openai", "gpt-3.5-turbo")

evaluate(model, metrics=["toxicity"], attacks=["prompt_injection", "gcg"])

DNN Module: Simple Usage

Here is an example on how to test a model's robustness on FGSM attacks:

from plexiglass.DNN.evaluate import evaluate
from plexiglass.model import Model

import os

# load your pytorch model here

model = Model("torch", your-model)

evaluate(model, metrics=["accuracy"], attacks=["fgsm"])

Feature Request

To request new features, please submit an issue

Local Development

To get started

make develop

this will clean, build, and install the package locally for development purpose.

Contributors

Made with contrib.rocks.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

plexiglass-0.0.1.tar.gz (17.2 kB view details)

Uploaded Source

Built Distribution

plexiglass-0.0.1-py3-none-any.whl (17.3 kB view details)

Uploaded Python 3

File details

Details for the file plexiglass-0.0.1.tar.gz.

File metadata

  • Download URL: plexiglass-0.0.1.tar.gz
  • Upload date:
  • Size: 17.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.6.1 requests/2.28.2 setuptools/57.4.0 requests-toolbelt/0.9.1 tqdm/4.48.2 CPython/3.8.12

File hashes

Hashes for plexiglass-0.0.1.tar.gz
Algorithm Hash digest
SHA256 2df0d66f6ea2f41be693142ad5986eb6580c9d5e41485b195968898a462ece55
MD5 c3fc4877138d5980a2f709748b1d3be1
BLAKE2b-256 7ecb0ceb94f92fba8ef44f24c089b09553de48d4bddf192eb7bb496bd44551f0

See more details on using hashes here.

File details

Details for the file plexiglass-0.0.1-py3-none-any.whl.

File metadata

  • Download URL: plexiglass-0.0.1-py3-none-any.whl
  • Upload date:
  • Size: 17.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.6.1 requests/2.28.2 setuptools/57.4.0 requests-toolbelt/0.9.1 tqdm/4.48.2 CPython/3.8.12

File hashes

Hashes for plexiglass-0.0.1-py3-none-any.whl
Algorithm Hash digest
SHA256 f8b709b49cd808f497a0ce3f169f4f73b782a64f0d5ce398de87a2755b9b61a6
MD5 b5a3b66f5b19f6b9cf19810b6d1d76e2
BLAKE2b-256 b3dcdb772e022409f01056f3c86a48faf2817aed8869de50f1f493b2993c4ca4

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page