Skip to main content

A system to protect AI from misuse and harmful inputs.

Project description

AI Guard (Beta)

AI Guard is designed to protect AI Systems from misuse, manipulation, and harmful inputs. It ensures that the AI behaves as intended and does not deviate from its guidelines or generate harmful content.

Features

  • Input Sanitization
  • Prompt Classification
  • Manipulation Detection
  • Ethical Compliance Check
  • Response Validation
  • Logging and Monitoring
  • Customizable Settings
  • Interactive Testing Mode (Uses Ollama)
  • Fallback Mechanisms
  • Scalability and Extensability

How It Works

AI Guard protects AI Systems from misuse and harmful inputs by sanitizing user prompts to remove noise, classifying them as "non-toxic" (safe) or "toxic" (malicious) using a machine learning model and detecting manipulation attempts through predefined rules. It ensures ethical compliance by blocking prompts that request harmful or illegal actions and validates AI responses to prevent inappropriate content. The system logs all activities for auditing, operates interactively for testing, and includes a fallback mechanism to handle ambiguous cases by defaulting to "non-toxic". You can customize this however you'd like via the settings.yml file it supports features like input sanitization, prompt classification, and response validation making it ideal for chatbots, ai assistants, content moderation and research!

Settings and Manipulation Rules

If you find any form of manipulation rule that can manipulate the AI assistant in any form of way you can add that said prompt to the manipulation_rules.txt file and you can open an issue stating a bypass you found. There are currently 583 different manipulation rules as of the first beta release and we're always looking to add more to this list that can really keep all AI safe from unrightful prompt engineering.

If you wish to tinker around with it and enable or disable whatever you dont like with in the whole system, you can head to the settings.yml file and enable or disable things. You can even go ahead and change the logs file to a whole different file if you wish.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

aiguard-0.1.1.tar.gz (4.4 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

AIGuard-0.1.1-py3-none-any.whl (3.2 kB view details)

Uploaded Python 3

File details

Details for the file aiguard-0.1.1.tar.gz.

File metadata

  • Download URL: aiguard-0.1.1.tar.gz
  • Upload date:
  • Size: 4.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.0.1 CPython/3.12.0

File hashes

Hashes for aiguard-0.1.1.tar.gz
Algorithm Hash digest
SHA256 18481df55b76700165e9d999c79d9bb7a2f1b3b29c48f47ff056d091c1612ff5
MD5 1cd18b2151a6c19c6c47a08f239b1cfd
BLAKE2b-256 0fb6a3f2e38947dd7e466740990a49eb82c51da96228134d6b3941f12c5ec709

See more details on using hashes here.

File details

Details for the file AIGuard-0.1.1-py3-none-any.whl.

File metadata

  • Download URL: AIGuard-0.1.1-py3-none-any.whl
  • Upload date:
  • Size: 3.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.0.1 CPython/3.12.0

File hashes

Hashes for AIGuard-0.1.1-py3-none-any.whl
Algorithm Hash digest
SHA256 56f54f8f092776842595decec12ffc6191199c5c9813a377d457a956b753d9a5
MD5 c4cff3ec1fdd2533b065ec558d18a6c0
BLAKE2b-256 41a30d3f34223c192677cd8effc3bb56297349f25adf29fb4c5667a17b90992a

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page