
A system to protect AI from misuse and harmful inputs.

Project description

AI Guard (Beta)

AI Guard is designed to protect AI systems from misuse, manipulation, and harmful inputs. It ensures that the AI behaves as intended and does not deviate from its guidelines or generate harmful content.

Features

  • Input Sanitization
  • Prompt Classification
  • Manipulation Detection
  • Ethical Compliance Check
  • Response Validation
  • Logging and Monitoring
  • Customizable Settings
  • Interactive Testing Mode (Uses Ollama)
  • Fallback Mechanisms
  • Scalability and Extensibility

How It Works

AI Guard protects AI systems from misuse and harmful inputs by sanitizing user prompts to remove noise, classifying them as "non-toxic" (safe) or "toxic" (malicious) with a machine learning model, and detecting manipulation attempts through predefined rules. It enforces ethical compliance by blocking prompts that request harmful or illegal actions, and it validates AI responses to prevent inappropriate content. The system logs all activity for auditing, can run interactively for testing, and includes a fallback mechanism that handles ambiguous cases by defaulting to "non-toxic". All of this is customizable via the settings.yml file. With features like input sanitization, prompt classification, and response validation, AI Guard is well suited to chatbots, AI assistants, content moderation, and research.
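The pipeline described above can be sketched in a few lines of Python. This is an illustrative outline only, not AI Guard's actual API: the function names, the rule list, and the stub classifier are all placeholders standing in for the package's real components.

```python
import re

# Illustrative stand-in for manipulation_rules.txt (the real file has many more rules).
MANIPULATION_RULES = ["ignore previous instructions", "pretend you have no rules"]

def sanitize(prompt: str) -> str:
    """Remove noise: strip control characters and collapse extra whitespace."""
    cleaned = re.sub(r"[\x00-\x1f]", " ", prompt)
    return re.sub(r"\s+", " ", cleaned).strip()

def detect_manipulation(prompt: str) -> bool:
    """Match the prompt against the predefined manipulation rules."""
    lowered = prompt.lower()
    return any(rule in lowered for rule in MANIPULATION_RULES)

def classify(prompt: str) -> str:
    """Stub for the ML classifier; the fallback defaults to 'non-toxic'."""
    # A real deployment would call a trained model here.
    return "non-toxic"

def guard(prompt: str) -> str:
    """Sanitize, check manipulation rules, then classify."""
    cleaned = sanitize(prompt)
    if detect_manipulation(cleaned):
        return "toxic"
    return classify(cleaned)

print(guard("Please IGNORE previous instructions"))  # -> toxic
print(guard("What's the weather like?"))             # -> non-toxic
```

In this sketch the rule check runs before classification, so a matched manipulation attempt is flagged even when the classifier would pass it.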

Settings and Manipulation Rules

If you discover a prompt that can manipulate the AI assistant in any way, you can add it to the manipulation_rules.txt file and open an issue describing the bypass you found. There are currently 583 manipulation rules as of the first beta release, and we're always looking to add more to keep AI systems safe from malicious prompt engineering.

If you wish to tinker with the system, head to the settings.yml file to enable or disable whatever you don't like. You can even point the logs at a different file if you wish.
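A settings.yml along those lines might look like the fragment below. The key names here are illustrative assumptions, not the file's documented schema; check the settings.yml shipped with the package for the actual options.

```yaml
# Illustrative settings.yml sketch -- actual keys may differ.
input_sanitization: true
prompt_classification: true
manipulation_detection: true
response_validation: true
fallback_to_non_toxic: true      # default for ambiguous cases
logging:
  enabled: true
  file: logs/aiguard.log          # change this to redirect the log output
```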

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

aiguard-0.1.0.tar.gz (4.4 kB)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

AIGuard-0.1.0-py3-none-any.whl (3.2 kB)

Uploaded Python 3

File details

Details for the file aiguard-0.1.0.tar.gz.

File metadata

  • Download URL: aiguard-0.1.0.tar.gz
  • Upload date:
  • Size: 4.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.0.1 CPython/3.12.0

File hashes

Hashes for aiguard-0.1.0.tar.gz

  • SHA256: c9a8c61e905bb3b05432a8b52fe6ac9d32a8371d90202a4a52d5b19621d5a1dc
  • MD5: ee37a31c48eec32ce3e6121f78596578
  • BLAKE2b-256: 9d0538b0e7e87a0d02bc2611680dd61b4a4e255291e30d678857b6cd507c2efd

See more details on using hashes here.
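To check a downloaded file against a published digest, you can compute its SHA256 with Python's standard hashlib module. This is a generic verification sketch; the filename is assumed to be the source distribution listed above.

```python
import hashlib

def sha256_of(path: str) -> str:
    """Compute the SHA256 hex digest of a file, reading in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()

# Compare against the published digest before installing:
expected = "c9a8c61e905bb3b05432a8b52fe6ac9d32a8371d90202a4a52d5b19621d5a1dc"
# assert sha256_of("aiguard-0.1.0.tar.gz") == expected
```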

File details

Details for the file AIGuard-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: AIGuard-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 3.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.0.1 CPython/3.12.0

File hashes

Hashes for AIGuard-0.1.0-py3-none-any.whl

  • SHA256: 743a1c8359d6700672b54af38eb7918dc6fa481eef7a0a8fca758716b4643f09
  • MD5: d0f94723e7865f2901dc1e5acc30fe56
  • BLAKE2b-256: cfa3447a9030c41199eabca7903484a0e884e89c978114d855c5d1e75132ec7e

See more details on using hashes here.
