A system to protect AI from misuse and harmful inputs.
Project description
AI Guard (Beta)
AI Guard is designed to protect AI Systems from misuse, manipulation, and harmful inputs. It ensures that the AI behaves as intended and does not deviate from its guidelines or generate harmful content.
Features
- Input Sanitization
- Prompt Classification
- Manipulation Detection
- Ethical Compliance Check
- Response Validation
- Logging and Monitoring
- Customizable Settings
- Interactive Testing Mode (Uses Ollama)
- Fallback Mechanisms
- Scalability and Extensability
How It Works
AI Guard protects AI Systems from misuse and harmful inputs by sanitizing user prompts to remove noise, classifying them as "non-toxic" (safe) or "toxic" (malicious) using a machine learning model and detecting manipulation attempts through predefined rules. It ensures ethical compliance by blocking prompts that request harmful or illegal actions and validates AI responses to prevent inappropriate content. The system logs all activities for auditing, operates interactively for testing, and includes a fallback mechanism to handle ambiguous cases by defaulting to "non-toxic". You can customize this however you'd like via the settings.yml file it supports features like input sanitization, prompt classification, and response validation making it ideal for chatbots, ai assistants, content moderation and research!
Settings and Manipulation Rules
If you find any form of manipulation rule that can manipulate the AI assistant in any form of way you can add that said prompt to the manipulation_rules.txt file and you can open an issue stating a bypass you found. There are currently 583 different manipulation rules as of the first beta release and we're always looking to add more to this list that can really keep all AI safe from unrightful prompt engineering.
If you wish to tinker around with it and enable or disable whatever you dont like with in the whole system, you can head to the settings.yml file and enable or disable things. You can even go ahead and change the logs file to a whole different file if you wish.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file aiguard-0.1.0.tar.gz.
File metadata
- Download URL: aiguard-0.1.0.tar.gz
- Upload date:
- Size: 4.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.0.1 CPython/3.12.0
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
c9a8c61e905bb3b05432a8b52fe6ac9d32a8371d90202a4a52d5b19621d5a1dc
|
|
| MD5 |
ee37a31c48eec32ce3e6121f78596578
|
|
| BLAKE2b-256 |
9d0538b0e7e87a0d02bc2611680dd61b4a4e255291e30d678857b6cd507c2efd
|
File details
Details for the file AIGuard-0.1.0-py3-none-any.whl.
File metadata
- Download URL: AIGuard-0.1.0-py3-none-any.whl
- Upload date:
- Size: 3.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.0.1 CPython/3.12.0
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
743a1c8359d6700672b54af38eb7918dc6fa481eef7a0a8fca758716b4643f09
|
|
| MD5 |
d0f94723e7865f2901dc1e5acc30fe56
|
|
| BLAKE2b-256 |
cfa3447a9030c41199eabca7903484a0e884e89c978114d855c5d1e75132ec7e
|