Open-source screening software for enhancing safety in large language models
Project description
AI Safety
An open-source toolkit designed to integrate safety measures into AI systems. It supports generating and evaluating responses and checking them for violations against user-defined rules.
Getting Started
This code is provided as a PyPI package. To install it, run the following command:

```
pip install ai-safety
```
Or, if you want to install from source:

```
git clone https://github.com/our_name_here/ai-safety.git && cd ai-safety
poetry install --all-extras
```
Using the Package
Constitution Generation:
The package can generate a "constitution", a set of rules for AI behavior, either custom or predefined. The constitution can be further tailored to specific industries and AI applications. You can generate one using the AISafetyManager class. Here's an example:
```python
from ai_safety.ai_safety_manager import AISafetyManager

# Create an instance of AISafetyManager, supplying your own values
manager = AISafetyManager(
    api_key="your_api_key",
    industries=["industry1", "industry2"],
    ai_application="your_ai_application",
)

constitution = manager.generate_constitution()
```
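The return type of generate_constitution() isn't documented here; assuming it is a string (or at least printable), a quick way to inspect and persist the generated rules is:

```python
# Assumes the constitution is a string (or str()-able object); save it
# so the rules don't have to be regenerated on every run.
print(constitution)
with open("constitution.txt", "w") as f:
    f.write(str(constitution))
```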
Content Moderation:
The package can check whether a given text violates any content moderation policies defined by the user's rules. This can be used to ensure that the AI's output is safe and appropriate.
```python
from ai_safety.ai_safety_manager import AISafetyManager

manager = AISafetyManager(api_key="your_api_key")

# Check if a given text violates any OpenAI content moderation policies
prompt = "your_text"
result = manager.check_content_for_moderation(prompt)

# Screen whether the model output violates the given constitution
model_output = "your_model_output"
screen = manager.screen_output_for_violations(model_output, prompt)
```
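The shapes of result and screen aren't shown above; as a minimal sketch, assuming each call returns a dict with a boolean flag (the field names below are hypothetical), downstream code might branch like this:

```python
# Hypothetical return shapes: adapt the field names to the actual API.
if result.get("flagged"):
    print("Prompt was flagged by content moderation; refusing to proceed.")
elif screen.get("violation"):
    print("Model output violated the constitution; consider revising it.")
else:
    print("Output passed all checks.")
```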
Output Revision:
If a given output violates the constitution, the package can revise it based on an explanation of the violation and a recommendation for revision. This can be used to correct the AI's behavior and bring it into compliance with the rules. You can revise the output using the AISafetyManager class. Here's an example:
```python
from ai_safety.ai_safety_manager import AISafetyManager

manager = AISafetyManager(api_key="your_api_key")

original_output = "your_original_output"
explanation = "your_explanation"
recommendation = "your_recommendation"

revised_output = manager.revise_output(original_output, explanation, recommendation)
```
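Putting the pieces together, a minimal end-to-end flow might screen a model output and revise it when a violation is found. This is a sketch under two assumptions: the manager retains the generated constitution, and the screen result is a dict exposing a violation flag, an explanation, and a recommendation (all hypothetical field names):

```python
from ai_safety.ai_safety_manager import AISafetyManager

manager = AISafetyManager(
    api_key="your_api_key",
    industries=["industry1"],
    ai_application="your_ai_application",
)
manager.generate_constitution()

prompt = "your_text"
model_output = "your_model_output"

# Hypothetical result shape: adapt field names to the actual return type.
screen = manager.screen_output_for_violations(model_output, prompt)
if screen.get("violation"):
    model_output = manager.revise_output(
        model_output,
        screen.get("explanation", ""),
        screen.get("recommendation", ""),
    )

print(model_output)
```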
Resources
- Docs: coming soon
- Examples: end-to-end example tests
- Community (Discord/Slack): coming soon
- Reach out to the founders: email or schedule a chat (coming soon)
License
This project is licensed under the MIT License - see the LICENSE file for details.
Feedback
We welcome your feedback. Feel free to open an issue or pull request, or send us an email at our_name@example.com.
Project details
Release history
Download files
Download the file for your platform.
Source Distribution
Built Distribution
File details
Details for the file ai-safety-0.1.tar.gz.
File metadata
- Download URL: ai-safety-0.1.tar.gz
- Upload date:
- Size: 11.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.0.0 CPython/3.11.0
File hashes
Algorithm | Hash digest
---|---
SHA256 | 9fca707b3122c359413c50de31294c2a57ffae7b1f7291f18cfd650f5dcd3e94
MD5 | 75ad33099f95644e945d43a04f4968af
BLAKE2b-256 | 0e97dc1adbb8216c00cdb5f08384c97dbeab5fd87adc90b19ff688b23ce11064
File details
Details for the file ai_safety-0.1-py3-none-any.whl.
File metadata
- Download URL: ai_safety-0.1-py3-none-any.whl
- Upload date:
- Size: 12.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.0.0 CPython/3.11.0
File hashes
Algorithm | Hash digest
---|---
SHA256 | b20901114cba5186da58c34bae8c215c8e3e79e79dc15a4d8de03bae833d06e8
MD5 | 20e1e3e675d433c7d0b03015996cdea3
BLAKE2b-256 | 92869092c616cfc319deba2bcc8986e9478f3cd8207b58ad22ac941377e86516