
principle alignment package

Project description

Introduction

Principle Alignment is a Python library that helps you align your AI models with principles you define yourself. It uses a pre-trained language model to assess text inputs and detect violations of those principles. The package works with multiple model providers through an OpenAI-compatible client, including OpenAI and DeepSeek.

The library is designed for ease of use and integrates into existing workflows, so checking text against your principles takes only a few lines of code.

You can use the outcomes from the alignment process to improve your AI models, identify possible issues, and ensure compliance with your defined principles.

Installation

Install from PyPI

You can install the package from PyPI:

pip install principle-alignment -i https://pypi.org/simple

You can also upgrade an existing installation from PyPI:

pip install principle-alignment --upgrade -i https://pypi.org/simple

Install from source

You can also install the package directly from source:

pip install .

For development installation:

pip install -e .
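After any of the install commands above, it can be useful to confirm which version ended up in the environment. The following is a minimal sketch using only the standard library; the helper name `installed_version` is hypothetical and not part of the package.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name: str = "principle-alignment"):
    """Return the installed version string, or None if the distribution is absent.

    `installed_version` is an illustrative helper, not part of principle-alignment.
    """
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Prints the version if the package is installed, otherwise a short notice.
    print(installed_version() or "principle-alignment is not installed")
```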

Usage (Serving Version)

Create a .env file with your API configurations:

API_KEY=your_api_key
BASE_URL=your_base_url
MODEL=your_model_name

Create a principles.md file with the principles you want to align with (one per line):

1. Do no harm
2. Respect user privacy
3. Be transparent
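If you would rather read such a file yourself and pass the result as a list, the one-per-line format is simple to parse. The sketch below is an assumption about how one might do this, not the library's own loader: `parse_principles`, `read_principles`, and the `strip_numbering` flag are hypothetical names. Note that the server output shown in this README keeps the leading number ("2. Respect user privacy"), so the library itself appears to retain numbering.

```python
import re
from pathlib import Path

# Matches a leading "1." or "1)" style prefix, with optional surrounding spaces.
_NUMBER_PREFIX = re.compile(r"^\s*\d+[.)]\s*")


def parse_principles(text: str, strip_numbering: bool = True) -> list[str]:
    """Split text into one principle per non-empty line (hypothetical helper)."""
    principles = []
    for line in text.splitlines():
        if strip_numbering:
            line = _NUMBER_PREFIX.sub("", line)
        line = line.strip()
        if line:  # skip blank lines
            principles.append(line)
    return principles


def read_principles(path: str, strip_numbering: bool = True) -> list[str]:
    """Read a principles file from disk and parse it (hypothetical helper)."""
    return parse_principles(Path(path).read_text(encoding="utf-8"), strip_numbering)


sample = "1. Do no harm\n2. Respect user privacy\n\n3. Be transparent\n"
print(parse_principles(sample))
# ['Do no harm', 'Respect user privacy', 'Be transparent']
```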

Create a server.py file with the following content:

from principle_alignment.serving import start_server

start_server(
    host="127.0.0.1",
    port=8080,
    principles_path="./principles.md",  # Path to the pre-defined principles file
    env_file=".env",  # Path to the environment variables file
    verbose=True
)

Run the server:

python server.py

Test the server:

curl -X POST "http://localhost:8080/align" \
     -H "Content-Type: application/json" \
     -d '{"text": "we can collect user data without their consent"}'

Output:

{"is_violation":true,
"violated_principle":"2. Respect user privacy",
"explanation":"Collecting user data without their consent is a direct violation of user privacy. Users have the right to know what data is being collected and how it will be used. Failing to obtain consent undermines their autonomy and trust."}
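The same request can be sent from Python. The sketch below uses only the standard library and assumes the endpoint, request body, and response keys are exactly as in the curl example and output above; the function names (`build_align_request`, `parse_align_response`, `check_text`) are hypothetical and not part of the package.

```python
import json
from urllib import request

ALIGN_URL = "http://localhost:8080/align"  # matches the curl example above
EXPECTED_KEYS = {"is_violation", "violated_principle", "explanation"}


def build_align_request(text: str, url: str = ALIGN_URL) -> request.Request:
    """Build the POST request with a JSON body of the form {"text": ...}."""
    body = json.dumps({"text": text}).encode("utf-8")
    return request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_align_response(raw: bytes) -> dict:
    """Decode the JSON response and check that the documented keys are present."""
    result = json.loads(raw.decode("utf-8"))
    missing = EXPECTED_KEYS - set(result)
    if missing:
        raise ValueError(f"unexpected response, missing keys: {sorted(missing)}")
    return result


def check_text(text: str, url: str = ALIGN_URL, timeout: float = 30.0) -> dict:
    """Send the text to a running server and return the parsed result."""
    with request.urlopen(build_align_request(text, url), timeout=timeout) as resp:
        return parse_align_response(resp.read())
```

With the server running, `check_text("we can collect user data without their consent")` should return a dictionary with the three keys shown in the output above.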

Usage (Detail Version)

Prepare the client and model

import os
from dotenv import load_dotenv
from openai import OpenAI
import json

from principle_alignment import Alignment


load_dotenv() # Load environment variables from .env file

# OpenAI client
openai_client = OpenAI(
    api_key=os.environ.get("OPENAI_API_KEY"),
    base_url=os.environ.get("OPENAI_BASE_URL"),
)

openai_model = "gpt-4o-mini"

# DeepSeek client (uses the same OpenAI-compatible SDK)
deepseek_client = OpenAI(
    api_key=os.environ.get("DEEPSEEK_API_KEY"),
    base_url=os.environ.get("DEEPSEEK_BASE_URL"),
)

deepseek_model = "deepseek-chat"

client = openai_client
model = openai_model

# client = deepseek_client
# model = deepseek_model

Initialize the alignment object

alignment = Alignment(client=client, model=model, verbose=False)

Let the alignment object load and understand the principles

# Option 1: load principles from a list
alignment.prepare(principles=["Do no harm", "Respect user privacy"])

# Option 2: load principles from a text file (one principle per line)
# alignment.prepare(principles_file="principles.md")

# Option 3: temporarily override the client and model for this call.
# prepare() only runs once, so a more powerful model can be used to understand the principles.
# other_client and other_model are placeholders for a client and model name you define yourself.
# alignment.prepare(principles=["Do no harm", "Respect user privacy"], client=other_client, model=other_model)

Run the alignment check

user_input = "Tom is not allowed to join this club because he is not a member."
result = alignment.align(user_input)
print(json.dumps(result, indent=4))

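Example output (evidently from a run with a different principles file, one that includes the Burning Man "Radical Inclusion" principle):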

{
    "is_violation": true,
    "violated_principle": "1. [Radical Inclusion] Anyone may be a part of Burning Man. We welcome and respect the stranger. No prerequisites exist for participation in our community.",
    "explanation": "The statement indicates that Tom is being excluded from joining the club based on his membership status, which contradicts the principle of Radical Inclusion. This principle emphasizes that anyone should be able to participate in the community without any prerequisites or restrictions."
}

user_input = "You are so nice to me."
result = alignment.align(user_input)
print(json.dumps(result, indent=4))

Example output

{
    "is_violation": false,
    "violated_principle": null,
    "explanation": null
}
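Because each result is a plain dictionary with the keys `is_violation`, `violated_principle`, and `explanation`, it is straightforward to screen a batch of inputs and keep only the violations. The following is a minimal sketch under that assumption; `collect_violations` is a hypothetical helper, and `fake_align` is a stand-in used here so the example runs without an API key.

```python
from typing import Callable, Iterable


def collect_violations(align: Callable[[str], dict], texts: Iterable[str]) -> list[dict]:
    """Run `align` on each text and return a record for every reported violation."""
    violations = []
    for text in texts:
        result = align(text)
        if result.get("is_violation"):
            violations.append(
                {
                    "text": text,
                    "principle": result.get("violated_principle"),
                    "explanation": result.get("explanation"),
                }
            )
    return violations


def fake_align(text: str) -> dict:
    """Stand-in for alignment.align: flags any text that mentions 'without consent'."""
    if "without consent" in text:
        return {
            "is_violation": True,
            "violated_principle": "Respect user privacy",
            "explanation": "stub explanation",
        }
    return {"is_violation": False, "violated_principle": None, "explanation": None}


texts = ["You are so nice to me.", "Collect data without consent."]
for record in collect_violations(fake_align, texts):
    print(record["text"], "->", record["principle"])
# Collect data without consent. -> Respect user privacy
```

In real use, pass `alignment.align` in place of `fake_align`.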

Package Upload

First time upload

pip install build twine
python -m build
twine upload dist/*

Subsequent uploads

rm -rf dist/ build/ *.egg-info/
python -m build
twine upload dist/*

