RAIL Score Python SDK
Official Python client library for the RAIL Score API — evaluate AI-generated content across 8 dimensions of Responsible AI: fairness, safety, reliability, transparency, privacy, accountability, inclusivity, and user impact.
Features
- Sync & Async Clients — RailScoreClient (requests-based) and AsyncRAILClient (httpx-based, with built-in caching)
- Evaluation — Score content in basic (fast) or deep (detailed, with explanations, issues, suggestions) mode
- Safe Regeneration — Automatically iterate until content meets your quality threshold, server-side or with your own LLM
- Compliance Checking — Evaluate against GDPR, CCPA, HIPAA, EU AI Act, India DPDP, India AI Governance
- Policy Engine — log_only, block, regenerate, or custom callback when scores fall below threshold
- Multi-Turn Sessions — Conversation-aware evaluation with per-turn history and adaptive quality gating
- Middleware — Wrap any async LLM function with transparent RAIL evaluation and policy enforcement (see the sketch at the end of the Policy Engine section)
- LLM Provider Wrappers — Drop-in wrappers for OpenAI, Anthropic, and Google Gemini
- OpenTelemetry Observability — Vendor-neutral tracing, metrics, and structured logs with per-project scoping
- Compliance Incident Handling — Tracked incidents and per-dimension human review queues
- Observability Integrations — Langfuse v3 and LiteLLM guardrail support
- Type-Safe — Full type hints and typed response models throughout
Installation
pip install rail-score-sdk
With optional extras:
pip install "rail-score-sdk[openai]" # OpenAI wrapper
pip install "rail-score-sdk[anthropic]" # Anthropic wrapper
pip install "rail-score-sdk[google]" # Google Gemini wrapper
pip install "rail-score-sdk[telemetry]" # OpenTelemetry observability
pip install "rail-score-sdk[langfuse]" # Langfuse v3 integration
pip install "rail-score-sdk[litellm]" # LiteLLM guardrail
pip install "rail-score-sdk[integrations]" # All of the above
Quick Start
from rail_score_sdk import RailScoreClient
client = RailScoreClient(api_key="your-api-key")
result = client.eval(
    content="AI should prioritize human welfare and be transparent.",
    mode="basic",
)

print(f"RAIL Score: {result.rail_score.score}/10")
print(f"Summary: {result.rail_score.summary}")

for dim, ds in result.dimension_scores.items():
    print(f"  {dim}: {ds.score}/10")
Async client:
import asyncio
from rail_score_sdk import AsyncRAILClient
async def main():
    async with AsyncRAILClient(api_key="your-api-key") as client:
        result = await client.eval("Your content here", mode="basic")
        print(f"Score: {result['rail_score']['score']}/10")

asyncio.run(main())
Evaluation
Score content across all 8 RAIL dimensions.
# Deep mode — per-dimension explanations, issues, suggestions
result = client.eval(
    content="Your content here",
    mode="deep",
    domain="healthcare",  # general · healthcare · finance · legal · education · code
    include_explanations=True,
    include_issues=True,
    include_suggestions=True,
)

for dim, ds in result.dimension_scores.items():
    print(f"  {dim}: {ds.score}/10 — {ds.explanation}")
# Custom dimension weights (must sum to 100)
result = client.eval(
    content="Your content here",
    weights={
        "safety": 30, "reliability": 20, "privacy": 15,
        "fairness": 10, "transparency": 10, "accountability": 5,
        "inclusivity": 5, "user_impact": 5,
    },
)
Safe Regeneration
Evaluate and iteratively improve content until it meets your threshold.
# Server-side (RAIL_Safe_LLM handles the loop)
result = client.safe_regenerate(
    content="Content to improve",
    regeneration_model="RAIL_Safe_LLM",
    max_regenerations=3,
    thresholds={"overall": {"score": 7.0}},
)
print(result.best_content)
# External mode (regenerate with your own LLM)
result = client.safe_regenerate(content="...", regeneration_model="external")
if result.status == "awaiting_regeneration":
    improved = my_llm(result.rail_prompt.system_prompt, result.rail_prompt.user_prompt)
    result = client.safe_regenerate_continue(
        session_id=result.session_id, regenerated_content=improved
    )
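To drive several external rounds, loop the continue call until the session stops asking for regeneration. A minimal sketch using only the fields shown above; it assumes max_regenerations also caps external sessions, and my_llm stands in for your own generation function:

# Sketch: loop external regeneration until the session is satisfied or the budget runs out
result = client.safe_regenerate(content="...", regeneration_model="external", max_regenerations=3)
while result.status == "awaiting_regeneration":
    improved = my_llm(result.rail_prompt.system_prompt, result.rail_prompt.user_prompt)
    result = client.safe_regenerate_continue(
        session_id=result.session_id, regenerated_content=improved
    )
print(result.best_content)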
Compliance Checking
Supported frameworks: gdpr · ccpa · hipaa · eu_ai_act · india_dpdp · india_ai_gov
# Single framework
result = client.compliance_check(
    content="Our AI processes user health records...",
    framework="gdpr",
    context={"domain": "healthcare", "data_types": ["health_records"]},
)
print(f"Score: {result.compliance_score.score}/10 ({result.compliance_score.label})")
print(f"Passed: {result.requirements_passed}/{result.requirements_checked}")
# Multi-framework (up to 5 at once)
result = client.compliance_check(content="...", frameworks=["gdpr", "ccpa", "hipaa"])
print(f"Average: {result.cross_framework_summary.average_score}/10")
Policy Engine
Control what happens when a response scores below your threshold.
from rail_score_sdk import AsyncRAILClient, PolicyEngine, Policy, RAILBlockedError
async with AsyncRAILClient(api_key="your-api-key") as client:
    eval_response = await client.eval(content="Some content", mode="basic")

    # BLOCK — raises RAILBlockedError if score < threshold
    engine = PolicyEngine(policy=Policy.BLOCK, threshold=7.0)
    try:
        result = await engine.enforce("Some content", eval_response, client)
    except RAILBlockedError as e:
        print(f"Blocked — score={e.score}, threshold={e.threshold}")

    # REGENERATE — auto-improves content
    engine = PolicyEngine(policy=Policy.REGENERATE, threshold=7.0)
    result = await engine.enforce("Some content", eval_response, client)
    if result.was_regenerated:
        print(f"Improved: {result.content}")
LLM Provider Wrappers
Drop-in wrappers that automatically evaluate every LLM response via RAIL Score.
from rail_score_sdk.integrations import RAILOpenAI, RAILAnthropic, RAILGemini
# OpenAI
client = RAILOpenAI(
    openai_api_key="sk-...",
    rail_api_key="your-rail-api-key",
    rail_threshold=7.0,
    rail_policy="regenerate",
)

response = await client.chat_completion(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Explain quantum computing."}],
)
print(f"Score: {response.rail_score}/10 Regenerated: {response.was_regenerated}")
# Anthropic
client = RAILAnthropic(anthropic_api_key="sk-ant-...", rail_api_key="...", rail_threshold=7.0)
response = await client.message(model="claude-sonnet-4-5-20250929", max_tokens=1024, messages=[...])
# Google Gemini
client = RAILGemini(gemini_api_key="AIza...", rail_api_key="...", rail_threshold=7.0)
response = await client.generate(model="gemini-2.5-flash", contents="...")
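If a wrapper is configured with rail_policy="block" (an assumption here: that the string policies mirror the PolicyEngine policies above), a low-scoring response should surface as a RAILBlockedError you can catch:

from rail_score_sdk import RAILBlockedError

client = RAILOpenAI(
    openai_api_key="sk-...",
    rail_api_key="your-rail-api-key",
    rail_threshold=7.0,
    rail_policy="block",  # assumption: mirrors Policy.BLOCK
)
try:
    response = await client.chat_completion(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Explain quantum computing."}],
    )
except RAILBlockedError as e:
    print(f"Blocked — score={e.score}, threshold={e.threshold}")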
OpenTelemetry Observability
pip install "rail-score-sdk[telemetry]"
Every API call is automatically traced, metered, and logged once you pass a RAILTelemetry instance to the client.
from rail_score_sdk import RailScoreClient
from rail_score_sdk.telemetry import RAILTelemetry, ComplianceLogger, IncidentLogger, HumanReviewQueue
# Configure telemetry (console for dev, OTLP for production)
telemetry = RAILTelemetry(
    org_id="acme-corp",
    project_id="customer-chatbot",
    environment="production",
    exporter="otlp",
    endpoint="localhost:4317",
)
# Every call auto-emits spans (rail.score, rail.project_id), metrics, and error logs
client = RailScoreClient(api_key="rail_xxx", telemetry=telemetry)
# Multiple projects — each instance is fully isolated
telemetry_b = RAILTelemetry(org_id="acme-corp", project_id="search-api", ...)
client_b = RailScoreClient(api_key="rail_xxx", telemetry=telemetry_b)
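For local development, the console exporter mentioned above avoids running a collector. A sketch, assuming exporter="console" needs no endpoint:

telemetry_dev = RAILTelemetry(
    org_id="acme-corp",
    project_id="customer-chatbot",
    environment="dev",
    exporter="console",  # print spans, metrics, and logs locally
)
client_dev = RailScoreClient(api_key="rail_xxx", telemetry=telemetry_dev)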
Automatically emitted per request:
- Span: RAIL POST /railscore/v1/eval with rail.score, rail.confidence, rail.project_id, rail.org_id
- Counters: rail.requests, rail.errors, rail.credits.consumed
- Histograms: rail.request.duration, rail.score.distribution
ComplianceLogger
comp_logger = ComplianceLogger(telemetry)
result = client.compliance_check(content="...", framework="gdpr")
comp_logger.log_compliance_result(result) # INFO summary + WARNING/ERROR per issue
IncidentLogger
incident_logger = IncidentLogger(telemetry)
# Auto-raise from a compliance result
incident_id = incident_logger.log_compliance_incident(gdpr_result, threshold=6.0)
# Score-breach incident with unique ID for external ticketing
incident_id = incident_logger.log_score_breach(score=1.8, threshold=4.0)
HumanReviewQueue
Flag any dimension scoring below a threshold (default 2.0) for human review. Items emit OTEL logs immediately and can be drained for forwarding to Jira, PagerDuty, Slack, etc.
review_queue = HumanReviewQueue(telemetry, threshold=2.0)
# Check all 8 dimensions — enqueues anything below threshold
result = client.eval(content=text, mode="deep")
flagged = review_queue.check_and_enqueue(result, link_incident=True)
# Drain for external handling
for item in review_queue.drain():
    print(f"[{item.item_id}] {item.dimension}: {item.score:.1f}")
    my_ticketing_system.create(item)
RAIL Dimensions
| Dimension | What it measures |
|---|---|
| Fairness | Equitable treatment across groups — no bias or stereotyping |
| Safety | Prevention of harmful, toxic, or unsafe content |
| Reliability | Factual accuracy, consistency, calibrated uncertainty |
| Transparency | Clear reasoning, honest limitations, no deceptive framing |
| Privacy | Protection of personal data and data minimization |
| Accountability | Traceable reasoning, explicit assumptions, error signals |
| Inclusivity | Accessible, inclusive, culturally aware language |
| User Impact | Positive value at the right detail level and tone |
Score labels: Critical (0–2.9) · Poor (3–4.9) · Needs improvement (5–6.9) · Good (7–8.9) · Excellent (9–10)
Scores below 2.0 on any single dimension are considered concerning and should be flagged for human review.
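If you need the same banding in your own code, a small illustrative helper (not part of the SDK) reproducing the boundaries above:

def score_label(score: float) -> str:
    # Map a 0–10 RAIL score to its label band
    if score < 3.0:
        return "Critical"
    if score < 5.0:
        return "Poor"
    if score < 7.0:
        return "Needs improvement"
    if score < 9.0:
        return "Good"
    return "Excellent"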
Error Handling
from rail_score_sdk.exceptions import (
    RailScoreError,            # base — all exceptions inherit from this
    AuthenticationError,       # 401
    InsufficientCreditsError,  # 402 — e.balance, e.required
    ValidationError,           # 400
    ContentTooHarmfulError,    # 422
    RateLimitError,            # 429
    EvaluationFailedError,     # 500 — safe to retry
    ServiceUnavailableError,   # 503
    RAILBlockedError,          # policy=BLOCK triggered — e.score, e.threshold
)
try:
    result = client.eval(content="...")
except AuthenticationError:
    print("Check your API key")
except InsufficientCreditsError as e:
    print(f"Need {e.required} credits, have {e.balance}")
except RailScoreError as e:
    print(f"API error ({e.status_code}): {e.message}")
Links
- Documentation: https://responsibleailabs.ai/developer/quickstart
- API Reference: https://responsibleailabs.ai/developer/api-reference
- GitHub: https://github.com/Responsible-AI-Labs/rail-score-sdk
- Issues: https://github.com/Responsible-AI-Labs/rail-score-sdk/issues
- Support: research@responsibleailabs.ai
Download files
File details
Details for the file rail_score_sdk-2.3.0.tar.gz.
File metadata
- Download URL: rail_score_sdk-2.3.0.tar.gz
- Size: 72.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 971919cabd3f05c11d3046404e6e28cde95b02589af5e50336442fd91ae2b74b |
| MD5 | 123eeb0b3a720e5c9af9c3e965e9ccf6 |
| BLAKE2b-256 | c481986449486989d765c9cec5ecd57a98e145b80ed54d4f089507b4dc5761e4 |
File details
Details for the file rail_score_sdk-2.3.0-py3-none-any.whl.
File metadata
- Download URL: rail_score_sdk-2.3.0-py3-none-any.whl
- Size: 54.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | d516d07c5f80717750880191901a3a3e36c9ae4eebe38b976374e0f5b72441a5 |
| MD5 | 614b11a5230109d004741a9eb2b0fa99 |
| BLAKE2b-256 | 20bbdbf1dd1cd802ec91a68337e3c32d666c69d80257992e015a7f00529d67da |