rotalabs-audit

Reasoning chain capture and decision transparency for AI systems.

Features

  • Reasoning Chain Parsing: Parse natural language reasoning into structured chains
  • Reasoning Classification: Classify reasoning types (goal, decision, meta-reasoning, etc.)
  • Evaluation Awareness Detection: Detect when an AI system shows awareness of being evaluated
  • Quality Assessment: Score reasoning quality with metrics such as clarity and completeness, plus an overall score
  • Counterfactual Analysis: Understand causal factors in AI decision-making
  • Decision Tracing: Capture and analyze decision paths for transparency
  • Integration with rotalabs-comply: Connect reasoning audits with compliance reporting

Installation

pip install rotalabs-audit

With rotalabs-comply integration:

pip install "rotalabs-audit[comply]"

Quick Start

Parse Reasoning Chains

from rotalabs_audit import ExtendedReasoningParser

parser = ExtendedReasoningParser()
chain = parser.parse("""
    1. First, I need to understand the problem
    2. The data shows a clear pattern
    3. Therefore, I conclude that X is true
""")

print(f"Steps: {len(chain.steps)}")
for step in chain.steps:
    print(f"  {step.index}: {step.reasoning_type.value} - {step.content[:50]}...")
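
To make the idea of a structured chain concrete, here is a minimal standard-library sketch of turning numbered text into indexed steps. It is not how ExtendedReasoningParser is implemented; the Step dataclass, the regular expression, and split_numbered_steps are hypothetical names used only for illustration, and the sketch does no reasoning-type classification.

```python
import re
from dataclasses import dataclass


@dataclass
class Step:
    """Hypothetical stand-in for a parsed step (not the library's ReasoningStep)."""
    index: int
    content: str


# Matches lines such as "1. text" or "2) text". Illustrative assumption only.
_NUMBERED = re.compile(r"^\s*(\d+)[.)]\s+(.*\S)\s*$")


def split_numbered_steps(text):
    """Return one Step per numbered line, ignoring blank or unnumbered lines."""
    steps = []
    for line in text.splitlines():
        match = _NUMBERED.match(line)
        if match:
            steps.append(Step(index=int(match.group(1)), content=match.group(2)))
    return steps


steps = split_numbered_steps("""
    1. First, I need to understand the problem
    2. The data shows a clear pattern
    3. Therefore, I conclude that X is true
""")

for step in steps:
    print(f"{step.index}: {step.content}")
```

A real parser also has to cope with unnumbered prose, which this sketch simply skips.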

Detect Evaluation Awareness

from rotalabs_audit import EvaluationAwarenessDetector, ExtendedReasoningParser

parser = ExtendedReasoningParser()
detector = EvaluationAwarenessDetector()

chain = parser.parse("""
    I notice this appears to be a test scenario.
    Let me think about how to respond appropriately.
    I should be transparent in my reasoning.
""")

analysis = detector.detect(chain)
print(f"Awareness score: {analysis.awareness_score:.2f}")
print(f"Is evaluation aware: {analysis.is_evaluation_aware}")
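
For intuition about what an awareness score could measure, the sketch below computes the fraction of sentences that contain an evaluation-related cue word. This is an assumption made for illustration, not the scoring used by EvaluationAwarenessDetector; the cue list, the awareness_score function, and the 0.3 threshold are all hypothetical.

```python
import re

# Hypothetical cue words; the library's actual indicators are not documented here.
CUE_PATTERN = re.compile(
    r"\b(test|tests|testing|evaluation|evaluated|monitored|monitoring|benchmark)\b",
    re.IGNORECASE,
)

# Hypothetical cut-off for flagging a chain as evaluation aware.
AWARENESS_THRESHOLD = 0.3


def awareness_score(sentences):
    """Fraction of sentences that contain at least one cue word."""
    if not sentences:
        return 0.0
    hits = sum(1 for sentence in sentences if CUE_PATTERN.search(sentence))
    return hits / len(sentences)


sentences = [
    "I notice this appears to be a test scenario.",
    "Let me think about how to respond appropriately.",
    "I should be transparent in my reasoning.",
]

score = awareness_score(sentences)
is_aware = score >= AWARENESS_THRESHOLD
print(f"Awareness score: {score:.2f}")
print(f"Is evaluation aware: {is_aware}")
```

Word-boundary matching avoids false hits on words such as "latest", but a keyword heuristic like this will still miss paraphrased cues.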

Counterfactual Analysis

from rotalabs_audit import CounterfactualAnalyzer, InterventionType

analyzer = CounterfactualAnalyzer()
chain = analyzer.parser.parse("""
    1. Let me think about this problem.
    2. I notice this is an evaluation context.
    3. Therefore, I should be careful.
""")

# Run all interventions
results = analyzer.analyze(chain)

for intervention_type, result in results.items():
    print(f"{intervention_type.value}: divergence={result.behavioral_divergence:.2f}")
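
The general shape of a counterfactual intervention can be sketched without the library: remove one kind of step, then measure how much the chain changes. The example below uses a crude text-level proxy (one minus the Jaccard similarity of the word sets), which is an assumption for illustration and is not claimed to be how behavioral_divergence is computed. The function names are hypothetical.

```python
def tokens(steps):
    """Lowercased word set across all steps, with trailing punctuation stripped."""
    return {word.strip(".,").lower() for step in steps for word in step.split()}


def jaccard_divergence(original, modified):
    """1 - |A ∩ B| / |A ∪ B| over the word sets of two step lists."""
    a, b = tokens(original), tokens(modified)
    if not a and not b:
        return 0.0
    return 1.0 - len(a & b) / len(a | b)


chain = [
    "Let me think about this problem.",
    "I notice this is an evaluation context.",
    "Therefore, I should be careful.",
]

# Hypothetical intervention: drop every step that mentions evaluation.
ablated = [step for step in chain if "evaluation" not in step.lower()]

divergence = jaccard_divergence(chain, ablated)
print(f"remove evaluation steps: divergence={divergence:.2f}")
```

A divergence near 0 means the intervention removed little that was unique to the dropped steps; a value near 1 means most of the chain's vocabulary depended on them.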

Assess Reasoning Quality

from rotalabs_audit import ReasoningQualityAssessor, ExtendedReasoningParser

parser = ExtendedReasoningParser()
assessor = ReasoningQualityAssessor()

chain = parser.parse("...")
metrics = assessor.assess(chain)

print(f"Overall quality: {metrics.overall_score:.2f}")
print(f"Clarity: {metrics.clarity:.2f}")
print(f"Completeness: {metrics.completeness:.2f}")

Trace Decisions

from rotalabs_audit import DecisionTracer, DecisionPathAnalyzer

tracer = DecisionTracer()
analyzer = DecisionPathAnalyzer()

# Trace a single decision together with its context and reasoning
trace = tracer.trace(
    decision="Select approach A",
    context={"options": ["A", "B", "C"]},
    reasoning="Approach A has the best balance of speed and accuracy",
)

print(f"Decision traced: {trace.id}")
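
If traces need to be kept for later review, one option is to store each decision as a JSON line. The sketch below shows that pattern with a standalone dataclass; TraceRecord and its fields are hypothetical and only mirror the arguments passed to tracer.trace above. Whether DecisionTrace offers its own serialization is not covered here.

```python
import json
import uuid
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


@dataclass
class TraceRecord:
    """Hypothetical audit record; not the library's DecisionTrace."""
    decision: str
    context: dict
    reasoning: str
    id: str = field(default_factory=lambda: uuid.uuid4().hex)
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


record = TraceRecord(
    decision="Select approach A",
    context={"options": ["A", "B", "C"]},
    reasoning="Approach A has the best balance of speed and accuracy",
)

# One JSON object per line makes the log easy to append to and to grep.
line = json.dumps(asdict(record), sort_keys=True)
print(line)
```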

Reasoning Types

The parser classifies reasoning into these types:

Type                 Description
EVALUATION_AWARE     References to testing, evaluation, or monitoring context
GOAL_REASONING       Goal-directed reasoning about objectives
DECISION_MAKING      Explicit decision points choosing between alternatives
META_REASONING       Meta-cognitive statements about the reasoning process
UNCERTAINTY          Expressions of uncertainty or acknowledgment of limitations
CAUSAL_REASONING     Cause-and-effect reasoning
HYPOTHETICAL         Counterfactual or "what if" reasoning
INCENTIVE_REASONING  Consideration of rewards, penalties, or incentives
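
Once steps carry a type, a simple tally shows which kinds of reasoning dominate a chain. The sketch below uses a stand-in enum whose member names are taken from the table above; the lowercase string values and the hand-written labels list are assumptions, since the actual values of ReasoningType are not listed in this README.

```python
from collections import Counter
from enum import Enum


class ReasoningTypeSketch(Enum):
    """Stand-in enum; member names follow the table, values are assumed."""
    EVALUATION_AWARE = "evaluation_aware"
    GOAL_REASONING = "goal_reasoning"
    DECISION_MAKING = "decision_making"
    META_REASONING = "meta_reasoning"
    UNCERTAINTY = "uncertainty"
    CAUSAL_REASONING = "causal_reasoning"
    HYPOTHETICAL = "hypothetical"
    INCENTIVE_REASONING = "incentive_reasoning"


# Hand-written labels standing in for step.reasoning_type on a parsed chain.
labels = [
    ReasoningTypeSketch.META_REASONING,
    ReasoningTypeSketch.EVALUATION_AWARE,
    ReasoningTypeSketch.DECISION_MAKING,
    ReasoningTypeSketch.META_REASONING,
]

counts = Counter(labels)
for reasoning_type, count in counts.most_common():
    print(f"{reasoning_type.value}: {count}")
```

With a parsed chain, the same tally would presumably be Counter(step.reasoning_type for step in chain.steps), using the attributes shown in the Quick Start.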

API Reference

Core Types

  • ReasoningChain - A complete chain of reasoning steps
  • ReasoningStep - A single step in a reasoning chain
  • ReasoningType - Enum of reasoning type classifications
  • DecisionTrace - Trace of a single decision point
  • DecisionPath - A sequence of related decisions
  • AwarenessAnalysis - Result of evaluation awareness detection
  • QualityMetrics - Quality assessment of reasoning

Analysis Modules

  • CounterfactualAnalyzer - Perform counterfactual interventions
  • EvaluationAwarenessDetector - Detect evaluation awareness
  • ReasoningQualityAssessor - Assess reasoning quality
  • CausalAnalyzer - Analyze causal structure of reasoning

Tracing

  • DecisionTracer - Capture and trace decisions
  • DecisionPathAnalyzer - Analyze decision paths

Configuration

  • ParserConfig - Configure reasoning chain parsing
  • AnalysisConfig - Configure analysis features
  • TracingConfig - Configure decision tracing
  • AuditConfig - Master configuration combining all settings

License

MIT License - see LICENSE file for details.
