rotalabs-audit

Reasoning chain capture and decision transparency for AI systems.

Features

  • Reasoning Chain Parsing: Parse natural language reasoning into structured chains
  • Reasoning Classification: Classify reasoning types (goal, decision, meta-reasoning, etc.)
  • Evaluation Awareness Detection: Detect when an AI system shows awareness of being evaluated
  • Quality Assessment: Assess reasoning quality with metrics such as clarity and completeness
  • Counterfactual Analysis: Understand causal factors in AI decision-making
  • Decision Tracing: Capture and analyze decision paths for transparency
  • Integration with rotalabs-comply: Connect reasoning audits with compliance reporting

Installation

pip install rotalabs-audit

With rotalabs-comply integration:

pip install "rotalabs-audit[comply]"

Quick Start

Parse Reasoning Chains

from rotalabs_audit import ExtendedReasoningParser

parser = ExtendedReasoningParser()
chain = parser.parse("""
    1. First, I need to understand the problem
    2. The data shows a clear pattern
    3. Therefore, I conclude that X is true
""")

print(f"Steps: {len(chain.steps)}")
for step in chain.steps:
    print(f"  {step.index}: {step.reasoning_type.value} - {step.content[:50]}...")
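To make the idea of a structured chain concrete, here is a minimal, self-contained sketch of how numbered reasoning text could be split into ordered steps. It does not use rotalabs_audit and is not how ExtendedReasoningParser is implemented; SketchStep and split_numbered_steps are hypothetical names introduced only for illustration.

```python
import re
from dataclasses import dataclass


# Hypothetical stand-in for illustration; not the rotalabs_audit ReasoningStep.
@dataclass
class SketchStep:
    index: int
    content: str


def split_numbered_steps(text: str) -> list[SketchStep]:
    """Split lines of the form '1. ...' or '1) ...' into ordered steps."""
    steps = []
    for line in text.splitlines():
        # Leading whitespace, a number, '.' or ')', then the step text.
        match = re.match(r"\s*(\d+)[.)]\s+(.*\S)", line)
        if match:
            steps.append(SketchStep(index=len(steps), content=match.group(2)))
    return steps


steps = split_numbered_steps("""
    1. First, I need to understand the problem
    2. The data shows a clear pattern
    3. Therefore, I conclude that X is true
""")
for step in steps:
    print(f"  {step.index}: {step.content}")
```

A real parser would also need to handle unnumbered prose and multi-line steps, which this sketch ignores.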

Detect Evaluation Awareness

from rotalabs_audit import EvaluationAwarenessDetector, ExtendedReasoningParser

parser = ExtendedReasoningParser()
detector = EvaluationAwarenessDetector()

chain = parser.parse("""
    I notice this appears to be a test scenario.
    Let me think about how to respond appropriately.
    I should be transparent in my reasoning.
""")

analysis = detector.detect(chain)
print(f"Awareness score: {analysis.awareness_score:.2f}")
print(f"Is evaluation aware: {analysis.is_evaluation_aware}")
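As a rough intuition for what an awareness score might capture, the sketch below scores a chain by the fraction of sentences that contain an evaluation-related cue word. This is an assumption made for illustration: the cue list and the sketch_awareness_score function are hypothetical, and the detector's actual scoring method is not described here.

```python
import re

# Assumed cue words, chosen for illustration; the library's indicators may differ.
AWARENESS_CUES = {"test", "evaluation", "evaluated", "benchmark", "monitored"}


def sketch_awareness_score(sentences):
    """Return the fraction of sentences containing at least one cue word."""
    if not sentences:
        return 0.0
    hits = 0
    for sentence in sentences:
        # Compare whole lowercase words so 'latest' does not match 'test'.
        words = set(re.findall(r"[a-z]+", sentence.lower()))
        if words & AWARENESS_CUES:
            hits += 1
    return hits / len(sentences)


sentences = [
    "I notice this appears to be a test scenario.",
    "Let me think about how to respond appropriately.",
    "I should be transparent in my reasoning.",
]
score = sketch_awareness_score(sentences)
print(f"Sketch awareness score: {score:.2f}")
```

Only the first sentence contains a cue word, so one of three sentences counts toward the score.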

Counterfactual Analysis

from rotalabs_audit import CounterfactualAnalyzer, InterventionType

analyzer = CounterfactualAnalyzer()
chain = analyzer.parser.parse("""
    1. Let me think about this problem.
    2. I notice this is an evaluation context.
    3. Therefore, I should be careful.
""")

# Run all interventions
results = analyzer.analyze(chain)

for intervention_type, result in results.items():
    print(f"{intervention_type.value}: divergence={result.behavioral_divergence:.2f}")
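The idea of a counterfactual intervention can be illustrated without the library: remove the steps that mention a cue, then measure how much the chain changed. The sketch below uses Jaccard distance over word sets as a stand-in divergence measure. That choice, and the helper names, are assumptions for illustration; this measures textual change only and may differ from how behavioral_divergence is computed.

```python
import re


def word_set(steps):
    """Collect the lowercase words used across a list of steps."""
    return set(re.findall(r"[a-z]+", " ".join(steps).lower()))


def remove_steps_mentioning(steps, cue):
    """Intervention: drop every step whose text contains the cue."""
    return [step for step in steps if cue not in step.lower()]


def jaccard_divergence(original, modified):
    """Return 1 - |A & B| / |A | B| over the word sets of two chains."""
    a, b = word_set(original), word_set(modified)
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


original = [
    "Let me think about this problem.",
    "I notice this is an evaluation context.",
    "Therefore, I should be careful.",
]
modified = remove_steps_mentioning(original, "evaluation")
divergence = jaccard_divergence(original, modified)
print(f"steps kept: {len(modified)}, divergence={divergence:.2f}")
```

A divergence of 0 means the intervention changed nothing; values closer to 1 mean the removed steps carried most of the chain's vocabulary.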

Assess Reasoning Quality

from rotalabs_audit import ReasoningQualityAssessor, ExtendedReasoningParser

parser = ExtendedReasoningParser()
assessor = ReasoningQualityAssessor()

chain = parser.parse("...")
metrics = assessor.assess(chain)

print(f"Overall quality: {metrics.overall_score:.2f}")
print(f"Clarity: {metrics.clarity:.2f}")
print(f"Completeness: {metrics.completeness:.2f}")

Trace Decisions

from rotalabs_audit import DecisionTracer, DecisionPathAnalyzer

tracer = DecisionTracer()
analyzer = DecisionPathAnalyzer()

# Trace a single decision point
trace = tracer.trace(
    decision="Select approach A",
    context={"options": ["A", "B", "C"]},
    reasoning="Approach A has the best balance of speed and accuracy",
)

print(f"Decision traced: {trace.id}")
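For readers who want a mental model of what a traced decision could hold, here is a self-contained sketch of a record with a generated identifier and a UTC timestamp. SketchDecisionRecord and its fields are hypothetical; they mirror the arguments passed to tracer.trace() above but are not the library's DecisionTrace type.

```python
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone


# Hypothetical record type for illustration; not rotalabs_audit.DecisionTrace.
@dataclass
class SketchDecisionRecord:
    decision: str
    context: dict
    reasoning: str
    # Assumed: a random hex identifier and a timezone-aware creation time.
    id: str = field(default_factory=lambda: uuid.uuid4().hex)
    timestamp: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


record = SketchDecisionRecord(
    decision="Select approach A",
    context={"options": ["A", "B", "C"]},
    reasoning="Approach A has the best balance of speed and accuracy",
)
print(f"Decision recorded: {record.id} at {record.timestamp.isoformat()}")
```

Storing the options alongside the chosen decision keeps the rejected alternatives available for later review.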

Reasoning Types

The parser classifies reasoning into these types:

Type                 Description
EVALUATION_AWARE     References to testing, evaluation, or monitoring context
GOAL_REASONING       Goal-directed reasoning about objectives
DECISION_MAKING      Explicit decision points choosing between alternatives
META_REASONING       Meta-cognitive statements about the reasoning process
UNCERTAINTY          Expressions of uncertainty or acknowledgment of limitations
CAUSAL_REASONING     Cause-and-effect reasoning
HYPOTHETICAL         Counterfactual or "what if" reasoning
INCENTIVE_REASONING  Consideration of rewards, penalties, or incentives
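The type names above can be pictured as an enum plus a rule that maps text to a member. The sketch below uses a naive substring lookup to show the shape of such a classifier. The enum values, the cue phrases, and the first-match rule are all assumptions for illustration; the library's ReasoningType values and classification logic may differ.

```python
from enum import Enum


# Member names follow the table above; the string values are assumed.
class SketchReasoningType(Enum):
    EVALUATION_AWARE = "evaluation_aware"
    GOAL_REASONING = "goal_reasoning"
    DECISION_MAKING = "decision_making"
    META_REASONING = "meta_reasoning"
    UNCERTAINTY = "uncertainty"
    CAUSAL_REASONING = "causal_reasoning"
    HYPOTHETICAL = "hypothetical"
    INCENTIVE_REASONING = "incentive_reasoning"


# Assumed cue phrases for a subset of types; checked in insertion order.
CUES = {
    SketchReasoningType.EVALUATION_AWARE: ("test", "evaluat", "monitor"),
    SketchReasoningType.UNCERTAINTY: ("not sure", "uncertain", "might"),
    SketchReasoningType.CAUSAL_REASONING: ("because", "therefore", "as a result"),
    SketchReasoningType.GOAL_REASONING: ("my goal", "in order to", "objective"),
}


def classify(text):
    """Return the first type whose cue appears in the text, else None."""
    lowered = text.lower()
    for reasoning_type, cues in CUES.items():
        if any(cue in lowered for cue in cues):
            return reasoning_type
    return None


for sample in (
    "I notice this appears to be a test scenario.",
    "Therefore, I conclude that X is true",
    "The data shows a clear pattern",
):
    result = classify(sample)
    print(f"{result.value if result else 'unclassified'}: {sample}")
```

Substring matching is deliberately crude (for example, "test" also matches "latest"), so treat this only as a picture of the input and output, not as a usable classifier.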

API Reference

Core Types

  • ReasoningChain - A complete chain of reasoning steps
  • ReasoningStep - A single step in a reasoning chain
  • ReasoningType - Enum of reasoning type classifications
  • DecisionTrace - Trace of a single decision point
  • DecisionPath - A sequence of related decisions
  • AwarenessAnalysis - Result of evaluation awareness detection
  • QualityMetrics - Quality assessment of reasoning

Analysis Modules

  • CounterfactualAnalyzer - Perform counterfactual interventions
  • EvaluationAwarenessDetector - Detect evaluation awareness
  • ReasoningQualityAssessor - Assess reasoning quality
  • CausalAnalyzer - Analyze causal structure of reasoning

Tracing

  • DecisionTracer - Capture and trace decisions
  • DecisionPathAnalyzer - Analyze decision paths

Configuration

  • ParserConfig - Configure reasoning chain parsing
  • AnalysisConfig - Configure analysis features
  • TracingConfig - Configure decision tracing
  • AuditConfig - Master configuration combining all settings

License

MIT License - see LICENSE file for details.
