rotalabs-audit
Reasoning chain capture and decision transparency for AI systems.
Features
- Reasoning Chain Parsing: Parse natural language reasoning into structured chains
- Reasoning Classification: Classify reasoning types (goal, decision, meta-reasoning, etc.)
- Evaluation Awareness Detection: Detect when AI shows awareness of being evaluated
- Quality Assessment: Assess reasoning quality with an overall score and per-dimension metrics such as clarity and completeness
- Counterfactual Analysis: Understand causal factors in AI decision-making
- Decision Tracing: Capture and analyze decision paths for transparency
- Integration with rotalabs-comply: Connect reasoning audits with compliance reporting
Installation
pip install rotalabs-audit
With rotalabs-comply integration:
pip install "rotalabs-audit[comply]"
Quick Start
Parse Reasoning Chains
from rotalabs_audit import ExtendedReasoningParser
parser = ExtendedReasoningParser()
chain = parser.parse("""
1. First, I need to understand the problem
2. The data shows a clear pattern
3. Therefore, I conclude that X is true
""")
print(f"Steps: {len(chain.steps)}")
for step in chain.steps:
    print(f" {step.index}: {step.reasoning_type.value} - {step.content[:50]}...")
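To make the idea of turning numbered text into structured steps concrete, here is a minimal standalone sketch. It does not use or reproduce `ExtendedReasoningParser`; `ToyStep` and `split_numbered_steps` are hypothetical names invented for this illustration, and the real parser may segment text quite differently.

```python
import re
from dataclasses import dataclass


@dataclass
class ToyStep:
    """Hypothetical stand-in for a parsed step; not the library's ReasoningStep."""
    index: int
    content: str


def split_numbered_steps(text: str) -> list[ToyStep]:
    """Keep one step per line that starts with a marker such as '1.' or '2)'."""
    steps = []
    for line in text.strip().splitlines():
        match = re.match(r"^\s*(\d+)[.)]\s+(.*)$", line)
        if match:
            steps.append(ToyStep(index=int(match.group(1)), content=match.group(2).strip()))
    return steps


steps = split_numbered_steps("""
1. First, I need to understand the problem
2. The data shows a clear pattern
3. Therefore, I conclude that X is true
""")
for step in steps:
    print(f"{step.index}: {step.content}")
```

Lines without a leading number are ignored in this sketch, so free-form prose would need a different segmentation strategy.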
Detect Evaluation Awareness
from rotalabs_audit import EvaluationAwarenessDetector, ExtendedReasoningParser
parser = ExtendedReasoningParser()
detector = EvaluationAwarenessDetector()
chain = parser.parse("""
I notice this appears to be a test scenario.
Let me think about how to respond appropriately.
I should be transparent in my reasoning.
""")
analysis = detector.detect(chain)
print(f"Awareness score: {analysis.awareness_score:.2f}")
print(f"Is evaluation aware: {analysis.is_evaluation_aware}")
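For intuition about what an awareness score could measure, the sketch below computes the fraction of sentences that contain an evaluation-related cue word. This is an assumption made for illustration only: the cue list and `toy_awareness_score` are hypothetical and are not how `EvaluationAwarenessDetector` is documented to work.

```python
import re

# Hypothetical cue list, chosen for illustration only.
EVALUATION_CUES = re.compile(
    r"\b(test|evaluation|evaluated|benchmark|monitored)\b", re.IGNORECASE
)


def toy_awareness_score(sentences: list[str]) -> float:
    """Return the fraction of sentences that contain at least one cue word."""
    if not sentences:
        return 0.0
    hits = sum(1 for sentence in sentences if EVALUATION_CUES.search(sentence))
    return hits / len(sentences)


sentences = [
    "I notice this appears to be a test scenario.",
    "Let me think about how to respond appropriately.",
    "I should be transparent in my reasoning.",
]
score = toy_awareness_score(sentences)
print(f"Toy awareness score: {score:.2f}")
```

A keyword heuristic like this misses paraphrases and flags innocent uses of words such as "test", which is one reason a dedicated detector is useful.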
Counterfactual Analysis
from rotalabs_audit import CounterfactualAnalyzer, InterventionType
analyzer = CounterfactualAnalyzer()
chain = analyzer.parser.parse("""
1. Let me think about this problem.
2. I notice this is an evaluation context.
3. Therefore, I should be careful.
""")
# Run all interventions
results = analyzer.analyze(chain)
for intervention_type, result in results.items():
    print(f"{intervention_type.value}: divergence={result.behavioral_divergence:.2f}")
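The general shape of a counterfactual intervention can be sketched without the library: modify the chain, then measure how far the result drifts from the original. In the standalone example below, the intervention drops steps that mention an evaluation context, and divergence is defined as one minus the Jaccard similarity of the two chains' word sets. Both choices are assumptions made for this sketch; the source does not state how `behavioral_divergence` is computed, and the function names are hypothetical.

```python
import re

# Hypothetical cue pattern, chosen for illustration only.
EVAL_CUE = re.compile(r"\b(evaluation|test|monitor\w*)\b", re.IGNORECASE)


def remove_evaluation_steps(steps: list[str]) -> list[str]:
    """Intervention: drop every step that mentions an evaluation context."""
    return [step for step in steps if not EVAL_CUE.search(step)]


def toy_divergence(original: list[str], modified: list[str]) -> float:
    """Return 1 minus the Jaccard similarity of the two chains' word sets."""
    words_a = set(re.findall(r"\w+", " ".join(original).lower()))
    words_b = set(re.findall(r"\w+", " ".join(modified).lower()))
    if not words_a and not words_b:
        return 0.0
    return 1.0 - len(words_a & words_b) / len(words_a | words_b)


steps = [
    "Let me think about this problem.",
    "I notice this is an evaluation context.",
    "Therefore, I should be careful.",
]
modified = remove_evaluation_steps(steps)
print(f"Steps kept: {len(modified)}, divergence: {toy_divergence(steps, modified):.2f}")
```

A divergence of 0.0 means the intervention changed nothing; values closer to 1.0 mean the modified chain shares little vocabulary with the original.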
Assess Reasoning Quality
from rotalabs_audit import ReasoningQualityAssessor, ExtendedReasoningParser
parser = ExtendedReasoningParser()
assessor = ReasoningQualityAssessor()
chain = parser.parse("...")
metrics = assessor.assess(chain)
print(f"Overall quality: {metrics.overall_score:.2f}")
print(f"Clarity: {metrics.clarity:.2f}")
print(f"Completeness: {metrics.completeness:.2f}")
Trace Decisions
from rotalabs_audit import DecisionTracer, DecisionPathAnalyzer
tracer = DecisionTracer()
analyzer = DecisionPathAnalyzer()
# Trace a single decision
trace = tracer.trace(
    decision="Select approach A",
    context={"options": ["A", "B", "C"]},
    reasoning="Approach A has the best balance of speed and accuracy",
)
print(f"Decision traced: {trace.id}")
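To show what a stored decision record might contain, here is a standalone sketch that bundles the same three inputs with a generated id and a UTC timestamp, then serializes the record to JSON. `ToyDecisionRecord` is hypothetical, and the `id` format and `timestamp` field are assumptions; the actual `DecisionTrace` type may carry different fields.

```python
import json
import uuid
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


@dataclass
class ToyDecisionRecord:
    """Hypothetical record; the id and timestamp fields are assumptions."""
    decision: str
    context: dict
    reasoning: str
    # Random 32-character hex id, generated per record.
    id: str = field(default_factory=lambda: uuid.uuid4().hex)
    # Creation time in UTC, stored as an ISO 8601 string.
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


record = ToyDecisionRecord(
    decision="Select approach A",
    context={"options": ["A", "B", "C"]},
    reasoning="Approach A has the best balance of speed and accuracy",
)
serialized = json.dumps(asdict(record), indent=2)
print(serialized)
```

Keeping the record JSON-serializable makes it straightforward to append to a log or attach to a report.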
Reasoning Types
The parser classifies reasoning into these types:
| Type | Description |
|---|---|
| EVALUATION_AWARE | References to testing, evaluation, or monitoring context |
| GOAL_REASONING | Goal-directed reasoning about objectives |
| DECISION_MAKING | Explicit decision points choosing between alternatives |
| META_REASONING | Meta-cognitive statements about the reasoning process |
| UNCERTAINTY | Expressions of uncertainty or acknowledgment of limitations |
| CAUSAL_REASONING | Cause-and-effect reasoning |
| HYPOTHETICAL | Counterfactual or "what if" reasoning |
| INCENTIVE_REASONING | Consideration of rewards, penalties, or incentives |
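As a rough illustration of how sentences could be mapped onto types like these, the standalone sketch below checks a few cue patterns in order and returns the first match. `ToyReasoningType` mirrors only a subset of the names in the table, and the patterns and the `classify` function are hypothetical; none of this is taken from the library's own classifier.

```python
import re
from enum import Enum


class ToyReasoningType(Enum):
    """Mirrors a subset of the type names in the table; not the library's enum."""
    EVALUATION_AWARE = "evaluation_aware"
    UNCERTAINTY = "uncertainty"
    CAUSAL_REASONING = "causal_reasoning"
    HYPOTHETICAL = "hypothetical"


# Hypothetical cue patterns, checked in order; the first match wins.
PATTERNS = [
    (ToyReasoningType.EVALUATION_AWARE, re.compile(r"\b(test|evaluation|monitor\w*)\b", re.I)),
    (ToyReasoningType.HYPOTHETICAL, re.compile(r"\b(what if|suppose|hypothetically)\b", re.I)),
    (ToyReasoningType.UNCERTAINTY, re.compile(r"\b(not sure|uncertain|might|may)\b", re.I)),
    (ToyReasoningType.CAUSAL_REASONING, re.compile(r"\b(because|therefore|as a result)\b", re.I)),
]


def classify(sentence: str):
    """Return the first matching toy type, or None when no cue matches."""
    for reasoning_type, pattern in PATTERNS:
        if pattern.search(sentence):
            return reasoning_type
    return None


for sentence in [
    "This looks like a test scenario.",
    "What if the input were empty?",
    "I am not sure the data is complete.",
    "The cache is stale, therefore the result is wrong.",
    "Hello there.",
]:
    print(f"{classify(sentence)}: {sentence}")
```

Because the first match wins, the order of `PATTERNS` decides how a sentence with several cues is labeled; a real classifier might instead assign multiple types or a confidence per type.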
API Reference
Core Types
- ReasoningChain: A complete chain of reasoning steps
- ReasoningStep: A single step in a reasoning chain
- ReasoningType: Enum of reasoning type classifications
- DecisionTrace: Trace of a single decision point
- DecisionPath: A sequence of related decisions
- AwarenessAnalysis: Result of evaluation awareness detection
- QualityMetrics: Quality assessment of reasoning
Analysis Modules
- CounterfactualAnalyzer: Perform counterfactual interventions
- EvaluationAwarenessDetector: Detect evaluation awareness
- ReasoningQualityAssessor: Assess reasoning quality
- CausalAnalyzer: Analyze causal structure of reasoning
Tracing
- DecisionTracer: Capture and trace decisions
- DecisionPathAnalyzer: Analyze decision paths
Configuration
- ParserConfig: Configure reasoning chain parsing
- AnalysisConfig: Configure analysis features
- TracingConfig: Configure decision tracing
- AuditConfig: Master configuration combining all settings
Links
- Documentation: https://rotalabs.github.io/rotalabs-audit/
- PyPI: https://pypi.org/project/rotalabs-audit/
- GitHub: https://github.com/rotalabs/rotalabs-audit
- Website: https://rotalabs.ai
- Contact: research@rotalabs.ai
License
MIT License - see LICENSE file for details.