SAGE - Safety Analysis and Guidance Engine for Nuclear Criticality Safety

SAGE

Safety Analysis & Guidance Engine

A reasoning language model platform for safety-critical technical domains.

Status: Phase 1 (Foundation) complete | Phase 2 (Core Development) planned

Version: 0.1.0 | Python: 3.11-3.12 | License: MIT

Initial Focus: SAGE-NCS

Nuclear Criticality Safety (NCS) is the initial domain, providing:

  • Knowledge queries (standards, regulations, historical accidents)
  • Document drafting assistance (CSEs, technical basis)
  • Calculation support (SCALE/MCNP setup, interpretation)
  • Double contingency analysis
  • Training for new NCS engineers

What's Built

Safety Core (Production-Ready)

The safety-critical infrastructure is fully implemented with 100% test coverage:

  • Abstention system - 6 triggers for refusing to answer (out-of-domain, low confidence, conflicting sources, calculation limits, dual-use concerns, ambiguous safety)
  • Escalation workflow - 5 triggers routing to human experts (novel configurations, safety basis changes, regulatory implications, low confidence + high stakes, conflicting reasoning)
  • Output classification - GREEN/YELLOW/RED safety levels with integrated validation
  • Reasoning verification - Chain-of-thought validation, citation injection, uncertainty quantification
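As a rough illustration of how the abstention and escalation triggers listed above could be represented, here is a minimal sketch. The trigger names come from the list; the class names, the `decide` function, and the rule that escalation takes precedence over abstention are assumptions made for this example, not SAGE's actual API.

```python
from dataclasses import dataclass
from enum import Enum, auto


class AbstentionTrigger(Enum):
    """The six abstention triggers named in the list above."""
    OUT_OF_DOMAIN = auto()
    LOW_CONFIDENCE = auto()
    CONFLICTING_SOURCES = auto()
    CALCULATION_LIMITS = auto()
    DUAL_USE_CONCERNS = auto()
    AMBIGUOUS_SAFETY = auto()


class EscalationTrigger(Enum):
    """The five escalation triggers named in the list above."""
    NOVEL_CONFIGURATION = auto()
    SAFETY_BASIS_CHANGE = auto()
    REGULATORY_IMPLICATIONS = auto()
    LOW_CONFIDENCE_HIGH_STAKES = auto()
    CONFLICTING_REASONING = auto()


class Action(Enum):
    ANSWER = "answer"
    ABSTAIN = "abstain"
    ESCALATE = "escalate"


@dataclass(frozen=True)
class SafetyCheck:
    """Hypothetical container for the triggers that fired on one query."""
    abstentions: frozenset = frozenset()
    escalations: frozenset = frozenset()


def decide(check: SafetyCheck) -> Action:
    # Assumption: a fired escalation trigger outranks abstention, so that a
    # human expert sees the query instead of it being silently refused.
    if check.escalations:
        return Action.ESCALATE
    if check.abstentions:
        return Action.ABSTAIN
    return Action.ANSWER


if __name__ == "__main__":
    fired = SafetyCheck(abstentions=frozenset({AbstentionTrigger.OUT_OF_DOMAIN}))
    print(decide(fired).value)
```

Keeping the triggers as enum members rather than free-form strings makes it straightforward to require a test for every member, which fits the 100% coverage requirement on safety-critical modules.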

Data Pipeline

  • Ingestion: PDF (with table preservation), OCR (Tesseract), Word, HTML extraction
  • Sources: NRC ADAMS and OSTI.gov API clients, ANSI/ANS-8 standards catalog
  • Processing: Intelligent chunking preserving tables/equations/section hierarchy, metadata extraction, quality validation
  • Storage: Qdrant vector store with hybrid search (dense + sparse via Reciprocal Rank Fusion), PostgreSQL for metadata
  • Embeddings: OpenAI text-embedding-3-small (primary), BGE-large-en-v1.5 (local alternative)
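The hybrid search item above names Reciprocal Rank Fusion (RRF) as the way dense and sparse results are combined. The sketch below shows the general technique in plain Python, independent of Qdrant; the function name and the document IDs are illustrative, and `k=60` is a commonly used constant, not a value confirmed by this project.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into a single ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. Ties are broken by document ID for determinism.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


if __name__ == "__main__":
    dense = ["a", "b", "c"]    # hypothetical dense-vector results
    sparse = ["b", "d", "a"]   # hypothetical sparse (keyword) results
    print(reciprocal_rank_fusion([dense, sparse]))  # ['b', 'a', 'd', 'c']
```

Because RRF uses only rank positions, it needs no score normalization between the dense and sparse retrievers.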

RAG Pipeline

  • Full retrieval-augmented generation with 30+ configurable options
  • Context building, response generation, citation verification, grounding score calculation
  • NCS-specific prompt templates with conservative bias enforcement
  • Multi-provider LLM support (Claude, GPT-4)
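The source does not say how the grounding score is calculated. One simple way to approximate such a score is lexical overlap: count the fraction of response sentences whose words mostly appear in some retrieved chunk. The sketch below takes that approach purely as an assumption; the function names and the `min_overlap` threshold are hypothetical.

```python
import re


def _tokens(text):
    """Lowercased alphanumeric tokens of a string, as a set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def grounding_score(response_sentences, context_chunks, min_overlap=0.5):
    """Fraction of response sentences supported by at least one chunk.

    A sentence counts as supported when at least `min_overlap` of its tokens
    occur in a single retrieved chunk. This is a lexical stand-in for a real
    grounding check, not SAGE's actual method.
    """
    if not response_sentences:
        return 0.0
    chunk_tokens = [_tokens(chunk) for chunk in context_chunks]
    grounded = 0
    for sentence in response_sentences:
        sentence_tokens = _tokens(sentence)
        if not sentence_tokens:
            continue  # empty sentences are treated as unsupported
        best = max(
            (len(sentence_tokens & ct) / len(sentence_tokens) for ct in chunk_tokens),
            default=0.0,
        )
        if best >= min_overlap:
            grounded += 1
    return grounded / len(response_sentences)


if __name__ == "__main__":
    chunks = ["The evaluation covers storage of solution in the annex."]
    answer = ["The evaluation covers storage of solution.", "Bananas are yellow."]
    print(grounding_score(answer, chunks))  # 0.5
```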

NCS Tools

  • K-eff estimator - Hand-method screening calculations, ANSI/ANS-8.1 single-parameter limits, surface density method
  • SCALE interface - Input file generation and output parsing (requires external SCALE installation)
  • Standards lookup - ANSI/ANS-8 series queries with version tracking
  • Geometry visualizer - 3D visualization of fissile configurations
  • Unit converter - Mass, volume, concentration, enrichment conversions
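To illustrate the kind of conversion the unit converter handles, here is a minimal table-driven sketch for mass, volume, and concentration. The function names and unit keys are assumptions for this example; the conversion factors are the standard definitions (1 lb = 0.45359237 kg, 1 US gallon = 3.785411784 L).

```python
# Factors to a base unit: kilograms for mass, litres for volume.
MASS_TO_KG = {"g": 1e-3, "kg": 1.0, "lb": 0.45359237}
VOLUME_TO_L = {"mL": 1e-3, "L": 1.0, "m3": 1000.0, "gal_us": 3.785411784}


def convert(value, from_unit, to_unit, table):
    """Convert `value` between two units that share a factor table."""
    try:
        return value * table[from_unit] / table[to_unit]
    except KeyError as exc:
        # Fail loudly on unknown units instead of guessing a factor.
        raise ValueError(f"unknown unit: {exc.args[0]}") from None


def concentration_g_per_l(mass, mass_unit, volume, volume_unit):
    """Return concentration in g/L for a mass dissolved in a volume."""
    grams = convert(mass, mass_unit, "g", MASS_TO_KG)
    litres = convert(volume, volume_unit, "L", VOLUME_TO_L)
    if litres <= 0:
        raise ValueError("volume must be positive")
    return grams / litres


if __name__ == "__main__":
    print(convert(2.0, "lb", "kg", MASS_TO_KG))
    print(concentration_g_per_l(0.5, "kg", 10.0, "L"))
```

Raising on an unknown unit rather than returning a default follows the conservative-by-design principle: a silent unit mix-up is worse than a refused conversion.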

Evaluation & Benchmarks

  • Benchmark runner with ARH-600, ICSBEP (5000+ experiments), CSE review, and red team adversarial suites
  • Evaluation metrics: accuracy, precision, recall, F1, citation quality, conservative bias
  • Current baseline: 100% accuracy on initial test suite (19/19 questions)
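For reference, the standard definitions behind the accuracy, precision, recall, and F1 metrics listed above can be sketched as follows. The function names and the boolean-label representation are assumptions for this example; citation quality and conservative bias are project-specific and not shown.

```python
def accuracy(predicted, expected):
    """Fraction of items where the prediction matches the expected label."""
    if len(predicted) != len(expected):
        raise ValueError("predicted and expected must have the same length")
    if not expected:
        return 0.0
    return sum(p == e for p, e in zip(predicted, expected)) / len(expected)


def precision_recall_f1(predicted, expected):
    """Precision, recall, and F1 for boolean labels (True = positive)."""
    if len(predicted) != len(expected):
        raise ValueError("predicted and expected must have the same length")
    pairs = list(zip(predicted, expected))
    tp = sum(1 for p, e in pairs if p and e)
    fp = sum(1 for p, e in pairs if p and not e)
    fn = sum(1 for p, e in pairs if not p and e)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


if __name__ == "__main__":
    predicted = [True, True, False, False]
    expected = [True, False, True, False]
    print(precision_recall_f1(predicted, expected))  # (0.5, 0.5, 0.5)
    print(accuracy([True] * 19, [True] * 19))        # 1.0
```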

Training Data Collection

  • Async interaction logger for all SAGE sessions (JSONL/CSV export)
  • Expert feedback collection (correctness, citations, clarity)

Infrastructure

  • CI/CD: GitHub Actions pipelines for lint, type-check, test, security scanning, and release
  • Docker: Multi-stage production image (Alpine), dev image, docker-compose for local dev (PostgreSQL + Qdrant + MLflow)
  • Monitoring: Structured JSON logging, audit trail, production metrics tracking
  • Experiment tracking: MLflow integration

Architecture

Query → Router → [KNOWLEDGE | CALCULATION | ANALYSIS | REASONING]
                        ↓
               RAG Pipeline + Tools
                        ↓
         Reasoning Verification & Validation
                        ↓
         Classification (GREEN / YELLOW / RED)
              ↓              ↓           ↓
           Return      Caution       Escalate to
           answer      + caveats     human expert

Dual-database design: PostgreSQL (metadata, escalation queue, interaction logs) + Qdrant (vectors, chunked documents)

Multi-provider LLM: Anthropic Claude (primary), OpenAI GPT-4 (fallback), pluggable interface
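The final stage of the flow in the diagram above, where a GREEN, YELLOW, or RED classification selects the outcome, can be sketched as a small dispatch function. The names `SafetyLevel` and `dispatch`, the dictionary shape, and the choice to withhold the draft answer on RED are assumptions for illustration.

```python
from enum import Enum


class SafetyLevel(Enum):
    GREEN = "green"
    YELLOW = "yellow"
    RED = "red"


def dispatch(level, answer, caveats=()):
    """Map a safety classification to one of the three diagram outcomes."""
    if level is SafetyLevel.GREEN:
        return {"action": "return", "answer": answer}
    if level is SafetyLevel.YELLOW:
        return {"action": "caution", "answer": answer, "caveats": list(caveats)}
    if level is SafetyLevel.RED:
        # Assumption: on RED the draft answer is withheld from the user
        # and the query goes to the human-expert escalation queue.
        return {"action": "escalate", "answer": None}
    raise ValueError(f"unknown safety level: {level!r}")


if __name__ == "__main__":
    print(dispatch(SafetyLevel.YELLOW, "draft answer", ["verify against the CSE"]))
```

Raising on an unrecognized level, instead of falling through to a default, keeps an unclassified response from ever being returned as if it were GREEN.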

Quick Start

# Start services (PostgreSQL, Qdrant, app)
docker-compose up -d

# Download public NCS documents
python scripts/download_public_docs.py

# Run tests
pytest tests/ -v --cov=src/sage --cov-fail-under=80

# Run benchmark validation
python scripts/run_benchmark_validation.py

# Generate decision report
python scripts/generate_decision_report.py

See examples/sage_config.yaml for full configuration options.

Testing

674 tests across unit, integration, system, acceptance, and benchmark categories.

Category                | Coverage Requirement
General                 | 80% minimum (enforced in CI)
Safety-critical modules | 100%

Tests run on Python 3.11 and 3.12 with parallel execution (pytest-xdist). See docs/testing/ for test guides and conventions.

Planned Work

Phase 2: Core Development

  • Continued pre-training on 10B+ token NCS corpus
  • Supervised fine-tuning (5000+ expert Q&A pairs)
  • RLHF with domain expert preference data
  • Constitutional AI safety training
  • Full SCALE/MCNP tool integration
  • Advanced recursive reasoning (self-verification, iterative refinement)

Phase 3-5: Validation, Pilot, Production

  • Expert blind evaluation and red team exercises
  • NRC/DOE regulatory compliance review
  • Pilot site deployment
  • Multi-domain expansion

Future Domains

Module     | Domain
SAGE-RP    | Radiation Protection
SAGE-PSA   | Probabilistic Safety Assessment
SAGE-Fire  | Fire Protection Engineering
SAGE-Trans | Transportation of Radioactive Materials
SAGE-Decom | Decommissioning & Waste Management

Documentation

Core Principles

  1. Conservative by design - Always err on the side of safety
  2. Human-in-the-loop - Augment engineers, don't replace judgment
  3. Traceable reasoning - Every conclusion backed by citations
  4. Tool-augmented - Calculations via verified tools, not LLM arithmetic
  5. Auditable - Full reasoning chains for regulatory compliance
