Spectral diagnostics for trust in LLMs

Spectral Trust Framework

A Graph Signal Processing (GSP) framework for measuring the trustworthiness of LLM internal representations.

spectral_trust constructs dynamic graphs from attention patterns and applies spectral analysis (eigenvalues, Dirichlet energy) to detect hallucinations, quantify uncertainty, and map the "smoothness" of reasoning flows.
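
The graph construction described above can be sketched in plain NumPy. This is an illustrative sketch, not the package's own code: the function name `attention_to_laplacian`, the head-averaging step, and the symmetrization rule are all assumptions chosen to give a standard combinatorial Laplacian.

```python
import numpy as np

def attention_to_laplacian(attn: np.ndarray) -> np.ndarray:
    """Build a combinatorial Laplacian L = D - W from attention weights.

    attn has shape (heads, tokens, tokens); each row is a softmax
    distribution over the tokens a query position attends to.
    """
    # Assumption: heads are combined by a plain average.
    w = attn.mean(axis=0)
    # Attention is directed; symmetrize so the Laplacian is symmetric
    # and its eigenvalues are real and non-negative.
    w = 0.5 * (w + w.T)
    # Self-loops cancel out in D - W, so drop them explicitly.
    np.fill_diagonal(w, 0.0)
    degree = w.sum(axis=1)
    return np.diag(degree) - w

# Toy causal attention: random logits, lower-triangular mask, softmax.
rng = np.random.default_rng(0)
heads, n = 4, 6
logits = rng.normal(size=(heads, n, n))
causal = np.tril(np.ones((n, n), dtype=bool))
logits = np.where(causal, logits, -np.inf)
attn = np.exp(logits)
attn /= attn.sum(axis=-1, keepdims=True)

laplacian = attention_to_laplacian(attn)
eigenvalues = np.linalg.eigvalsh(laplacian)  # ascending order
print(np.round(eigenvalues, 4))
```

The smallest eigenvalue is zero up to rounding, and the remaining ones describe how tightly the tokens are coupled. Other symmetrization choices (for example taking the element-wise maximum of the matrix and its transpose) would give a different but equally valid graph.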

What is it?

By treating the transformer's attention mechanism as a graph and the hidden states as signals on that graph, we can compute well-defined spectral metrics:

  • Dirichlet Energy: How much the hidden-state signal varies across tokens connected by attention (a proxy for conflict or uncertainty).
  • Smoothness Index: Normalized energy indicating how well the representation aligns with the attention structure.
  • Fiedler Value: Algebraic connectivity of the attention graph, i.e. the second-smallest eigenvalue of its Laplacian.
  • HFER (High-Frequency Energy Ratio): Share of the signal's energy concentrated in high-frequency spectral components.
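
As a rough illustration of the four metrics, the sketch below computes them with textbook graph signal processing definitions on a hand-built path graph. The function `spectral_metrics`, the `high_freq_fraction` parameter, and the specific normalization used for the smoothness index are assumptions for this example; the package may define them differently.

```python
import numpy as np

def spectral_metrics(lap, x, high_freq_fraction=0.5):
    """lap: (n, n) symmetric Laplacian; x: (n, d) signal, one row per token."""
    # Dirichlet energy: trace(X^T L X) = 1/2 * sum_ij w_ij * ||x_i - x_j||^2
    energy = float(np.trace(x.T @ lap @ x))
    # Assumption: smoothness index = energy divided by total signal power.
    total = float(np.sum(x * x))
    smoothness = energy / total if total > 0 else 0.0
    # Eigen-decomposition; eigh returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(lap)
    fiedler = float(eigvals[1])  # second-smallest eigenvalue
    # Graph Fourier transform: project the signal onto the eigenvectors.
    x_hat = eigvecs.T @ x
    power = np.sum(x_hat ** 2, axis=1)  # energy per graph frequency
    # Assumption: the top `high_freq_fraction` of eigenvalues are "high".
    cutoff = int(np.ceil(len(eigvals) * (1.0 - high_freq_fraction)))
    hfer = float(power[cutoff:].sum() / power.sum()) if power.sum() > 0 else 0.0
    return {
        "dirichlet_energy": energy,
        "smoothness_index": smoothness,
        "fiedler_value": fiedler,
        "hfer": hfer,
    }

# Path graph on four tokens: 0 - 1 - 2 - 3, unit edge weights.
lap = np.array([
    [ 1.0, -1.0,  0.0,  0.0],
    [-1.0,  2.0, -1.0,  0.0],
    [ 0.0, -1.0,  2.0, -1.0],
    [ 0.0,  0.0, -1.0,  1.0],
])
smooth = spectral_metrics(lap, np.ones((4, 1)))                      # constant signal
rough = spectral_metrics(lap, np.array([[1.0], [-1.0], [1.0], [-1.0]]))  # alternating
print(smooth)
print(rough)
```

The constant signal has zero Dirichlet energy and no high-frequency content, while the alternating signal puts most of its energy in the top half of the spectrum. Under this particular normalization a lower smoothness index means a smoother signal.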

Features

  • Plug-and-Play: Works out of the box with Llama-3, Mistral, Qwen, Gemma, and Phi.
  • Offline Ready: The --offline flag runs on cached models without internet access.
  • Spectral Metrics: Automatically computes Energy, Entropy, Fiedler Value, HFER, and Smoothness.
  • Robustness Tools: Includes hooks for head ablation and residual patching.

Structure

  • src/spectral_trust/: Core package source code.
  • notebooks/: Jupyter notebooks for demonstration.
  • examples/: Minimal example scripts.
  • dist/: Wheel and source distributions.

Installation

pip install spectral_trust
# OR install from source in editable mode (run from the repository root)
pip install -e .

Usage

CLI Power Tool

Analyze a sentence (uses CUDA if available):

gsp-cli analyze --text "The capital of France is Paris." --model llama-3.1-8b

Offline Mode (no internet required):

gsp-cli analyze --text "Refactoring is fun." --model llama-3.2-1b --offline

Python API

from spectral_trust import GSPDiagnosticsFramework, GSPConfig

# local_files_only=True restricts loading to model files already in the local cache.
config = GSPConfig(model_name="llama-3.2-1b", device="cuda", local_files_only=True)
with GSPDiagnosticsFramework(config) as framework:
    # Load the model by its Hugging Face model ID.
    framework.instrumenter.load_model("meta-llama/Llama-3.2-1B")
    results = framework.analyze_text("The capital of France is Paris.")

    # Report the smoothness index of the final layer.
    print(f"Smoothness: {results['layer_diagnostics'][-1].smoothness_index:.4f}")
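
To look at every layer rather than only the last one, a small helper along these lines may be useful. It relies only on what the snippet above shows: `results["layer_diagnostics"]` is a sequence whose entries expose `smoothness_index`. The helper name `smoothness_profile` is hypothetical, and the stand-in data below uses made-up placeholder values, not real model output.

```python
from types import SimpleNamespace

def smoothness_profile(results):
    """Return (layer_index, smoothness_index) pairs for every layer."""
    return [
        (index, float(diag.smoothness_index))
        for index, diag in enumerate(results["layer_diagnostics"])
    ]

# Stand-in for the dictionary returned by framework.analyze_text(...).
# The numbers are placeholders for illustration only.
fake_results = {
    "layer_diagnostics": [
        SimpleNamespace(smoothness_index=value) for value in (0.12, 0.30, 0.21)
    ]
}

profile = smoothness_profile(fake_results)
# Layer whose smoothness_index is largest.
peak_layer, peak_value = max(profile, key=lambda pair: pair[1])

for layer, value in profile:
    print(f"layer {layer}: smoothness_index={value:.4f}")
print(f"largest value at layer {peak_layer}")
```

With a real run, pass the `results` object from `analyze_text` in place of `fake_results`.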

License

MIT

Download files

Source Distribution

spectral_trust-0.1.0.tar.gz (16.9 kB)

Uploaded Source

Built Distribution

spectral_trust-0.1.0-py3-none-any.whl (18.7 kB)

Uploaded Python 3

File details

Details for the file spectral_trust-0.1.0.tar.gz.

File metadata

  • Download URL: spectral_trust-0.1.0.tar.gz
  • Upload date:
  • Size: 16.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.11.14

File hashes

Hashes for spectral_trust-0.1.0.tar.gz
Algorithm Hash digest
SHA256 16d54ae405fcdde2091ac4880571e14699dd856af453a9f5d9ca5d61605ec5d2
MD5 b73be2f2cc67c473d5c7d323a9004a67
BLAKE2b-256 a10421ed7cc46e912d72c004f8ff75b06b071a2cd09f5d5c684e7b8f1fe4dbfe

File details

Details for the file spectral_trust-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: spectral_trust-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 18.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.11.14

File hashes

Hashes for spectral_trust-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 c9d7ee9e77b9870e4d58157c7937ab84b7c13373832cb124729bb3d1fda6df8e
MD5 90cb3645003fd2974e89e01b2c58ffd7
BLAKE2b-256 4eed466e689aa8f1401372a1de2999a822b8f5f51b303bf401c97b4bd05fe503
