Causal inference engine for deep learning training dynamics

NeuralDBG

A causal inference engine for deep learning training that provides structured explanations of neural network training failures. Understand why your model failed during training through semantic analysis and abductive reasoning, not raw tensor inspection.

Overview

NeuralDBG treats training as a semantic trace of learning dynamics rather than a black box. It extracts meaningful events and provides causal hypotheses about training failures, enabling researchers to:

Identify gradient health transitions (stable -> vanishing/exploding/saturated)
Detect activation regime shifts (normal -> saturated/dead)
Detect optimizer instability (loss plateaus, spikes, divergence)
Catch data anomalies (NaN, Inf, distribution shifts)
Track propagation of instabilities through network layers
Generate ranked causal explanations for training failures

Unlike traditional monitoring tools (TensorBoard, Weights & Biases), NeuralDBG focuses on causal inference rather than metric tracking.

Key Features

Semantic Event Extraction: Detects meaningful transitions in training dynamics
Causal Compression: Identifies first occurrences and propagation patterns
Post-Mortem Reasoning: Provides ranked hypotheses about failure causes
Optimizer Instability Detection: Tracks loss plateaus, spikes, and divergence
Data Anomaly Detection: Catches NaN, Inf, and distribution shifts in inputs
Event Collapsing: Merges sequential events into summary traces
Compiler-Aware: Operates at module boundaries to survive torch.compile
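Non-Invasive: Wraps an existing PyTorch training loop in a context manager and leaves the loop body unchanged (only optimizer instability detection needs an extra record_loss() call)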
Minimal API: Focused on explanations, not raw data dumps

Quick Start

Installation

pip install neuraldbg

Contributor Onboarding

For a new collaborator, run:

make bootstrap

This one-command setup:

verifies or recreates .venv
installs runtime, development, and MLflow/MLOps dependencies
activates the repository git hooks
installs the project in editable mode

Then activate the environment:

source .venv/bin/activate

Validation sync is intentionally opt-in because it depends on VALIDATION_BUNDLE_TOKEN and rewrites protected local files:

bash scripts/bootstrap.sh --with-validation-sync

Docker Development (Hermetic Workspace)

Use Docker to keep a reproducible local environment across machines and contributors.

# Build image
docker-compose build

# Start the dev container (one-command startup)
docker-compose up -d

# Open a shell in the running workspace
docker-compose exec neuraldbg-dev bash

Equivalent shortcuts via Makefile:

make build
make up
make shell

Run tests inside Docker:

docker-compose run --rm neuraldbg-dev bash -lc "pytest"

Or:

make test-docker

Persistent volumes are mounted to:

/data (host: ./data)
/models (host: ./models)
/outputs (host: ./outputs)

Stop containers:

docker-compose down

Basic Usage

import torch
import torch.nn as nn
from neuraldbg import NeuralDbg

# Your existing model and training setup
model = nn.Sequential(nn.Linear(10, 5), nn.ReLU(), nn.Linear(5, 1))
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
criterion = nn.MSELoss()

# Wrap your training loop; `dataloader` is any iterable of (inputs, targets) batches
with NeuralDbg(model) as dbg:
    for step, (inputs, targets) in enumerate(dataloader):
        optimizer.zero_grad()

        outputs = model(inputs)
        loss = criterion(outputs, targets)
        loss.backward()
        optimizer.step()

        # Events are extracted automatically

# After training failure, query for explanations
explanations = dbg.explain_failure()
print(explanations[0])
# Example output: "Gradient vanishing originated in layer 'linear1' at step 234, likely due to LR × activation mismatch (confidence: 0.87)"

Inference API

# Get ranked causal hypotheses for the failure
hypotheses = dbg.get_causal_hypotheses()

# Query specific causal chains
chain = dbg.trace_causal_chain('vanishing_gradients')

# Check for coupled failures
couplings = dbg.detect_coupled_failures()
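
The return types of these calls are not described in this README. As a rough illustration of what "ranked causal hypotheses" could look like, the sketch below sorts hypothesis records by confidence. The `Hypothesis` class and the `rank_hypotheses` helper are hypothetical names introduced for this example, not part of NeuralDBG; only a `description` attribute and a confidence score appear in the examples here, and the [0, 1] range is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hypothesis:
    """Hypothetical hypothesis record (not NeuralDBG's actual class)."""
    description: str
    confidence: float  # assumed to lie in [0, 1]


def rank_hypotheses(hypotheses, min_confidence=0.0):
    """Return hypotheses at or above min_confidence, most confident first."""
    kept = [h for h in hypotheses if h.confidence >= min_confidence]
    return sorted(kept, key=lambda h: h.confidence, reverse=True)


candidates = [
    Hypothesis("Dead neurons after the ReLU layer", 0.41),
    Hypothesis("Vanishing gradients originating in the first linear layer", 0.87),
    Hypothesis("Loss plateau caused by a low learning rate", 0.12),
]
ranked = rank_hypotheses(candidates, min_confidence=0.2)
for h in ranked:
    print(f"{h.confidence:.2f}  {h.description}")
```

Filtering before sorting keeps low-confidence guesses out of the report, which fits the stated goal of explanations rather than raw data dumps.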

Optimizer Instability Detection

with NeuralDbg(model) as dbg:
    for step in range(num_steps):
        dbg.step = step
        optimizer.zero_grad()  # clear gradients left over from the previous step
        output = model(inputs)
        loss = criterion(output, targets)
        loss.backward()

        # Feed loss values for optimizer instability detection
        dbg.record_loss(loss.item())

        optimizer.step()

# Detect loss plateaus, spikes, or divergence
hypotheses = dbg.explain_failure("optimizer_instability")
for h in hypotheses:
    print(h.description)  # "Loss spike detected at step 50..."
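
How NeuralDBG decides between a plateau, a spike, and divergence is not specified here. The following is a minimal sketch of one way such a classifier could work on a recorded loss history, using only the standard library. The function name `classify_loss_history` and every threshold (`window`, `spike_factor`, `plateau_tol`) are assumptions chosen for illustration, not the library's defaults.

```python
import math
from statistics import mean


def classify_loss_history(losses, window=5, spike_factor=3.0, plateau_tol=1e-3):
    """Label a loss history as 'divergence', 'spike', 'plateau', or 'healthy'.

    All thresholds are illustrative assumptions.
    """
    # Divergence (case 1): the loss became NaN or infinite.
    if any(not math.isfinite(x) for x in losses):
        return "divergence"
    if len(losses) <= window:
        return "healthy"  # not enough history to judge

    # Divergence (case 2): the last `window` losses rise strictly and end
    # above the very first recorded loss.
    recent = losses[-window:]
    if all(b > a for a, b in zip(recent, recent[1:])) and recent[-1] > losses[0]:
        return "divergence"

    # Spike: a single loss far above the mean of the preceding window.
    for i in range(window, len(losses)):
        baseline = mean(losses[i - window:i])
        if losses[i] > spike_factor * baseline:
            return "spike"

    # Plateau: the recent window barely moves relative to its own level.
    if max(recent) - min(recent) <= plateau_tol * max(abs(mean(recent)), 1e-12):
        return "plateau"
    return "healthy"


print(classify_loss_history([1.0, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4]))      # healthy
print(classify_loss_history([1.0, 0.9, 0.8, 0.7, 0.6, 5.0, 0.5]))      # spike
print(classify_loss_history([1.0, 0.5, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3])) # plateau
print(classify_loss_history([0.5, 0.6, 0.8, 1.1, 1.6, 2.4]))           # divergence
```

Checking divergence before spikes matters in this sketch: a steadily exploding loss would otherwise be reported as a spike at the first step that crosses the threshold.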

Data Anomaly Detection

Data anomalies (NaN, Inf, distribution shifts) are detected automatically from layer inputs during the forward pass -- no extra API call needed:

with NeuralDbg(model) as dbg:
    # ... training loop ...
    pass

# Check for data issues
hypotheses = dbg.explain_failure("data_anomaly")
for h in hypotheses:
    print(h.description)  # "NaN values detected in input to layer 'linear1'..."
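
The detection rules themselves are not documented here. As a sketch of the three checks named above, the snippet below flags NaN, Inf, and a mean shift in a batch of scalar values. It works on plain Python floats rather than tensors, and `find_data_anomalies`, the `reference` sample, and the z-score threshold are all assumptions made for illustration.

```python
import math
from statistics import mean, pstdev


def find_data_anomalies(batch, reference, shift_threshold=3.0):
    """Return a list of anomaly labels for one batch of scalar values.

    `reference` is a sample of earlier, trusted values. The z-score rule
    and its threshold are illustrative assumptions.
    """
    anomalies = []
    if any(math.isnan(x) for x in batch):
        anomalies.append("nan")
    if any(math.isinf(x) for x in batch):
        anomalies.append("inf")

    # Compare only finite values so NaN/Inf do not poison the statistics.
    finite = [x for x in batch if math.isfinite(x)]
    ref_std = pstdev(reference)
    if finite and ref_std > 0:
        # Shift: batch mean lies many reference standard deviations away.
        z = abs(mean(finite) - mean(reference)) / ref_std
        if z > shift_threshold:
            anomalies.append("distribution_shift")
    return anomalies


reference = [0.1, -0.2, 0.0, 0.3, -0.1, 0.2]
print(find_data_anomalies([0.0, 0.1, -0.1], reference))
print(find_data_anomalies([0.0, float("nan"), 0.2], reference))
print(find_data_anomalies([5.0, 5.2, 4.9], reference))
```

A mean-shift test is only one possible definition of "distribution shift"; variance changes or tail behavior would need additional checks.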

Event Collapsing

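Compress sequential events in the same layer into summary traces. The example calls `_collapse_events()`; by Python convention the leading underscore marks it as an internal method, so treat it as such: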

# Get compressed event timeline
collapsed = dbg._collapse_events()
print(f"{len(dbg.events)} raw events -> {len(collapsed)} collapsed")
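
To make the idea concrete, here is a small sketch of collapsing under the assumption that "sequential" means consecutive events of the same kind in the same layer. The `Event` class, its field names, and the tuple layout of the result are hypothetical and do not reflect the structure `_collapse_events()` actually returns.

```python
from dataclasses import dataclass


@dataclass
class Event:
    """Hypothetical raw event; field names are assumptions."""
    layer: str
    kind: str
    step: int


def collapse_events(events):
    """Merge consecutive events that share layer and kind.

    Returns (layer, kind, first_step, last_step, count) tuples.
    """
    collapsed = []
    for ev in sorted(events, key=lambda e: e.step):  # stable sort by step
        if collapsed and collapsed[-1][0] == ev.layer and collapsed[-1][1] == ev.kind:
            layer, kind, first, _, count = collapsed[-1]
            collapsed[-1] = (layer, kind, first, ev.step, count + 1)
        else:
            collapsed.append((ev.layer, ev.kind, ev.step, ev.step, 1))
    return collapsed


raw = [
    Event("fc1", "vanishing", 10),
    Event("fc1", "vanishing", 11),
    Event("fc1", "vanishing", 12),
    Event("fc2", "dead", 12),
    Event("fc1", "vanishing", 13),
]
summary = collapse_events(raw)
print(f"{len(raw)} raw events -> {len(summary)} collapsed")
for row in summary:
    print(row)
```

In this sketch an interleaved event from another layer ends the current run, so the timeline stays ordered at the cost of a slightly longer summary.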

Architecture

Core Components

Semantic Event Extractor: Detects meaningful transitions in learning dynamics
Causal Compressor: Identifies patterns and propagation in training failures
Post-Mortem Reasoner: Generates ranked hypotheses about failure causes
Compiler-Aware Monitor: Operates at safe boundaries for optimization compatibility
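
As a sketch of the "first occurrences and propagation" idea behind the Causal Compressor, the snippet below records when each layer first shows a given failure and orders layers by that step. The `(layer, kind, step)` tuples and both function names are hypothetical; the real component may use richer evidence than timing alone.

```python
def first_occurrences(events, kind):
    """Map each layer to the first step at which `kind` was observed."""
    first = {}
    for layer, event_kind, step in events:
        if event_kind == kind and (layer not in first or step < first[layer]):
            first[layer] = step
    return first


def propagation_chain(events, kind):
    """Order layers by first occurrence; the earliest is the candidate origin."""
    first = first_occurrences(events, kind)
    return sorted(first.items(), key=lambda item: item[1])


events = [
    ("layer3", "vanishing_gradients", 234),
    ("layer2", "vanishing_gradients", 240),
    ("layer3", "vanishing_gradients", 235),
    ("layer1", "vanishing_gradients", 251),
    ("layer2", "dead_neurons", 100),
]
chain = propagation_chain(events, "vanishing_gradients")
print(chain)
```

Temporal order is a heuristic: the layer that fails first is a plausible origin, not a proven cause, which is why the output is framed as hypotheses.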

Event Types

Event Type	Source	Detects
`gradient_health_transition`	Backward hooks	Vanishing, exploding, saturated gradients
`activation_regime_shift`	Forward hooks	Dead neurons, saturated activations
`optimizer_instability`	`record_loss()`	Loss plateaus, spikes, divergence
`data_anomaly`	Forward hooks (inputs)	NaN, Inf, distribution shifts
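
A "transition" event differs from a per-step metric in that it fires only when the state changes. The sketch below shows this for gradient health, classifying a per-step gradient norm and emitting an entry only on a change of state. The thresholds and function names are assumptions for illustration, and the saturated state is left out for brevity.

```python
import math


def gradient_health(grad_norm, vanish_below=1e-7, explode_above=1e3):
    """Classify a gradient norm; thresholds are illustrative assumptions."""
    if not math.isfinite(grad_norm) or grad_norm > explode_above:
        return "exploding"
    if grad_norm < vanish_below:
        return "vanishing"
    return "stable"


def health_transitions(norms):
    """Return (step, old_state, new_state) for every change of state."""
    transitions = []
    previous = None
    for step, norm in enumerate(norms):
        state = gradient_health(norm)
        if previous is not None and state != previous:
            transitions.append((step, previous, state))
        previous = state
    return transitions


norms = [0.5, 0.1, 1e-3, 1e-8, 1e-9, 0.2, 5e4]
for step, old, new in health_transitions(norms):
    print(f"step {step}: {old} -> {new}")
```

Seven norms produce three transitions here, which is the kind of reduction that separates a semantic trace from a metric log.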

Event Structure

Each semantic event records:

Transition type (gradient_health, activation_regime, optimizer_instability, data_anomaly)
Layer/parameter identifier
Step range of occurrence
Confidence score
Causal metadata (propagation patterns, coupled failures)
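
The fields above map naturally onto a small record type. The following dataclass is a hypothetical container written for illustration; the class name, attribute names, and the assumption that confidence lies in [0, 1] are not taken from NeuralDBG's source.

```python
from dataclasses import asdict, dataclass, field


@dataclass
class SemanticEvent:
    """Hypothetical container mirroring the fields listed above."""
    transition_type: str   # e.g. "gradient_health"
    layer: str             # layer or parameter identifier
    step_start: int        # first step of the occurrence
    step_end: int          # last step of the occurrence (inclusive)
    confidence: float      # assumed to lie in [0, 1]
    causal_metadata: dict = field(default_factory=dict)

    def __post_init__(self):
        # Reject events that could not describe a real occurrence.
        if self.step_end < self.step_start:
            raise ValueError("step_end must be >= step_start")
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be in [0, 1]")


event = SemanticEvent(
    transition_type="gradient_health",
    layer="linear1",
    step_start=234,
    step_end=240,
    confidence=0.87,
    causal_metadata={"coupled_with": ["activation_regime"]},
)
print(asdict(event))
```

Validating in `__post_init__` keeps malformed events out of any later reasoning step instead of surfacing them as confusing explanations.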

Target Users

ML Researchers seeking causal explanations for training failures
PhD Students analyzing learning dynamics in novel architectures
Research Engineers understanding optimization instabilities

Not intended for production monitoring, metric tracking, or no-code users.

Supported Failure Types

vanishing_gradients -- Root cause + saturation coupling
exploding_gradients -- First layer to explode
dead_neurons -- Neuron death in activation layers
saturated_activations -- Activation saturation patterns
optimizer_instability -- Loss plateaus, spikes, divergence (with gradient cross-reference)
data_anomaly -- NaN/Inf/distribution shift in inputs

Limitations (MVP Scope)

PyTorch only
Reports semantic events only; raw tensor inspection is out of scope
Command-line interface only
Monitors at module boundaries only (the trade-off that keeps it compatible with torch.compile)

Contributing

This is an MVP focused on proving the concept of causal inference for training dynamics. Contributions should align with the core mission of providing structured explanations for training failures.

Fork the repository
Create a feature branch
Add tests for new functionality
Ensure all tests pass
Submit a pull request

License

MIT License - see LICENSE.md for details.

Documentation

CHANGELOG.md - Version history and notable changes
logic_graph.md - System architecture and data flow

Citation

If you use NeuralDBG in your research, please cite:

@misc{neuraldbg2025,
  title={NeuralDBG: A Causal Inference Engine for Deep Learning Training Dynamics},
  author={SENOUVO Jacques-Charles Gad},
  year={2025},
  url={https://github.com/Lemniscate-world/Neural}
}

Project details

This version

1.3.0

May 17, 2026

Download files

Source Distribution

neuraldbg-1.3.0.tar.gz (20.1 kB view details)

Uploaded May 17, 2026 Source

Built Distribution

neuraldbg-1.3.0-py3-none-any.whl (16.8 kB view details)

Uploaded May 17, 2026 Python 3

File details

Details for the file neuraldbg-1.3.0.tar.gz.

File metadata

Download URL: neuraldbg-1.3.0.tar.gz
Upload date: May 17, 2026
Size: 20.1 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for neuraldbg-1.3.0.tar.gz
Algorithm	Hash digest
SHA256	`a7b030572b146e5b455117c6bb5d2d7575b94a93c729bb0eeaecb13bb7d1e367`
MD5	`2940b5ec20989e75c42531b348711eca`
BLAKE2b-256	`b2f8acd15de1f2a5204b93045572ae44ca9fed680fdf0a36b0747dc8a065dc21`
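
A downloaded archive can be checked against the SHA256 digest listed above with Python's standard `hashlib` module. This is a minimal sketch: the helper names are hypothetical, and the file path assumes the archive was saved in the current directory under its published name.

```python
import hashlib
from pathlib import Path

# SHA256 digest published above for neuraldbg-1.3.0.tar.gz
EXPECTED_SHA256 = "a7b030572b146e5b455117c6bb5d2d7575b94a93c729bb0eeaecb13bb7d1e367"


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large archives need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Return True when the file's SHA256 equals the expected hex digest."""
    return sha256_of_file(path) == expected_hex.lower()


archive = Path("neuraldbg-1.3.0.tar.gz")  # assumed download location
if archive.exists():
    print(matches_published_hash(archive, EXPECTED_SHA256))
```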

Provenance

The following attestation bundles were made for neuraldbg-1.3.0.tar.gz:

Publisher: publish.yml on LambdaSection/NeuralDBG

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: neuraldbg-1.3.0.tar.gz
- Subject digest: a7b030572b146e5b455117c6bb5d2d7575b94a93c729bb0eeaecb13bb7d1e367
- Sigstore transparency entry: 1560495368
- Sigstore integration time: May 17, 2026
Source repository:
- Permalink: LambdaSection/NeuralDBG@763bcbcdd20f44baa98152535e56dedabd8882b7
- Branch / Tag: refs/tags/v1.3.0
- Owner: https://github.com/LambdaSection
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@763bcbcdd20f44baa98152535e56dedabd8882b7
- Trigger Event: release

File details

Details for the file neuraldbg-1.3.0-py3-none-any.whl.

File metadata

Download URL: neuraldbg-1.3.0-py3-none-any.whl
Upload date: May 17, 2026
Size: 16.8 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for neuraldbg-1.3.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`b9311594349e277c6546c8745efa1037224223d5699c73c2304770a725647af2`
MD5	`4e45cbd06b67bc33b9b3a88224840e7f`
BLAKE2b-256	`ab776e32fbbe316c28d25dc856205fbf3799882144c0f425cbfb326e9c5b9edf`

Provenance

The following attestation bundles were made for neuraldbg-1.3.0-py3-none-any.whl:

Publisher: publish.yml on LambdaSection/NeuralDBG

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: neuraldbg-1.3.0-py3-none-any.whl
- Subject digest: b9311594349e277c6546c8745efa1037224223d5699c73c2304770a725647af2
- Sigstore transparency entry: 1560495504
- Sigstore integration time: May 17, 2026
Source repository:
- Permalink: LambdaSection/NeuralDBG@763bcbcdd20f44baa98152535e56dedabd8882b7
- Branch / Tag: refs/tags/v1.3.0
- Owner: https://github.com/LambdaSection
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@763bcbcdd20f44baa98152535e56dedabd8882b7
- Trigger Event: release
