NthLayer - The Missing Layer of Reliability
NthLayer
Reliability as code. Pure compiler.
Define reliability requirements in a manifest. Generate dashboards, alerts, SLOs, and documentation, deterministically, every time.
TL;DR
pip install nthlayer
nthlayer init
nthlayer apply service.yaml
The Problem
Reliability decisions happen too late. Teams set SLOs in isolation, deploy without checking error budgets, and discover missing metrics during incidents. Dashboards are inconsistent. Alerts are copy-pasted. Nobody validates whether a 99.99% target is even achievable given dependencies.
The Solution
NthLayer is a pure compiler for reliability infrastructure. Write a manifest, get artifacts:
service.yaml -> validate -> apply
                   |          |
                   |          +-- Grafana dashboards, Prometheus alerts,
                   |              recording rules, SLOs, PagerDuty config,
                   |              Backstage entities, service docs
                   |
                   +-- SLO feasible? Dependencies support it? Metrics exist?
                       Policies pass? Ceiling valid?
NthLayer generates. The nthlayer-workers runtime (Tier 2) enforces, observes, and responds at runtime, with state held in nthlayer-core (Tier 1) and operator interaction via nthlayer-bench (Tier 3).
Core Features
Artifact Generation
Generate dashboards, alerts, and SLOs from a single spec.
$ nthlayer apply service.yaml
Generated:
✓ dashboard.json (Grafana)
✓ alerts.yaml (Prometheus)
✓ recording-rules.yaml (Prometheus)
✓ slos.yaml (OpenSLO)
✓ backstage.json (Backstage entity)
Dependency-Aware SLO Validation
Your SLO ceiling is your weakest dependency chain. NthLayer calculates it.
$ nthlayer validate-slo payment-api
Target: 99.99% availability
Dependencies:
- postgresql (99.95%)
- redis (99.99%)
- user-service (99.9%)
Serial availability: 99.84%
✗ INFEASIBLE: Target exceeds dependency ceiling by 0.15%
Recommendation: Reduce target to 99.8% or improve user-service SLO
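The ceiling in the output above can be reproduced by hand. The following is a minimal sketch, assuming independent failures and a purely serial chain in which every dependency must be up for the service to be up; the function name is illustrative and is not NthLayer's internal API.

```python
from math import prod


def serial_availability(dependency_slos: list[float]) -> float:
    """Upper bound on availability when all dependencies are required at once.

    Assumption: failures are independent and the chain is purely serial.
    """
    return prod(dependency_slos)


# Figures from the validate-slo output above, expressed as fractions.
deps = {"postgresql": 0.9995, "redis": 0.9999, "user-service": 0.999}
target = 0.9999

ceiling = serial_availability(list(deps.values()))
feasible = target <= ceiling
gap_points = (target - ceiling) * 100  # gap in percentage points

print(f"Serial availability: {ceiling:.2%}")
print(f"Feasible: {feasible}, gap: {gap_points:.2f} points")
```

With these inputs the product is roughly 0.9984, which is why a 99.99% target cannot be met without improving the weakest dependency.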
Metric Recommendations
Enforce OpenTelemetry conventions. Know what's missing before production.
$ nthlayer recommend-metrics payment-api
Required (SLO-critical):
✓ http.server.request.duration FOUND
✗ http.server.active_requests MISSING
Run with --show-code for instrumentation examples.
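Conceptually, the report above is a comparison between the metrics an SLO needs and the metrics a service actually exposes. A rough sketch of that comparison follows; the `discovered` set and the `metric_report` helper are hypothetical stand-ins, not how NthLayer discovers metrics.

```python
# Metric names the SLOs depend on (taken from the output above).
required = ["http.server.request.duration", "http.server.active_requests"]

# Hypothetical: names observed from the running service.
discovered = {"http.server.request.duration", "process.cpu.time"}


def metric_report(required: list[str], discovered: set[str]) -> dict[str, str]:
    """Label each required metric as FOUND or MISSING."""
    return {
        name: ("FOUND" if name in discovered else "MISSING")
        for name in required
    }


report = metric_report(required, discovered)
for name, status in report.items():
    print(f"{name:<32} {status}")

missing = [name for name, status in report.items() if status == "MISSING"]
```

A non-empty `missing` list is the signal to add instrumentation before the service ships.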
Monte Carlo SLO Simulation
Model failure scenarios before they happen.
$ nthlayer simulate service.yaml --scenarios 10000
Monte Carlo Simulation (10,000 runs)
SLO: availability ≥ 99.9%
Result: 94.2% of scenarios meet target
P50 availability: 99.95%
P99 availability: 99.82%
Risk: 5.8% chance of SLO breach in 30d window
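To make the idea concrete, here is a toy Monte Carlo model. It is only a sketch under loud assumptions: each dependency's downtime fraction per window is drawn from an exponential distribution whose mean is that dependency's allowed downtime, and failures are independent. NthLayer's actual failure model is not described here and may differ; the function and key names are illustrative.

```python
import random


def simulate(dep_slos, target, runs=10_000, seed=42):
    """Estimate how often a serial dependency chain meets an availability target."""
    rng = random.Random(seed)  # fixed seed keeps the run reproducible
    results = []
    for _ in range(runs):
        availability = 1.0
        for slo in dep_slos:
            # Assumption: downtime fraction ~ Exponential(mean = 1 - slo),
            # capped at 1.0 so availability never goes negative.
            downtime = min(rng.expovariate(1.0 / (1.0 - slo)), 1.0)
            availability *= 1.0 - downtime
        results.append(availability)

    results.sort()
    return {
        "met_fraction": sum(a >= target for a in results) / runs,
        "p50": results[runs // 2],
        "worst_1pct": results[runs // 100],  # 1st percentile from the bottom
    }


result = simulate([0.9995, 0.9999, 0.999], target=0.998)
print(f"{result['met_fraction']:.1%} of runs meet the target")
print(f"median availability: {result['p50']:.3%}")
```

The exact numbers depend entirely on the chosen distribution, so treat the output as a way to compare scenarios rather than as a forecast.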
Topology Export
Export dependency graphs for correlation engines.
$ nthlayer topology export service.yaml --format json
$ nthlayer topology export service.yaml --format mermaid
$ nthlayer topology export service.yaml --format dot
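For readers unfamiliar with the target formats, the sketch below renders a service and its dependencies as Mermaid and DOT text. The helper names and the exact layout are assumptions for illustration; the output of `nthlayer topology export` may be structured differently.

```python
def to_mermaid(service: str, dependencies: list[str]) -> str:
    """Render a one-level dependency graph as a Mermaid flowchart."""

    def node(name: str) -> str:
        # Use a hyphen-free node id and keep the real name as the label.
        return f'{name.replace("-", "_")}["{name}"]'

    lines = ["graph TD"]
    lines += [f"    {node(service)} --> {node(dep)}" for dep in dependencies]
    return "\n".join(lines)


def to_dot(service: str, dependencies: list[str]) -> str:
    """Render the same graph in Graphviz DOT syntax."""
    edges = [f'    "{service}" -> "{dep}";' for dep in dependencies]
    return "\n".join(["digraph topology {", *edges, "}"])


print(to_mermaid("payment-api", ["postgresql", "redis"]))
print(to_dot("payment-api", ["postgresql", "redis"]))
```

Both formats describe the same edges, so the choice comes down to where the graph will be rendered or consumed.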
Policy Validation
Enforce organizational standards at build time.
$ nthlayer validate service.yaml --policies policies.yaml
- required_fields: ownership.runbook present
- tier_constraint: critical services require deployment gates
- dependency_rule: all critical deps have SLOs
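The checks listed above are simple predicates over the manifest. As an illustration only, two of them might look like the following; the manifest field names (`ownership.runbook`, `deployment.gates`) and both helper functions are hypothetical and do not reflect NthLayer's policy schema.

```python
# Hypothetical manifest, already parsed into a dict.
manifest = {
    "name": "payment-api",
    "tier": "critical",
    "ownership": {"runbook": "https://example.com/runbooks/payment-api"},
    "deployment": {"gates": []},
}


def has_field(manifest: dict, dotted_path: str) -> bool:
    """True if every key along a dotted path exists and the value is non-empty."""
    node = manifest
    for key in dotted_path.split("."):
        if not isinstance(node, dict) or key not in node:
            return False
        node = node[key]
    return bool(node)


def critical_has_gates(manifest: dict) -> bool:
    """Critical services must declare at least one deployment gate."""
    if manifest.get("tier") != "critical":
        return True
    return bool(manifest.get("deployment", {}).get("gates"))


results = {
    "required_fields": has_field(manifest, "ownership.runbook"),
    "tier_constraint": critical_has_gates(manifest),
}
for rule, passed in results.items():
    print(f"{rule}: {'pass' if passed else 'fail'}")
```

In this example the runbook check passes and the tier constraint fails, because the critical service declares no gates.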
Quick Start
# Install
pip install nthlayer
# Create a service spec
nthlayer init
# Validate and generate
nthlayer apply service.yaml
Minimal service.yaml
name: payment-api
tier: critical
type: api
team: payments
dependencies:
- postgresql
- redis
NthLayer also supports the OpenSRM format (apiVersion: opensrm/v1) for contracts, deployment gates, and more. See full spec reference for all options.
CI/CD Integration
# GitHub Actions
- name: Validate reliability
  run: |
    nthlayer validate service.yaml
    nthlayer validate-slo service.yaml
    nthlayer apply service.yaml --output-dir generated/
For runtime enforcement (deployment gates, drift detection, error budget checks), use nthlayer-workers, the runtime tier:
- name: Gate deployment
  run: |
    nthlayer-workers gate --service payment-api
The runtime tier reads SLOs and dependency declarations from the same OpenSRM manifests this generator consumes. Verdicts and assessments flow through nthlayer-core's HTTP API.
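For intuition about what an error budget check involves, here is a generic sketch of the arithmetic. It is not the nthlayer-workers implementation: the function names, the event-count inputs, and the 10% threshold are all assumptions made for illustration.

```python
def error_budget_remaining(target: float, good: int, total: int) -> float:
    """Fraction of the error budget still unspent (negative if overspent)."""
    allowed_bad = (1.0 - target) * total
    actual_bad = total - good
    if allowed_bad == 0:
        return 1.0 if actual_bad == 0 else 0.0
    return 1.0 - actual_bad / allowed_bad


def allow_deploy(target: float, good: int, total: int,
                 min_remaining: float = 0.1) -> bool:
    """Hypothetical gate: block deploys once less than 10% of budget is left."""
    return error_budget_remaining(target, good, total) >= min_remaining


# 99.9% target over 1,000,000 requests allows about 1,000 failures.
healthy = allow_deploy(0.999, good=999_400, total=1_000_000)    # 600 failures
exhausted = allow_deploy(0.999, good=998_950, total=1_000_000)  # 1,050 failures
print(healthy, exhausted)
```

A real gate would read the target from the manifest and the counts from the metrics backend; the comparison itself stays this small.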
Works with: GitHub Actions, GitLab CI, ArgoCD, Tekton, Jenkins
How It's Different
| Traditional Approach | NthLayer |
|---|---|
| Set SLOs in isolation | Validate against dependency chains |
| Manual dashboard creation | Generate from spec |
| Copy-paste alerts | 593+ alert templates, auto-selected |
| Discover missing metrics in incidents | Enforce before deployment |
| "Is this ready?" = opinion | "Is this ready?" = deterministic check |
Documentation
Full Documentation - Comprehensive guides and reference.
| Guide | Description |
|---|---|
| Quick Start | Get running in 5 minutes |
| Dependency Discovery | Automatic dependency mapping |
| CI/CD Integration | Pipeline setup |
| CLI Reference | All commands |
Roadmap
Generate (this repo)
- Artifact generation (dashboards, alerts, SLOs, recording rules, Loki alerts)
- Dependency-aware SLO validation
- Metric recommendations (OpenTelemetry conventions)
- Monte Carlo SLO simulation
- Policy validation (build-time)
- Topology export (JSON, Mermaid, DOT)
- OpenSRM manifest format (opensrm/v1)
- Identity resolution & ownership
- Backstage entity generation
- Service documentation generation
- CI/CD GitHub Action
- Agentic inference (nthlayer infer)
- MCP server integration
- Backstage plugin
Runtime tier (nthlayer-workers)
What was previously the standalone nthlayer-observe repo plus four agentic components is now consolidated into a single Tier-2 worker process with five modules:
- observe: SLO collection, drift detection, dependency/topology discovery, deploy gate
- measure: judgment SLO evaluation, governance ratchet, autonomy-level reduction
- correlate: session-window event correlation, topology drift, contract divergence
- respond: incident response coordinator (situation-shaped triggers, capture-at-write-time escalation)
- learn: outcome resolution, calibration signals, retrospective generation
Backed by nthlayer-core (Tier 1: HTTP API, verdict store, case management, manifest catalogue) and operated via nthlayer-bench (Tier 3: Textual TUI for SREs).
Agentic Inference (Planned)
nthlayer infer will use a model to analyse a codebase and propose an OpenSRM manifest for it. The model examines the code, identifies services, infers appropriate SLO targets, and generates a draft service.reliability.yaml that NthLayer then validates and generates artifacts from.
This follows the Zero Framework Cognition boundary applied across the OpenSRM ecosystem: the model provides judgment (what SLOs does this service need?), and NthLayer provides transport (validate the manifest, generate the monitoring artifacts). Clean boundary between reasoning and deterministic transformation. Architectural context: opensrm/docs/superpowers/.
OpenSRM Ecosystem
NthLayer is one piece of a six-repo ecosystem. The architecture has three runtime tiers; this repo (nthlayer-generate) sits outside the runtime tiers as a build-time compiler, feeding manifests forward.
              +--------------------------+
              |     OpenSRM Manifest     |
              |  (the shared contract)   |
              +------------+-------------+
                           |
          +----------------+----------------+
          v                                 v
+-------------------+              +-----------------+
| nthlayer-generate |              | nthlayer-core   |
| (build-time)      |              | (Tier 1)        |
|                   |              | HTTP API ·      |
| specs -> Grafana, |              | verdict store · |
| Prometheus, SLOs, |              | case mgmt ·     |
| Backstage, docs   |              | manifests       |
+---------+---------+              +--------^--------+
          |                                 | HTTP only
          | deployed              +---------+-------------+
          v                       |                       |
+-------------------+   +---------+---------+   +---------+---------+
| Live infra        |   | nthlayer-workers  |   | nthlayer-bench    |
| (Prometheus,      |obs| (Tier 2)          |   | (Tier 3)          |
| Grafana, etc.)    |---|                   |   | Textual TUI for   |
+-------------------+   | observe·measure   |   | SREs: situation   |
                        | correlate·respond |   | board, case       |
                        | ·learn            |   | bench, approvals  |
                        +-------------------+   +-------------------+
Learning loop:
workers.learn retrospectives -> manifest updates -> nthlayer-generate
regenerates -> workers refine thresholds -> operators ratify in bench
How nthlayer-generate fits in:
- Reads OpenSRM manifests and emits the monitoring infrastructure (Prometheus rules, Grafana dashboards, recording rules, Backstage entities, service docs) that the runtime tier and live observability stack rely on
- Pure compiler: deterministic, stateless, no LLM, no runtime side effects
- Verdicts and assessments produced by nthlayer-workers modules emit OTel side-effects (gen_ai.decision.*, gen_ai.override.*) that flow into Prometheus; this generator can be configured to produce dashboards for those metrics alongside service dashboards
- Exports service topology that workers.correlate uses for topology-aware signal correlation
- Post-incident retrospectives produced by workers.learn feed back into manifest updates that regenerate via this compiler, closing the loop
Each component works alone. Someone who just needs reliability-as-code adopts nthlayer-generate without needing the rest of the ecosystem.
| Repo | Role |
|---|---|
| opensrm | The OpenSRM specification: the manifest format and language for declaring reliability |
| nthlayer | Project front door: documentation hub, GitHub Action delegating to this repo, docs site |
| nthlayer-common | Shared library: verdict model, manifest parser, LLM wrapper, error hierarchy, CoreAPIClient |
| nthlayer-generate | The deterministic compiler (this repo): specs to artefacts |
| nthlayer-core | Tier 1: HTTP API server, verdict store, case management, manifest catalogue (pip install nthlayer) |
| nthlayer-workers | Tier 2: five worker modules (observe, measure, correlate, respond, learn) |
| nthlayer-bench | Tier 3: Textual TUI for SREs |
Contributing
# Install uv (https://docs.astral.sh/uv/)
curl -LsSf https://astral.sh/uv/install.sh | sh
git clone https://github.com/rsionnach/nthlayer-generate.git
cd nthlayer-generate
make setup # Install deps, start services
make test # Run tests
See CONTRIBUTING.md for details.
License
MIT - See LICENSE.txt
Acknowledgments
Built on grafana-foundation-sdk, awesome-prometheus-alerts, pint, and OpenSLO. Inspired by Sloth and autograf.