plone.observability
Kubernetes-style health probes and pluggable metrics for Plone.
Features
- Liveness, readiness, and startup probes on a separate HTTP port
- Pluggable metrics endpoint (@@metrics) with Prometheus and JSON output
- Extensible via ZCA: custom health checks, metric providers, and formatters
Installation
Add plone.observability to your package dependencies:
[project]
dependencies = [
    "plone.observability",
]
Then include it in your ZCML:
<include package="plone.observability" />
Including the ZCML registers the package's components. The health probe server is not started by the ZCML include; it is started by the healthserver WSGI filter, which you add to your pipeline (see Health Probes).
Configuration
All configuration is done via environment variables.
| Variable | Default | Description |
|---|---|---|
| PLONE_OBSERVABILITY_HEALTH_HOST | 0.0.0.0 | Bind address for the health probe server |
| PLONE_OBSERVABILITY_HEALTH_PORT | 8081 | Port for the health probe server. Set to 0 to disable. |
| PLONE_OBSERVABILITY_METRICS_ALLOWLIST | (empty, open) | Comma-separated CIDRs allowed to access @@metrics. Empty means all IPs are allowed. |
| PLONE_OBSERVABILITY_TRUSTED_PROXIES | 127.0.0.1,::1 | Comma-separated CIDRs of trusted reverse proxies for X-Forwarded-For resolution. |
| PLONE_OBSERVABILITY_METRICS_CACHE_TTL | 60 | Seconds to cache content catalog metrics (expensive to collect). |
| PLONE_OBSERVABILITY_ZODB_ACTIVITY_MONITOR | 1 | Install a minimal ZODB activity monitor for load/store counters. Set to 0 to disable. |
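To make the interplay of PLONE_OBSERVABILITY_METRICS_ALLOWLIST and PLONE_OBSERVABILITY_TRUSTED_PROXIES concrete, here is a minimal sketch of CIDR-based access control with X-Forwarded-For resolution. It is not the package's implementation: the function names (parse_cidrs, resolve_client_ip, is_allowed) are hypothetical, and the right-to-left walk over the header is one common resolution strategy assumed here for illustration.

```python
import ipaddress


def parse_cidrs(value):
    """Parse a comma-separated list of CIDRs (or single IPs) into networks."""
    return [
        ipaddress.ip_network(item.strip(), strict=False)
        for item in value.split(",")
        if item.strip()
    ]


def _in_any(ip, networks):
    # Membership is simply False when IP versions differ (v4 vs. v6).
    return any(ip in net for net in networks)


def resolve_client_ip(remote_addr, forwarded_for, trusted_proxies):
    """Honour X-Forwarded-For only when the direct peer is a trusted proxy."""
    peer = ipaddress.ip_address(remote_addr)
    if not forwarded_for or not _in_any(peer, trusted_proxies):
        return peer
    # Malformed header entries raise ValueError in this sketch.
    hops = [ipaddress.ip_address(h.strip()) for h in forwarded_for.split(",") if h.strip()]
    # Walk right to left and stop at the first hop that is not a trusted proxy.
    for hop in reversed(hops):
        if not _in_any(hop, trusted_proxies):
            return hop
    # Every hop was a trusted proxy: fall back to the left-most entry.
    return hops[0] if hops else peer


def is_allowed(client_ip, allowlist):
    """An empty allowlist means every client is allowed."""
    return not allowlist or _in_any(client_ip, allowlist)
```

With the default trusted proxies, a request arriving from 127.0.0.1 with X-Forwarded-For: 10.1.2.3 would be evaluated as 10.1.2.3, while the same header sent by an untrusted peer would be ignored.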
Health Probes
The health server runs on a dedicated port (default 8081) in a background daemon thread, separate from the Zope WSGI server. This means it answers even when all Zope threads are busy.
The health server is started by the egg:plone.observability#healthserver WSGI filter — add it to your pipeline (see WSGI filters below). It is not started on Zope process startup, so zconsole/script runs never touch the health port.
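The reason a dedicated port helps is easiest to see in code. The following is a minimal standard-library sketch of a probe server running in a daemon thread, independent of the main WSGI worker pool. It only illustrates the architecture; ProbeHandler and start_probe_server are hypothetical names, and the 404 body is an assumption, not the package's actual behaviour.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class ProbeHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/live":
            status, body = 200, {"status": "ok", "checks": {}}
        else:
            # Assumed response for unknown paths, for illustration only.
            status, body = 404, {"status": "not found"}
        payload = json.dumps(body).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def log_message(self, format, *args):
        pass  # keep probe traffic out of the log


def start_probe_server(host="127.0.0.1", port=8081):
    """Serve probes from a daemon thread so busy app threads cannot block them."""
    server = ThreadingHTTPServer((host, port), ProbeHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return server
```

Because the thread is a daemon, it does not keep the process alive on shutdown, and because it has its own listener, a saturated application thread pool does not delay probe responses.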
Endpoints
| Path | Purpose |
|---|---|
| /live | Liveness check: is the process alive? |
| /ready | Readiness check: can the process serve requests? |
| /startup | Startup check: has the process finished initializing? |
All endpoints return JSON with a 200 on success or 503 on failure:
{
  "status": "ok",
  "checks": {
    "zodb": {"ok": true, "message": "ZODB connection ok"}
  }
}
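As a rough illustration of how named checks could be folded into the response shape above, consider the sketch below. run_checks is a hypothetical helper, not the package's code, and the "error" status string used for failures is an assumption, since only the success shape is documented here.

```python
import json


def run_checks(checks):
    """Run named checks and return (status_code, json_body).

    `checks` maps a check name to a callable returning (ok, message).
    """
    results = {}
    for name, check in checks.items():
        try:
            ok, message = check()
        except Exception as exc:  # a crashing check counts as a failure
            ok, message = False, f"{type(exc).__name__}: {exc}"
        results[name] = {"ok": bool(ok), "message": message}
    all_ok = all(result["ok"] for result in results.values())
    # "error" is an assumed failure status; only "ok" is documented.
    body = {"status": "ok" if all_ok else "error", "checks": results}
    return (200 if all_ok else 503), json.dumps(body)
```

A single failing or crashing check turns the whole response into a 503, which is what a Kubernetes probe reacts to; the per-check messages are there for the operator reading the body.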
Kubernetes Integration
livenessProbe:
  httpGet:
    path: /live
    port: 8081
  initialDelaySeconds: 10
  periodSeconds: 30
  failureThreshold: 3
readinessProbe:
  httpGet:
    path: /ready
    port: 8081
  initialDelaySeconds: 5
  periodSeconds: 10
  failureThreshold: 3
startupProbe:
  httpGet:
    path: /startup
    port: 8081
  failureThreshold: 30
  periodSeconds: 10
Expose the probe port alongside the main Zope port:
ports:
  - name: http
    containerPort: 8080
  - name: health
    containerPort: 8081
Metrics
The @@metrics endpoint is a browser view registered on the application root (OFS.interfaces.IApplication). It collects metrics from all registered IMetricProvider adapters and serialises them using an IMetricFormatter utility.
Accessing the endpoint
http://your-plone-host/@@metrics
http://your-plone-host/@@metrics?format=json
The default format is Prometheus text. Pass ?format=json or an Accept: application/json header to get JSON.
Built-in metrics
| Metric | Type | Scope | Description |
|---|---|---|---|
| plone_uptime_seconds | gauge | instance | Process uptime |
| plone_info | info | instance | Python, Zope, and Plone version labels |
| plone_threads_active | gauge | instance | Active Python threads |
| plone_process_rss_bytes | gauge | instance | Resident set size |
| plone_process_cpu_seconds | counter | instance | Total CPU time (user + system) |
| plone_requests_total | counter | instance | Total HTTP requests served |
| plone_request_duration_seconds_sum | counter | instance | Cumulative request duration |
| plone_request_duration_seconds_bucket | counter | instance | Request duration histogram buckets |
| plone_request_duration_seconds_max | gauge | instance | Worst-case request duration since the last scrape (the histogram cannot report the true maximum) |
| plone_request_errors | counter | instance | HTTP errors by status code |
| plone_zodb_object_count | gauge | global | Total objects in ZODB |
| plone_zodb_db_size_bytes | gauge | global | ZODB file size |
| plone_zodb_connections | gauge | instance | Open ZODB connections |
| plone_zodb_cache_size | gauge | instance | Objects in the ZODB object cache |
| plone_zodb_cache_size_bytes | gauge | instance | ZODB object cache size in bytes |
| plone_zodb_loads_total | counter | instance | Cumulative objects loaded from storage (storage-agnostic; via the ZODB activity monitor) |
| plone_zodb_stores_total | counter | instance | Cumulative objects stored to storage (storage-agnostic; via the ZODB activity monitor) |
| plone_zodb_conflicts_total | counter | instance | ZODB conflict errors during publish, by retry outcome |
| plone_content_total | gauge | global | Content objects by portal type and site |
| plone_content_by_state | gauge | global | Content objects by workflow state and site |
All plone_request* metrics additionally carry an auth="authenticated"|"anonymous"
label so traffic can be split by authentication state. (User identity is never a
metric label — only a span attribute; see the OpenTelemetry section.)
plone_request_duration_seconds_max is a per-scrape-window gauge: a histogram can
only bound latency to its bucket edges, so the true worst-case request time is
tracked directly and reset on every scrape. This gives operators the real max
backend response time alongside the histogram_quantile-derived p90/p99. Because it
resets on read, scrape it from a single Prometheus target — multiple concurrent
scrapers would each see only part of the window.
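The reset-on-read behaviour can be pictured with a small sketch. WindowMax is a hypothetical class written for illustration, not the package's implementation: it records the largest observed duration and hands it out exactly once per collection.

```python
import threading


class WindowMax:
    """Track the maximum observed value since the last collect()."""

    def __init__(self):
        self._lock = threading.Lock()
        self._max = 0.0

    def observe(self, seconds):
        # Called once per request with its duration.
        with self._lock:
            if seconds > self._max:
                self._max = seconds

    def collect(self):
        # Called once per scrape: return the window maximum and start a new window.
        with self._lock:
            value, self._max = self._max, 0.0
        return value
```

The swap in collect() is why two scrapers interfere with each other: whichever reads first consumes the window, and the second sees only what happened after that read.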
Metric scope
Metrics carry a scope label with value "global" or "instance".
- global — the value is the same across all Plone instances sharing the same ZODB (e.g. object count, content totals). When aggregating in Prometheus, avoid double-counting by filtering to a single instance.
- instance — the value is specific to this process (e.g. request counts, RSS). Sum across instances when aggregating.
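The two aggregation rules can be stated as a short sketch. The aggregate function and its (instance, scope, value) tuples are hypothetical, chosen only to illustrate the rule; in practice this logic lives in your PromQL queries.

```python
def aggregate(samples):
    """Aggregate one metric across instances according to its scope.

    `samples` is an iterable of (instance, scope, value) tuples.
    """
    samples = sorted(samples)  # deterministic choice of the "single instance"
    if not samples:
        return 0
    scopes = {scope for _, scope, _ in samples}
    if scopes == {"instance"}:
        # Per-process values: add them up.
        return sum(value for _, _, value in samples)
    if scopes == {"global"}:
        # Shared-ZODB values: every instance reports the same number, so take one.
        return samples[0][2]
    raise ValueError("a metric should not mix scopes")
```

Summing a global metric across two instances would report twice the real object count, which is the double-counting the scope label is meant to prevent.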
ZODB load/store metrics
plone_zodb_loads_total / plone_zodb_stores_total are produced by a minimal ZODB activity monitor that plone.observability installs into the database's activity-monitor slot on the first metrics scrape. It is storage-agnostic (works on FileStorage, RelStorage, zodb-pgjsonb), cumulative, and O(1) in memory. Because the counters are cumulative, query them with rate(...) over a range, for example rate(plone_zodb_loads_total[5m]). As a "loads per request" smell detector, divide the two rates over the same window, aggregating first so that differing label sets (such as auth) do not prevent the match: sum(rate(plone_zodb_loads_total[5m])) / sum(rate(plone_requests_total[5m])).
It is installed only if no activity monitor is already configured — a pre-existing monitor is never overridden (a warning is logged and the two counters are then unavailable). Disable installation entirely with PLONE_OBSERVABILITY_ZODB_ACTIVITY_MONITOR=0.
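For readers unfamiliar with the activity-monitor slot, the sketch below shows the general shape of such a monitor. It is an illustration, not the package's code: CountingActivityMonitor and install_if_absent are hypothetical names, and the hook names used (closedConnection, getTransferCounts, getActivityMonitor, setActivityMonitor) reflect my understanding of the ZODB API, so verify them against your ZODB version.

```python
import threading


class CountingActivityMonitor:
    """Cumulative load/store counters with O(1) memory."""

    def __init__(self):
        self._lock = threading.Lock()
        self.loads = 0
        self.stores = 0

    def closedConnection(self, connection):
        # Assumed hook: ZODB calls this when a connection is closed.
        # Reading with clear=True resets the per-connection counters,
        # so the same loads and stores are never added twice.
        loads, stores = connection.getTransferCounts(True)
        with self._lock:
            self.loads += loads
            self.stores += stores


def install_if_absent(db, monitor):
    """Install `monitor` unless the database already has one."""
    if db.getActivityMonitor() is not None:
        return False  # never override a pre-existing monitor
    db.setActivityMonitor(monitor)
    return True
```

Keeping only two integers is what makes the approach constant in memory, in contrast to a monitor that stores a time-stamped history of connection activity.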
ZODB conflict metrics
plone_zodb_conflicts_total{retry="true|false"} counts ZODB ConflictErrors raised during request publication, captured via an IPubBeforeAbort subscriber.
- A write conflict means two transactions changed the same object concurrently (write hotspots); a read conflict (ReadConflictError) means an object a transaction required to stay current was changed under it (readCurrent invariants, long transactions). Both are counted.
- retry="true" is a conflict that was retried (usually recovers and is invisible to the user); retry="false" is the final attempt that gave up.
rate(plone_zodb_conflicts_total[5m]) # overall contention
rate(plone_zodb_conflicts_total{retry="false"}[5m]) # conflicts that failed
Content metrics and catalog backends
plone_content_total / plone_content_by_state are produced from the ZCatalog index API and are therefore ZCatalog-only. On other catalog backends (e.g. plone-pgcatalog) the generic provider yields nothing; the backend package ships its own IMetricProvider with the same metric names (see Extensibility).
Prometheus scrape configuration
scrape_configs:
  - job_name: plone
    static_configs:
      - targets: ["plone-host:8080"]
    metrics_path: /@@metrics
PromQL examples
Total requests across all instances:
sum(plone_requests_total{job="plone"})
Request rate per instance (5-minute window):
rate(plone_requests_total{job="plone"}[5m])
ZODB object count (global metric — pick one instance to avoid double-counting):
plone_zodb_object_count{scope="global"} * on(instance) group_left()
(plone_info{instance=~"plone-0.*"})
Or simply query a single instance:
plone_zodb_object_count{instance="plone-0:8080", scope="global"}
Median request duration (p50, approximated from the histogram buckets):
histogram_quantile(0.5,
  sum(rate(plone_request_duration_seconds_bucket[5m])) by (le, instance)
)
Memory usage per instance (MB):
plone_process_rss_bytes{job="plone"} / 1024 / 1024
WSGI Middleware for Request Metrics
The plone_requests_total and plone_request_duration_seconds_* metrics are populated by the ObservabilityMiddleware WSGI middleware. You must add it to your WSGI pipeline to get request metrics. The same applies to the OpenTelemetry root request span (see below) — both are PasteDeploy filters wired the same way.
Using cookiecutter-zope-instance (recommended)
If your zope.ini is generated by cookiecutter-zope-instance (3.1.0+), do not edit zope.ini by hand. Instead, declare the filters via wsgi_filters in your instance.yaml:
default_context:
  wsgi_filters:
    healthserver:
      use: "egg:plone.observability#healthserver"
    observability:
      use: "egg:plone.observability#observability"
    opentelemetry:
      use: "egg:plone.observability#opentelemetry"
This renders the [filter:*] sections and wires them into [pipeline:main] on
regeneration. Each entry also accepts options (extra key: value lines) and
position (outer, the default, or inner); see that project's
"Add WSGI middleware to the pipeline" how-to. The healthserver entry starts the
health probe server. Drop the opentelemetry entry if you do not use the tracing extra.
Using PasteDeploy directly (hand-written zope.ini)
[pipeline:main]
pipeline =
    healthserver
    egg:plone.observability#observability
    ...
    Zope

[filter:healthserver]
use = egg:plone.observability#healthserver

[filter:observability]
use = egg:plone.observability#observability
Manual WSGI wrapping
from plone.observability.metrics.providers.request import ObservabilityMiddleware

# `application` is your existing WSGI callable.
application = ObservabilityMiddleware(application)
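If you have not written WSGI middleware before, the following sketch shows the general pattern a request-metrics middleware follows: wrap the application, time the call, and update counters. TimingMiddleware is a hypothetical stand-in for illustration and makes no claim about how ObservabilityMiddleware is implemented.

```python
import threading
import time


class TimingMiddleware:
    """Count requests and accumulate their duration around a WSGI app."""

    def __init__(self, app):
        self.app = app
        self._lock = threading.Lock()
        self.requests_total = 0
        self.duration_seconds_sum = 0.0

    def __call__(self, environ, start_response):
        started = time.perf_counter()
        try:
            # Measures the time until the app returns its response iterable;
            # time spent streaming the body afterwards is not included.
            return self.app(environ, start_response)
        finally:
            elapsed = time.perf_counter() - started
            with self._lock:
                self.requests_total += 1
                self.duration_seconds_sum += elapsed
```

The finally clause ensures that requests ending in an exception are still counted, which matters when the counters feed error-rate calculations.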
OpenTelemetry Tracing (optional)
Install the extra to enable distributed tracing:
pip install "plone.observability[opentelemetry]"
Tracing is OTel-native: it honors the standard OTEL_* environment
variables and auto-activates when the extra is installed and an OTLP endpoint is
configured. PLONE_OBSERVABILITY_OTEL_ENABLED is the master on/off override.
| Variable | Purpose |
|---|---|
| OTEL_EXPORTER_OTLP_ENDPOINT | OTLP collector endpoint (enables tracing) |
| OTEL_SERVICE_NAME | Service name on emitted spans |
| OTEL_TRACES_SAMPLER | Sampling strategy |
| PLONE_OBSERVABILITY_OTEL_ENABLED | 1/0 master override |
| PLONE_OBSERVABILITY_OTEL_USER_ID | Include enduser.id (PII) on spans; default off |
Add the egg:plone.observability#opentelemetry filter to your WSGI pipeline for
the root request span — see WSGI Middleware for Request Metrics
above (the wsgi_filters example wires both filters at once). Without it you
still get the publishing, catalog, and commit spans (registered via ZCML), just
not the outer WSGI/HTTP span.
Emitted spans (depth: request + key Plone internals):
- root request span (WSGI)
- ZPublisher.publish: one per request, with http.route
- catalog.searchResults / catalog.unrestrictedSearchResults: per catalog query (standard Plone and plone-pgcatalog), with plone.catalog.result_count
- transaction.commit: per ZODB transaction completion
The ZPublisher.publish span also carries enduser.authenticated (always) and,
when PLONE_OBSERVABILITY_OTEL_USER_ID is enabled, enduser.id.
Application code can open child spans with the dependency-optional helper (a no-op when the extra is not installed):
from plone.observability.spans import start_span

with start_span("myapp.expensive_step", {"items": n}):
    do_work()
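The "dependency-optional" idea can be sketched as follows. This is an illustration of the pattern only, under the assumption that the helper falls back to a no-op context manager when OpenTelemetry is missing; optional_span is a hypothetical name and not the package's start_span.

```python
from contextlib import contextmanager

try:
    from opentelemetry import trace
except ImportError:  # the tracing extra is not installed
    trace = None


@contextmanager
def optional_span(name, attributes=None):
    """Open a child span if OpenTelemetry is available, else do nothing."""
    if trace is None:
        yield None
        return
    tracer = trace.get_tracer(__name__)
    with tracer.start_as_current_span(name, attributes=attributes) as span:
        yield span
```

Application code written against such a helper keeps working unchanged whether or not the tracing extra is installed, so instrumentation does not become a hard dependency.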
Extensibility
All components are registered via ZCA and can be extended or replaced by third-party packages.
Custom liveness check
Implement ILivenessCheck and register it as a named utility. Liveness checks MUST NOT access ZODB or block.
from zope.interface import implementer
from plone.observability.interfaces import ILivenessCheck


@implementer(ILivenessCheck)
class MyLivenessCheck:
    name = "myapp"

    def __call__(self):
        # Return (ok: bool, message: str)
        return True, "all good"

<utility
    factory=".checks.MyLivenessCheck"
    provides="plone.observability.interfaces.ILivenessCheck"
    name="myapp"
    />
Custom readiness check
Implement IReadinessCheck. Readiness checks may access ZODB.
from zope.interface import implementer
from plone.observability.interfaces import IReadinessCheck


@implementer(IReadinessCheck)
class MyReadinessCheck:
    name = "myapp"

    def __call__(self):
        # Check a dependency
        ok = _check_external_service()
        return ok, ("service ok" if ok else "service unavailable")

<utility
    factory=".checks.MyReadinessCheck"
    provides="plone.observability.interfaces.IReadinessCheck"
    name="myapp"
    />
Custom metric provider
Implement IMetricProvider as an adapter on OFS.interfaces.IApplication.
from zope.interface import implementer
from plone.observability.interfaces import IMetricProvider
from plone.observability.metric import Metric


@implementer(IMetricProvider)
class MyMetricProvider:
    name = "myapp"
    scope = "instance"

    def __init__(self, context):
        self.context = context

    def collect(self):
        yield Metric(
            name="myapp_queue_length",
            value=get_queue_length(),
            type="gauge",
            scope="instance",
            help="Number of items in the processing queue",
        )

<adapter
    factory=".metrics.MyMetricProvider"
    provides="plone.observability.interfaces.IMetricProvider"
    for="OFS.interfaces.IApplication"
    name="myapp"
    />
Custom metric formatter
Implement IMetricFormatter as a named utility to support additional wire formats.
from zope.interface import implementer
from plone.observability.interfaces import IMetricFormatter


@implementer(IMetricFormatter)
class CSVFormatter:
    content_type = "text/csv"

    def format(self, metrics):
        lines = ["name,value,type,scope,help"]
        for m in metrics:
            lines.append(f"{m.name},{m.value},{m.type},{m.scope},{m.help}")
        return "\n".join(lines)

<utility
    factory=".formatters.CSVFormatter"
    provides="plone.observability.interfaces.IMetricFormatter"
    name="csv"
    />
Access it via @@metrics?format=csv.
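The CSVFormatter example joins fields with plain commas, which produces malformed rows as soon as a help text contains a comma or a quote. A variant built on the standard csv module handles quoting for you. The sketch below is self-contained for illustration: the Metric namedtuple is a stand-in for the package's Metric class, and format_csv is a hypothetical function whose body you could use inside a formatter's format method.

```python
import csv
import io
from collections import namedtuple

# Stand-in for plone.observability's Metric, for illustration only.
Metric = namedtuple("Metric", "name value type scope help")


def format_csv(metrics):
    """Serialise metrics as CSV, quoting fields that contain commas or quotes."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["name", "value", "type", "scope", "help"])
    for m in metrics:
        writer.writerow([m.name, m.value, m.type, m.scope, m.help])
    return buffer.getvalue()
```

With the default minimal quoting, only fields that need it are wrapped in double quotes, so simple rows look the same as in the hand-rolled version.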