Skip to main content

Probabilistic SLAM trajectory format and scoring tool

Project description

smfeval

Probabilistic SLAM trajectory format and scoring tool: strictly proper scoring rules and calibration diagnostics for Gaussian, ensemble, and deterministic pose beliefs on SE(3).

Why this exists

Conventional SLAM evaluation (ATE/RPE) summarises a deterministic trajectory error. When the SLAM backend reports an uncertainty over its pose belief — a 6×6 covariance, a particle ensemble — those numbers are the whole point: a filter that drifts but knows it drifts is doing something fundamentally different from one that drifts confidently. smfeval scores the predictive belief, not just its mean, using strictly proper scoring rules (Gneiting & Raftery, 2007):

  • CRPS on the three translation axes (Matheson & Winkler, 1976) and on rotation via the SO(3) geodesic kernel.
  • Energy score on the SE(3) tangent (Székely, 2003; Gneiting & Raftery, 2007).
  • Log score (Good, 1952), reported in three pieces:
    • joint SE(3) negative log density;
    • translation marginal (3-D sub-block of the joint covariance);
    • rotation marginal (3-D sub-block). The 6×6 covariance couples translation and rotation, so a single scalar hides pathologies that target only one block — e.g. a LiDAR filter that is well-calibrated in translation but overconfident in yaw when the geometry degenerates along the motion direction.
  • Interval score on translation magnitude (Gneiting & Raftery, 2007).
  • Calibration diagnostics: PIT + KS test (Dawid, 1984), interval coverage, Mahalanobis-normalised translation residuals.

Prequential scoring, not iid averaging

Scoring runs in walk-forward / prequential mode (Dawid, 1984): at each matched timestep we score the SLAM backend's one-step-ahead predictive against the realised ground-truth pose. The resulting score series is not an iid sample. SLAM error is autocorrelated by construction — drift accumulates, regime changes (loop closure, entering or leaving a degenerate corridor) persist for many frames — so the textbook iid percentile bootstrap on the mean under-covers and hides the most diagnostically interesting structure of the time series.

smfeval therefore aggregates per-step scores with the stationary bootstrap of Politis & Romano (1994), whose mean geometric block length is selected automatically from the data by the Politis & White (2004) flat-top lag-window estimator (hand-rolled in src/scoring/summary.py, no arch dependency). Every score row in the report carries the estimated next to its CI: ℓ ≈ 1 means the series behaves like white noise; ℓ ≫ 1 flags a strong temporal dependence that the scalar mean is summarising over.

Pass block_length=1.0 to summarize() to fall back to the iid percentile bootstrap when you want to compare.

Install

nix develop --command uv sync

The dev shell pins NumPy ≥ 1.24 and SciPy ≥ 1.10; no other runtime dependencies. Docs and tests live under docs/ and tests/.

Use

Validate a trajectory file:

nix develop --command uv run smfeval validate path/to/est.SQUARE

Score an estimate against a ground-truth trajectory (SQUARE format or plain TUM):

nix develop --command uv run smfeval score est.SQUARE gt.tum

The report has six sections — synchronisation, alignment, ensemble diagnostics (when applicable), scores, calibration, recommendations — following the layout in SQUARE_spec.md.

Layout

Module Role
src/scoring/ proper scoring rules + calibration
src/scoring/summary.py Politis–White block-length + stationary bootstrap
src/scoring/logscore.py joint + trans-marginal + rot-marginal log scores
src/se3/ SE(3) / SO(3) Lie group machinery
src/align/ gauge-aware alignment of estimate to ground truth
src/sync/ timestamp matching and sync-risk computation
src/io/ header parser and reader/writer for the SQUARE text format
src/report/ report dataclass, text renderer, recommendations
src/cli/ smfeval validate / smfeval score

References

  • Dawid, A. P. (1984). Statistical theory: the prequential approach. JRSS A 147(2), 278–292.
  • Efron, B. (1979). Bootstrap methods: another look at the jackknife. Annals of Statistics 7(1), 1–26.
  • Gneiting, T. & Raftery, A. E. (2007). Strictly proper scoring rules, prediction, and estimation. JASA 102(477), 359–378.
  • Good, I. J. (1952). Rational decisions. JRSS B 14(1), 107–114.
  • Matheson, J. E. & Winkler, R. L. (1976). Scoring rules for continuous probability distributions. Management Science 22(10), 1087–1096.
  • Politis, D. N. & Romano, J. P. (1994). The stationary bootstrap. JASA 89(428), 1303–1313.
  • Politis, D. N. & White, H. (2004). Automatic block-length selection for the dependent bootstrap. Econometric Reviews 23(1), 53–70.
  • Sturm, J., Engelhard, N., Endres, F., Burgard, W. & Cremers, D. (2012). A benchmark for the evaluation of RGB-D SLAM systems. IROS.
  • Székely, G. J. (2003). E-statistics: the energy of statistical samples. BGSU Tech. Rep. 03-05.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

smfeval-0.2.0.tar.gz (406.7 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

smfeval-0.2.0-py3-none-any.whl (51.8 kB view details)

Uploaded Python 3

File details

Details for the file smfeval-0.2.0.tar.gz.

File metadata

  • Download URL: smfeval-0.2.0.tar.gz
  • Upload date:
  • Size: 406.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for smfeval-0.2.0.tar.gz
Algorithm Hash digest
SHA256 6f13861c458e7423539500e42a62bd0bbf90c92dfb40f8aea1404152827188a3
MD5 74b71cef8084038af7ace161a1a8f16b
BLAKE2b-256 9eb7cde705509f265520f2981aab2e5155e50e5964acb0314d74183780437a7d

See more details on using hashes here.

Provenance

The following attestation bundles were made for smfeval-0.2.0.tar.gz:

Publisher: publish.yml on svendbot/smfeval

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file smfeval-0.2.0-py3-none-any.whl.

File metadata

  • Download URL: smfeval-0.2.0-py3-none-any.whl
  • Upload date:
  • Size: 51.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for smfeval-0.2.0-py3-none-any.whl
Algorithm Hash digest
SHA256 a8a546fe0eb5469176eca443e6a86901b382b382e6c85df32255a26fd3415f60
MD5 6f375d3f0a112eefdc09e6754fbba8c2
BLAKE2b-256 4a2a9e4d4d48c89d83226c38b641d63e58b5e047a75381fc09b4f71b8e4eea17

See more details on using hashes here.

Provenance

The following attestation bundles were made for smfeval-0.2.0-py3-none-any.whl:

Publisher: publish.yml on svendbot/smfeval

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page