
AIS trajectory segmentation and feature-preserving compression for maritime traffic analysis.


AISsegments

A Python toolkit for compressing AIS vessel trajectories into compact, query-friendly linestring segments — without losing the kinematic features that maritime safety and risk analysis depend on.

What is this?

aissegments takes raw AIS position reports (millions of pings per day in a busy sea area) and outputs a small number of constant-COG/SOG linestring segments per vessel, one row per (vessel, time-window, course/speed). The output is shaped to drop straight into a PostGIS LINESTRING table.

It is the reference Python implementation of the Top-Down Kinematic Compression (TDKC) algorithm of Guo, Bolbot & Valdez Banda (Ocean Engineering 312 (2024), 119189), with the recursion-termination and adaptive-threshold fixes described in the paper.

Goals

  1. Shrink AIS data without losing the maritime-relevant features. A continent-scale AIS feed easily produces hundreds of millions of position reports per month. Most of those points are uninformative — a vessel cruising in a straight line at a steady speed needs only its endpoints. TDKC keeps the points where vessels actually do something interesting (turn, accelerate, stop, manoeuvre) and drops the rest.
  2. Treat AIS as a sequence of kinematic states, not just positions. The classical Douglas-Peucker simplification looks only at how far points stray from a straight line. That throws away course and speed changes that occur along a straight track, which are exactly the events maritime risk analysis cares about. TDKC uses both position (Synchronous Euclidean Distance, SED) and velocity (Synchronous Velocity Difference, SVD) with adaptive per-track thresholds.
  3. Produce database-ready output. The output isn't a smaller list of points — it's a list of Segment records with start/end coordinates, mean COG/SOG, and the count of original observations spanned. Each segment is a 2-point LINESTRING ready for ST_Intersects and other PostGIS operations.
  4. Stay framework-neutral. The core depends only on NumPy. AISdb is an optional adapter (pip install "aissegments[aisdb]"); other input paths (CSV from Marine Cadastre / institutional exports / your own pipeline) work via read_csv_tracks without any extra dependencies.
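The two measures named in goal 2 can be illustrated on planar coordinates. This is a minimal sketch and not the package's code: the helper names `sync_deviation`, `sed`, `velocity_difference` and `to_velocity` are hypothetical, positions are assumed to be already projected to metres, and the velocity measure shown (deviation from the velocity linearly interpolated between the segment endpoints) is only one plausible reading of SVD. docs/algorithm.md holds the formulation the package actually uses.

```python
import math


def sync_deviation(start, end, sample):
    """Distance between `sample` and the value interpolated at the same time.

    Each argument is a (t, a, b) tuple. The pair (a, b) is interpolated
    linearly in time between `start` and `end`, then compared with `sample`.
    """
    t0, a0, b0 = start
    t1, a1, b1 = end
    t, a, b = sample
    # Fraction of the time interval elapsed at the sample's timestamp.
    f = 0.0 if t1 == t0 else (t - t0) / (t1 - t0)
    return math.hypot(a - (a0 + f * (a1 - a0)), b - (b0 + f * (b1 - b0)))


def sed(start, end, point):
    """Synchronous Euclidean Distance for (t, x, y) tuples, in metres."""
    return sync_deviation(start, end, point)


def to_velocity(sog, cog_deg):
    """Convert speed and course (degrees clockwise from north) to (vx, vy)."""
    rad = math.radians(cog_deg)
    return (sog * math.sin(rad), sog * math.cos(rad))


def velocity_difference(start, end, sample):
    """Assumed velocity measure for (t, vx, vy) tuples (illustrative only)."""
    return sync_deviation(start, end, sample)


# A point 40 m off the time-synchronised position on the chord.
print(sed((0, 0.0, 0.0), (240, 1200.0, 0.0), (120, 600.0, 40.0)))  # 40.0

# Same course throughout, but the middle sample has slowed from 10 to 4 knots.
v10 = to_velocity(10.0, 90.0)
v4 = to_velocity(4.0, 90.0)
print(velocity_difference((0, *v10), (240, *v10), (120, *v4)))
```

Both measures compare a sample with what the endpoints alone would predict at the same timestamp, which is why a single interpolation helper serves for position and velocity in this sketch.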

Who is this for?

  • Maritime risk analysts who need to run collision/grounding/allision queries against millions of vessel positions per area-year and want the spatial+temporal index to fit in memory.
  • AIS data engineers maintaining a Postgres/PostGIS warehouse of vessel tracks and looking for a principled way to thin incoming data so that it does not overwhelm storage.
  • Researchers reproducing or extending Guo et al.'s adaptive trajectory compression work.

What it produces

  • tdkc(track) — same Track interface but with only the key points retained (typically 1-5% of the input, depending on track shape and threshold tuning).
  • tdkc_segments(track) — a list of Segment records, one per consecutive key-point pair, ready for direct insertion as PostGIS LINESTRING(start_lon start_lat, end_lon end_lat) geometries with cog_mean, sog_mean, and n_points (count of original AIS pings each segment represents).
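How key points become segment records can be sketched as follows. This is an illustration under stated assumptions, not the package's to_segments helper: `SegmentSketch`, `circular_mean_deg` and `segments_from_key_points` are hypothetical names, `n_points` is assumed to count both end points of each segment, and the use of a circular mean for course is a choice made here (the package may average COG differently).

```python
import math
from dataclasses import dataclass


@dataclass
class SegmentSketch:
    """Hypothetical stand-in for a segment record."""
    start_lon: float
    start_lat: float
    end_lon: float
    end_lat: float
    cog_mean: float
    sog_mean: float
    n_points: int


def circular_mean_deg(angles):
    """Mean of compass angles in degrees, safe across the 360/0 wrap."""
    s = sum(math.sin(math.radians(a)) for a in angles)
    c = sum(math.cos(math.radians(a)) for a in angles)
    return math.degrees(math.atan2(s, c)) % 360.0


def segments_from_key_points(lon, lat, sog, cog, key_idx):
    """Build one record per consecutive pair of key-point indices."""
    segments = []
    for i, j in zip(key_idx, key_idx[1:]):
        span_sog = sog[i:j + 1]  # original pings spanned, end points included
        span_cog = cog[i:j + 1]
        segments.append(SegmentSketch(
            start_lon=lon[i], start_lat=lat[i],
            end_lon=lon[j], end_lat=lat[j],
            cog_mean=circular_mean_deg(span_cog),
            sog_mean=sum(span_sog) / len(span_sog),
            n_points=j - i + 1,
        ))
    return segments


lon = [12.0, 12.001, 12.002, 12.003, 12.004]
lat = [55.0, 55.0, 55.0, 55.0, 55.0]
sog = [10.0, 10.0, 10.0, 6.0, 6.0]
cog = [350.0, 10.0, 0.0, 90.0, 90.0]

for seg in segments_from_key_points(lon, lat, sog, cog, key_idx=[0, 2, 4]):
    print(seg)
```

The first segment spans courses of 350, 10 and 0 degrees. Their arithmetic mean is 120, which points roughly south-east, whereas the circular mean stays at north; this is the reason the sketch does not average COG directly.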

Why not just use Douglas-Peucker?

Douglas-Peucker (DP) and its variants throw away every point that lies on a straight line, regardless of whether the vessel's behaviour is changing. Take a vessel slowing from 15 to 5 knots while continuing to head east: DP keeps two points (start, end) and the entire speed change is lost. TDKC keeps the deceleration point because its velocity vector has shifted. See docs/algorithm.md for the precise math, and examples/output/03_min_svd_sweep.png for a visual side-by-side.
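The eastbound deceleration case can be checked numerically. The sketch below is illustrative only: `perpendicular_distance` is a hypothetical helper implementing the point-to-line offset that DP thresholds on, the coordinates are planar metres rather than lon/lat, and the speed profile is invented for the example.

```python
import math

KNOT_MS = 0.514444  # metres per second in one knot


def perpendicular_distance(start, end, point):
    """Offset of `point` from the line through `start` and `end` (x, y pairs)."""
    (x0, y0), (x1, y1), (x, y) = start, end, point
    dx, dy = x1 - x0, y1 - y0
    length = math.hypot(dx, dy)
    if length == 0.0:
        return math.hypot(x - x0, y - y0)
    return abs(dx * (y - y0) - dy * (x - x0)) / length


# Speed over ground at each ping, one ping per minute, heading due east.
speeds_kn = [15.0, 15.0, 15.0, 5.0, 5.0]
dt = 60.0

# Integrate positions along y = 0; each speed applies to the following interval.
xs = [0.0]
for s in speeds_kn[:-1]:
    xs.append(xs[-1] + s * KNOT_MS * dt)
points = [(x, 0.0) for x in xs]

# What a purely geometric criterion sees: every point lies on the chord.
offsets = [perpendicular_distance(points[0], points[-1], p) for p in points]
print("max perpendicular offset (m):", max(offsets))

# What a velocity-aware criterion can see: a 10-knot step between pings.
speed_steps = [abs(b - a) for a, b in zip(speeds_kn, speeds_kn[1:])]
print("largest speed change (kn):", max(speed_steps))
```

With a zero offset at every ping, no distance threshold can make a geometric simplifier keep the interior points, while the speed series shows clearly where the vessel slowed down.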

Companion package

OMRAT (Open Maritime Risk Analysis Tool) — a QGIS plugin for collision/grounding/allision risk modelling — uses AISsegments as its segment-ingestion backend. The OMRAT pipeline shows a complete end-to-end flow: NMEA / CSV → aisdb decode → TDKC compression → bulk-load into a year-partitioned PostGIS schema.

Install

pip install aissegments

# with the optional AISdb adapter for ingestion from raw NMEA / CSV
pip install "aissegments[aisdb]"

For development with full test + coverage tooling:

git clone https://github.com/axelHorteborn/AISsegments
cd AISsegments
pip install -e ".[dev,aisdb]"
pytest --cov

Quickstart

import numpy as np
from aissegments import Track, tdkc, tdkc_segments

# Build a Track from your own arrays (lat/lon in degrees, sog in knots, cog in degrees).
track = Track.from_arrays(
    mmsi=219000123,
    t=np.array([0, 60, 120, 180, 240], dtype=float),     # unix seconds
    lon=np.array([12.0, 12.001, 12.002, 12.003, 12.004]),
    lat=np.array([55.0, 55.0, 55.0, 55.0, 55.0]),
    sog=np.array([10.0, 10.0, 10.0, 10.0, 10.0]),
    cog=np.array([90.0, 90.0, 90.0, 90.0, 90.0]),
)

# Compress: returns a Track containing only the key points.
compressed = tdkc(track)
print(len(compressed), "key points kept out of", len(track))

# Or go straight to segment records (one per consecutive key-point pair).
segments = tdkc_segments(track)
for s in segments:
    print(s.t_start, s.t_end, s.cog_mean, s.sog_mean, s.n_points)

Using AISdb as an input adapter

aissegments can consume the per-vessel track dicts produced by AISdb's TrackGen():

import aisdb
from aissegments.adapters import from_aisdb_track
from aissegments import tdkc_segments

with aisdb.SQLiteDBConn(dbpath="ais.db") as conn:
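    # DBQuery takes the open connection via dbconn; fill in start/end and the
    # xmin/xmax/ymin/ymax bounds that in_bbox_time expects.
    qry = aisdb.DBQuery(dbconn=conn, start=..., end=..., callback=aisdb.database.sqlfcn_callbacks.in_bbox_time)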
    tracks = aisdb.TrackGen(qry.gen_qry(), decimate=False)
    for t_dict in tracks:
        track = from_aisdb_track(t_dict)
        for seg in tdkc_segments(track):
            ...  # write seg to your PostGIS table
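The "write seg to your PostGIS table" step might look like the sketch below. Everything schema-related is an assumption made for illustration: the `SegmentRow` tuple and its field names, the `ais_segments` table and its columns, and SRID 4326 (WGS 84 lon/lat). The `%s` placeholders follow the psycopg parameter style; no database connection is opened here.

```python
from collections import namedtuple

# Hypothetical flat view of one segment; the real Segment dataclass may differ.
SegmentRow = namedtuple(
    "SegmentRow",
    "mmsi t_start t_end start_lon start_lat end_lon end_lat "
    "cog_mean sog_mean n_points",
)


def segment_wkt(seg):
    """Render a segment as a 2-point WKT linestring (lon first, then lat)."""
    return (
        f"LINESTRING({seg.start_lon} {seg.start_lat}, "
        f"{seg.end_lon} {seg.end_lat})"
    )


# Parameterised statement for a hypothetical table; ST_GeomFromText and
# to_timestamp are standard PostGIS / PostgreSQL functions.
INSERT_SQL = (
    "INSERT INTO ais_segments "
    "(mmsi, t_start, t_end, cog_mean, sog_mean, n_points, geom) "
    "VALUES (%s, to_timestamp(%s), to_timestamp(%s), %s, %s, %s, "
    "ST_GeomFromText(%s, 4326))"
)


def insert_params(seg):
    """Parameter tuple matching the placeholders in INSERT_SQL."""
    return (
        seg.mmsi, seg.t_start, seg.t_end,
        seg.cog_mean, seg.sog_mean, seg.n_points,
        segment_wkt(seg),
    )


seg = SegmentRow(219000123, 0.0, 240.0, 12.0, 55.0, 12.004, 55.0, 90.0, 10.0, 5)
print(segment_wkt(seg))
print(insert_params(seg))
```

With a psycopg cursor, the pair would be passed as `cur.execute(INSERT_SQL, insert_params(seg))`, or batched with `cur.executemany` for bulk loads. Passing the WKT as a parameter rather than formatting it into the SQL string keeps the statement safe to reuse.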

What's in the package

Module                  Purpose
aissegments.tdkc        TDKC algorithm: SED + SVD, Compression Binary Tree, adaptive thresholds, key-node identification
aissegments._types      Track and Segment dataclasses, to_segments helper
aissegments.adapters    Input adapters: from_aisdb_track, read_csv_tracks (Marine Cadastre etc.), read_csv_static_records for vessel-info extraction
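For readers bringing their own CSV pipeline, the grouping that any per-vessel compressor needs beforehand can be sketched with the standard library alone. This is not the implementation of read_csv_tracks, whose signature is not shown here: `group_rows_by_mmsi` is a hypothetical helper, the column names follow the Marine Cadastre layout (MMSI, BaseDateTime, LAT, LON, SOG, COG), and timestamps are assumed to be UTC.

```python
import csv
import io
from collections import defaultdict
from datetime import datetime, timezone


def group_rows_by_mmsi(csv_file):
    """Group CSV position reports into time-sorted lists, one per vessel.

    Returns {mmsi: [(unix_seconds, lon, lat, sog, cog), ...]}.
    """
    tracks = defaultdict(list)
    for row in csv.DictReader(csv_file):
        stamp = datetime.fromisoformat(row["BaseDateTime"])
        t = stamp.replace(tzinfo=timezone.utc).timestamp()  # assumed UTC
        tracks[int(row["MMSI"])].append((
            t,
            float(row["LON"]),
            float(row["LAT"]),
            float(row["SOG"]),
            float(row["COG"]),
        ))
    # Reports are not guaranteed to arrive in time order, so sort each track.
    for pings in tracks.values():
        pings.sort()
    return dict(tracks)


sample = io.StringIO(
    "MMSI,BaseDateTime,LAT,LON,SOG,COG\n"
    "219000123,2024-01-01T00:01:00,55.0,12.001,10.0,90.0\n"
    "219000123,2024-01-01T00:00:00,55.0,12.0,10.0,90.0\n"
    "219000456,2024-01-01T00:00:00,56.0,11.0,0.0,0.0\n"
)
for mmsi, pings in group_rows_by_mmsi(sample).items():
    print(mmsi, len(pings), "pings")
```

Each per-vessel list can then be unpacked into the t, lon, lat, sog and cog arrays that Track.from_arrays takes in the Quickstart.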

Algorithm details

See docs/algorithm.md for the mathematical formulation, with equation references back to the source paper.

Citation

If you use this package in academic work, please cite both the software and the underlying paper:

Guo, S., Bolbot, V., & Valdez Banda, O. (2024). An adaptive trajectory compression and feature preservation method for maritime traffic analysis. Ocean Engineering, 312, 119189. https://doi.org/10.1016/j.oceaneng.2024.119189

A CITATION.cff is included so GitHub renders a "Cite this repository" widget.

License

MIT — see LICENSE.
