High-performance trajectory splitting and analysis, powered by Rust
trucktrack
High-performance trajectory splitting, generation, and partitioning, powered by Rust.
A Python package implementing logic similar to the movingpandas trajectory splitters (ObservationGapSplitter, StopSplitter), with a Rust backend for speed.
Data flows through Polars DataFrames, with the option
to process entirely in Rust (parquet in, parquet out) or share DataFrames
between Python and Rust zero-copy via pyo3-polars.
In addition to the Rust splitters, trucktrack ships pure-Python subpackages for trace generation, spatial partitioning, map-matching, querying, and visualization.
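For readers unfamiliar with observation-gap splitting, the idea can be sketched in a few lines of plain Python. This is only an illustration of the concept: the function name `split_on_observation_gaps` and its arguments are hypothetical, and it is not trucktrack's API or its Rust implementation.

```python
from datetime import datetime, timedelta


def split_on_observation_gaps(timestamps, max_gap):
    """Split a time-ordered list of timestamps into segments wherever the
    gap between consecutive observations exceeds max_gap."""
    segments = []
    current = []
    for ts in timestamps:
        # Start a new segment when the gap to the previous point is too large.
        if current and ts - current[-1] > max_gap:
            segments.append(current)
            current = []
        current.append(ts)
    if current:
        segments.append(current)
    return segments


t0 = datetime(2025, 1, 1, 8, 0)
points = [t0 + timedelta(minutes=m) for m in (0, 1, 2, 30, 31, 90)]
segments = split_on_observation_gaps(points, max_gap=timedelta(minutes=10))
print([len(s) for s in segments])  # [3, 2, 1]
```

A stop splitter follows the same pattern but cuts where the vehicle stays within a small radius for a minimum duration instead of where observations are missing.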
Install
pip install trucktrack
Optional extras:
pip install trucktrack[valhalla] # local pyvalhalla routing & map-matching
pip install trucktrack[viz] # folium-based interactive maps
From source
# Requires Python 3.11+ and Rust stable
git clone https://github.com/twedl/trucktrack.git
cd trucktrack
python3 -m venv .venv && source .venv/bin/activate
pip install "maturin>=1.7,<2.0" polars pytest
maturin develop
Pipelines
Split + partition
Process raw GPS traces into a spatially partitioned hive dataset:
from pathlib import Path
from trucktrack import run_pipeline
run_pipeline(Path("data/raw"), Path("data/partitioned"))
# Group input chunks for fewer output files (uses more memory per worker)
run_pipeline(Path("data/raw"), Path("data/partitioned"), group_size=256)
# Compact multi-file partitions into single files after processing
run_pipeline(Path("data/raw"), Path("data/partitioned"), compact=True)
To compact an existing dataset without re-running the pipeline:
from trucktrack import compact_partitions
compact_partitions("data/partitioned")
Building Valhalla tiles
Map-matching and route generation need a local Valhalla install. One-time setup (downloads the Ontario extract from Geofabrik, then builds the config, admin database, and tiles):
uv run python scripts/setup_valhalla.py
This produces valhalla_tiles/valhalla.json, valhalla_tiles/admin.sqlite, and
valhalla_tiles/valhalla_tiles.tar, all of which are gitignored. find_config()
discovers the JSON config automatically. Pass --pbf path.osm.pbf to reuse an
existing OSM extract.
Map-match
Map-match all trips against a local Valhalla instance:
from pathlib import Path
from trucktrack.valhalla.pipeline import run_map_matching
run_map_matching(
    Path("data/partitioned"),
    Path("data/matched"),
    # config="valhalla.json"  # omit to auto-discover in cwd
)
Bridging large gaps
Map-matching cost scales poorly when a trip has a large gap between
consecutive points: Meili searches candidate routes across the gap and can
spend seconds or minutes on a single trip. To opt in to gap-splitting, pass a
BridgeConfig. Trips are split at gaps, each sub-segment is matched
normally, and each gap is filled in with a single /route + edge_walk
call to recover way IDs:
from pathlib import Path
from trucktrack.valhalla import BridgeConfig, run_map_matching
run_map_matching(
    Path("data/partitioned"),
    Path("data/matched"),
    bridges=BridgeConfig(max_dist_m=5000, time_s=240, min_dist_m=1000),
)
A gap triggers a split when the distance between consecutive points
exceeds max_dist_m, or when the elapsed time exceeds time_s and the
distance exceeds min_dist_m (the distance floor keeps red-light
stalls from triggering a split). The /route bridge assumes the truck took a
shortest path through the gap, so true detours are invisible. Per-trip
quality rows gain n_bridges, max_detour_ratio,
total_bridge_m, and any_bridge_failed so downstream consumers can filter on them.
On any /route failure, the orchestrator falls back to a single full
HMM call with breakage_distance pinned to the base value and sets
any_bridge_failed=True.
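The split rule described above can be written as a small predicate. This is a minimal sketch rather than the library's implementation: `is_split_gap` is a hypothetical name, and the default thresholds are simply the values from the BridgeConfig example, not confirmed library defaults.

```python
def is_split_gap(
    dist_m: float,
    dt_s: float,
    max_dist_m: float = 5000.0,  # example value from the README, assumed
    time_s: float = 240.0,       # example value from the README, assumed
    min_dist_m: float = 1000.0,  # example value from the README, assumed
) -> bool:
    """Return True when a gap between consecutive points should split the trip."""
    # Rule 1: a large spatial jump always splits.
    if dist_m > max_dist_m:
        return True
    # Rule 2: a long pause splits only if the truck also moved far enough,
    # so a stall at a red light (long time, tiny distance) does not split.
    return dt_s > time_s and dist_m > min_dist_m


print(is_split_gap(dist_m=6000, dt_s=10))    # True: distance alone exceeds max_dist_m
print(is_split_gap(dist_m=50, dt_s=600))     # False: long wait but below the distance floor
print(is_split_gap(dist_m=1500, dt_s=300))   # True: both time and distance floors exceeded
```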
Querying
Pull individual trucks or trips without scanning the full dataset.
Each function filters by chunk_id (last 2 hex chars of the truck UUID)
to read only the relevant files:
import trucktrack as tt
# Raw traces — filters by chunk_id hive partition
df = tt.scan_raw_truck("data/raw", truck_id).collect()
# Partitioned trips — filters by chunk_id in filename
df = tt.scan_partitioned_truck("data/partitioned", truck_id).collect()
df = tt.scan_partitioned_trip("data/partitioned", trip_id).collect()
# Map-matched results
df = tt.scan_matched_truck("data/matched", truck_id).collect()
df = tt.scan_matched_trip("data/matched", trip_id).collect()
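For reference, the chunk_id described above (the last 2 hex characters of the truck UUID) can be derived as in the sketch below. The helper name `chunk_id_for` is hypothetical, and lowercasing the result is an assumption; the library may normalize differently.

```python
import uuid


def chunk_id_for(truck_id: str) -> str:
    """Return the last two hex characters of a truck UUID.

    uuid.UUID validates the input, and .hex yields 32 lowercase hex digits
    with the hyphens removed (lowercase normalization is assumed here).
    """
    return uuid.UUID(truck_id).hex[-2:]


print(chunk_id_for("123e4567-e89b-12d3-a456-4266141740AB"))  # ab
```

Two hex characters give at most 256 distinct chunks, so a lookup for one truck only needs to touch the files belonging to a single chunk.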
ChunkIndex — persistent file-path index
For repeated queries, build an index once and reload it instantly in later sessions:
# First time — one rglob, then save to disk
idx = tt.ChunkIndex.build("data/partitioned")
idx.save() # writes .chunk_index.json
# Later sessions — instant load, no filesystem scan
idx = tt.ChunkIndex.load("data/partitioned")
df = idx.scan_truck(truck_id).collect()
df = idx.scan_trip(trip_id).collect()
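To illustrate why a saved index avoids repeated filesystem scans, here is a rough sketch of the general technique: walk the directory once, group file paths by key, and persist the mapping as JSON. It is not ChunkIndex's actual code or file format. The helper names and the `chunk_id=ab/part-0.parquet` layout are hypothetical; only the `.chunk_index.json` file name comes from the example above.

```python
import json
import tempfile
from collections import defaultdict
from collections.abc import Callable
from pathlib import Path

INDEX_NAME = ".chunk_index.json"  # file name taken from the example above


def build_index(root: Path, key_for_file: Callable[[Path], str]) -> dict[str, list[str]]:
    """Walk root once and group parquet files by a caller-supplied key."""
    index = defaultdict(list)
    for path in sorted(root.rglob("*.parquet")):
        index[key_for_file(path)].append(path.relative_to(root).as_posix())
    return dict(index)


def save_index(root: Path, index: dict[str, list[str]]) -> None:
    """Persist the mapping next to the data so later sessions can skip the walk."""
    (root / INDEX_NAME).write_text(json.dumps(index, indent=2))


def load_index(root: Path) -> dict[str, list[str]]:
    """Read the saved mapping back without touching the directory tree."""
    return json.loads((root / INDEX_NAME).read_text())


def hive_chunk_key(path: Path) -> str:
    # Hypothetical layout: <root>/chunk_id=ab/part-0.parquet
    return path.parent.name.removeprefix("chunk_id=")


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for rel in (
        "chunk_id=ab/part-0.parquet",
        "chunk_id=ab/part-1.parquet",
        "chunk_id=0f/part-0.parquet",
    ):
        target = root / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        target.touch()  # empty placeholder files, enough to index paths
    save_index(root, build_index(root, hive_chunk_key))
    reloaded = load_index(root)
    print(reloaded["ab"])
```

The trade-off is staleness: a saved index has to be rebuilt whenever files are added, removed, or compacted.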
Visualization
One-call helpers to query, plot, and serve an interactive map:
from trucktrack.visualize import inspect_truck, inspect_trip
# All trips for a truck — opens a Flask server
inspect_truck("data/partitioned", truck_id)
# Filter to a date range
from datetime import date
inspect_truck("data/partitioned", truck_id,
              date_range=(date(2025, 1, 1), date(2025, 3, 1)))
# Single trip or multiple trips
inspect_trip("data/partitioned", trip_id)
inspect_trip("data/partitioned", [trip_id_1, trip_id_2])
# Use a ChunkIndex for fast lookups on large datasets
import trucktrack as tt
idx = tt.ChunkIndex.load("data/partitioned")
inspect_truck("data/partitioned", truck_id, index=idx)
# Raw traces or matched results
inspect_truck("data/raw", truck_id, stage="raw")
inspect_trip("data/matched", trip_id, stage="matched")
# Get the map object without serving (e.g. for Jupyter display)
m = inspect_trip("data/partitioned", trip_id, serve=False)
# Forward kwargs to plot_trace
inspect_trip("data/partitioned", trip_id, color_by="speed")
Multi-stage overlay
Compare raw GPS, trip segments, and map-matched results on one map:
from trucktrack.visualize import inspect_pipeline
# All stages for one truck
inspect_pipeline(
    truck_id,
    raw_dir="data/raw",
    partitioned_dir="data/partitioned",
    matched_dir="data/matched",
)
# Scope to specific trips (raw layer auto-filtered to matching dates)
inspect_pipeline(
    trip_id=[trip_id_1, trip_id_2],
    raw_dir="data/raw",
    partitioned_dir="data/partitioned",
    partitioned_index=idx,
)
For more control, use the lower-level plot_trace, plot_trace_layers,
save_map, and serve_map functions directly from trucktrack.visualize.
Dev workflow
| Task | Command |
|---|---|
| Build | maturin develop |
| Tests | pytest tests/ -v |
| Lint Python | ruff check python/ tests/ |
| Format Python | ruff format python/ tests/ |
| Lint Rust | cargo clippy --all-targets --all-features -- -D warnings |
| Format Rust | cargo fmt --all |
| Type-check | mypy python/trucktrack |
| Build wheel | maturin build --release |