Segment GTFS shapes into stop-to-stop route segments with loop-aware projection
gtfs-segments
Segment GTFS shapes into stop-to-stop route segments that follow the original road geometry.
Most GTFS tools use Shapely's .project() to snap stops to shapes, which breaks on routes with loops, lollipop turnarounds, and out-and-back corridors. This library uses sequential projection with candidate-aware selection to handle complex route geometries correctly.
Install
pip install gtfs-segments
For interactive map visualization:
pip install "gtfs-segments[viz]"
Quick Start
Segment an entire feed
from gtfs_segments import segment_feed
result = segment_feed("path/to/gtfs")
segments = result['segments'] # GeoDataFrame of all stop-to-stop segments
diagnostics = result['diagnostics'] # per-shape stats (offsets, degenerates)
failures = result['failures'] # any shapes that failed
Segment a single shape
import pandas as pd
from gtfs_segments import segment_shape
shapes = pd.read_csv("gtfs/shapes.txt")
trips = pd.read_csv("gtfs/trips.txt")
stops = pd.read_csv("gtfs/stops.txt")
stop_times = pd.read_csv("gtfs/stop_times.txt")
result = segment_shape(shapes, trips, stops, stop_times, shape_id=232864)
segments = result['segments']
projected = result['projected_stops']
ruler = result['ruler']
diag = result['diagnostics']
Low-level control
from gtfs_segments import ShapeRuler, project_stops_sequential, segment_route
ruler = ShapeRuler(shapes, shape_id=232864)
# Get stop visit order from stop_times
trip_id = trips[trips['shape_id'] == 232864]['trip_id'].iloc[0]
st = stop_times[stop_times['trip_id'] == trip_id].sort_values('stop_sequence')
stop_sequence = st['stop_id'].tolist()
route_stops = stops[stops['stop_id'].isin(stop_sequence)].copy()
# Project stops onto shape and build segments
projected = project_stops_sequential(ruler, route_stops, stop_sequence)
segments = segment_route(ruler, projected)
How It Works
The Problem
Shapely's .project() finds the globally nearest point on a LineString. On a route that doubles back on itself, a stop at the base of a loop has two valid projection points: the loop entry and the loop exit. .project() returns whichever is geometrically closer, often the exit, which places the stop at the far end of the loop and causes every stop inside the loop to collapse to the same distance.
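The failure is easy to reproduce on a toy out-and-back corridor. The sketch below is a pure-Python stand-in for a global nearest-point projection (it is not Shapely itself, and `project_global` is a hypothetical helper written for this illustration). Coordinates are assumed to be in metres.

```python
import math

def project_onto_segment(p, a, b):
    """Return (t, offset): fraction along a->b of the nearest point, and the distance to it."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        t = 0.0
    else:
        t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / seg_len_sq
        t = max(0.0, min(1.0, t))  # clamp to the segment
    qx, qy = a[0] + t * dx, a[1] + t * dy
    return t, math.hypot(p[0] - qx, p[1] - qy)

def project_global(shape, p):
    """Distance along `shape` of the single globally nearest point to p."""
    best_offset, best_dist, start = math.inf, 0.0, 0.0
    for a, b in zip(shape, shape[1:]):
        seg_len = math.dist(a, b)
        t, offset = project_onto_segment(p, a, b)
        if offset < best_offset:
            best_offset, best_dist = offset, start + t * seg_len
        start += seg_len
    return best_dist

# Out-and-back corridor: east along y=0, then back west along y=10.
shape = [(0, 0), (100, 0), (100, 10), (0, 10)]

# Three outbound stops in visit order; the second sits slightly nearer the return leg.
outbound_stops = [(20, 1), (50, 6), (80, 1)]

distances = [project_global(shape, s) for s in outbound_stops]
print(distances)  # the middle stop lands on the return leg, so the order is not increasing
```

The second stop is 6 m from the outbound leg and 4 m from the return leg, so a global nearest-point search places it well past the third stop, and slicing the shape between consecutive stops no longer makes sense.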
The Solution
- ShapeRuler converts the shape to a metric CRS, builds a cumulative distance ruler, and indexes every segment for fast searching.
- find_candidates scans every segment and collects all positions where a stop projects within tolerance (default 50m). A stop at a loop base gets candidates at both entry and exit.
- project_stops_sequential processes stops in visit order (from stop_times), enforcing monotonic distances. For each stop, it picks the earliest candidate with a reasonable offset, biasing toward forward progress along the route rather than jumping ahead to a later occurrence.
- segment_route slices the shape between consecutive stop distances, including all original vertices so segments follow the actual road geometry.
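The steps above can be sketched in plain Python. This is an illustration of the idea, not the library's implementation: `candidate_hits`, `project_in_order`, and `slice_shape` are hypothetical helpers, and the `slack` rule (accept a candidate whose offset is within a few metres of the best forward candidate) is one assumed reading of "reasonable offset". Coordinates are assumed to be in metres.

```python
import math

def candidate_hits(shape, p, tolerance):
    """All (distance_along, offset) pairs where p lies within tolerance of a shape segment."""
    hits, start = [], 0.0
    for a, b in zip(shape, shape[1:]):
        dx, dy = b[0] - a[0], b[1] - a[1]
        seg_len = math.hypot(dx, dy)
        if seg_len > 0:
            t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / seg_len ** 2
            t = max(0.0, min(1.0, t))  # clamp to the segment
            offset = math.hypot(p[0] - (a[0] + t * dx), p[1] - (a[1] + t * dy))
            if offset <= tolerance:
                hits.append((start + t * seg_len, offset))
        start += seg_len
    return hits

def project_in_order(shape, stops, tolerance=50.0, slack=5.0):
    """Give each stop, in visit order, the earliest forward candidate with a near-best offset."""
    distances, previous = [], 0.0
    for p in stops:
        forward = [(d, o) for d, o in candidate_hits(shape, p, tolerance) if d >= previous]
        if not forward:
            raise ValueError(f"no forward candidate within {tolerance} m for stop {p}")
        best_offset = min(o for _, o in forward)
        # Assumed rule: "reasonable" means within `slack` metres of the best forward offset.
        previous = min(d for d, o in forward if o <= best_offset + slack)
        distances.append(previous)
    return distances

def slice_shape(shape, d0, d1):
    """Polyline between distances d0 <= d1 along shape, keeping original interior vertices."""
    out, start = [], 0.0
    for a, b in zip(shape, shape[1:]):
        seg_len = math.dist(a, b)
        end = start + seg_len
        if seg_len > 0 and end >= d0 and start <= d1:
            for d in (max(d0, start), min(d1, end)):
                frac = (d - start) / seg_len
                pt = (a[0] + frac * (b[0] - a[0]), a[1] + frac * (b[1] - a[1]))
                if not out or out[-1] != pt:  # skip vertices shared with the previous segment
                    out.append(pt)
        start = end
    return out

# Out-and-back corridor: east along y=0, then back west along y=10.
shape = [(0, 0), (100, 0), (100, 10), (0, 10)]
# Stops in visit order: three outbound, two on the return leg.
stops = [(20, 1), (50, 6), (80, 1), (80, 9), (20, 9)]

distances = project_in_order(shape, stops)
turnaround = slice_shape(shape, distances[2], distances[3])
print(distances)
print(turnaround)
```

On this toy corridor the stop at (50, 6) stays on the outbound leg even though the return leg is nearer, and the segment between the third and fourth stops keeps both turnaround vertices instead of cutting straight across.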
CRS Auto-Detection
If no CRS is provided, the library auto-detects the UTM zone from the median longitude of the shape coordinates. Override with metric_crs="EPSG:32614" if needed.
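For reference, the standard UTM zone arithmetic looks roughly like this. It is a minimal sketch, not the library's code: `utm_epsg_from_coords` is a hypothetical helper, choosing the hemisphere from the median latitude is an assumption (the text above only mentions longitude), and the special zones around Norway and Svalbard are ignored.

```python
import math
from statistics import median

def utm_epsg_from_coords(lons, lats):
    """Pick a WGS 84 / UTM EPSG code from the median longitude and latitude."""
    lon = median(lons)
    lat = median(lats)
    # UTM zones are 6 degrees wide, numbered 1..60 starting at 180 degrees west.
    zone = int(math.floor((lon + 180.0) / 6.0)) + 1
    zone = min(max(zone, 1), 60)
    # EPSG 326xx covers the northern hemisphere, 327xx the southern.
    base = 32600 if lat >= 0 else 32700
    return f"EPSG:{base + zone}"

# Shape points around longitude -97.7, latitude 30.3 fall in zone 14 north.
print(utm_epsg_from_coords([-97.75, -97.70, -97.68], [30.25, 30.30, 30.32]))
```

A feed that straddles a zone boundary or covers a very large area is a case where passing metric_crs explicitly is the safer choice.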
Diagnostics
The diagnostics DataFrame flags potential issues:
- max_offset_m: Largest perpendicular distance between a stop and the shape. Over 30m usually means the stop is set back from the road (transit centers, park-and-rides) or assigned to the wrong shape.
- n_high_offset: Count of stops with offset > 30m.
- n_degenerate: Count of segments shorter than 5m. Usually near-side/far-side stop pairs at intersections.
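To make the thresholds concrete, here is a small sketch of how these three values could be derived from per-stop offsets and per-segment lengths. `summarize_shape` is a hypothetical helper, not part of the library; only the column names and the 30 m and 5 m thresholds come from the list above.

```python
def summarize_shape(offsets_m, segment_lengths_m, high_offset_m=30.0, degenerate_m=5.0):
    """Compute the three diagnostic values for one shape from raw measurements."""
    return {
        # Largest stop-to-shape distance, or 0.0 when there are no stops.
        "max_offset_m": max(offsets_m, default=0.0),
        # Stops strictly farther than the threshold from the shape.
        "n_high_offset": sum(o > high_offset_m for o in offsets_m),
        # Segments strictly shorter than the degenerate threshold.
        "n_degenerate": sum(length < degenerate_m for length in segment_lengths_m),
    }

# Four stops and three segments: one set-back stop, one near-side/far-side pair.
diag_row = summarize_shape(
    offsets_m=[3.2, 41.0, 12.5, 30.0],
    segment_lengths_m=[250.0, 4.0, 310.0],
)
print(diag_row)
```

Both comparisons are strict here, matching the wording "offset > 30m" and "shorter than 5m"; a stop at exactly 30 m is not counted as high-offset.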
Requirements
- Python ≥ 3.9
- geopandas ≥ 0.12
- shapely ≥ 2.0
- pandas ≥ 1.5
- numpy ≥ 1.23
License
MIT