Segment GTFS shapes into stop-to-stop route segments with loop-aware projection

gtfs-route-segments

Segment GTFS shapes into stop-to-stop route segments that follow the original road geometry.

Many GTFS tools snap stops to shapes with Shapely's .project(), which returns a single nearest point and therefore breaks on routes with loops, lollipop turnarounds, and out-and-back corridors. This library instead projects stops sequentially, in visit order, and chooses among all candidate positions for each stop, so complex route geometries are handled correctly.

Install

pip install gtfs-route-segments

For interactive map visualization:

pip install "gtfs-route-segments[viz]"

Quick Start

Segment an entire feed

from gtfs_segments import segment_feed

result = segment_feed("path/to/gtfs")

segments = result['segments']       # GeoDataFrame of all stop-to-stop segments
diagnostics = result['diagnostics'] # per-shape stats (offsets, degenerates)
failures = result['failures']       # any shapes that failed

Segment a single shape

import pandas as pd
from gtfs_segments import segment_shape

shapes = pd.read_csv("gtfs/shapes.txt")
trips = pd.read_csv("gtfs/trips.txt")
stops = pd.read_csv("gtfs/stops.txt")
stop_times = pd.read_csv("gtfs/stop_times.txt")

result = segment_shape(shapes, trips, stops, stop_times, shape_id=232864)

segments = result['segments']
projected = result['projected_stops']
ruler = result['ruler']
diag = result['diagnostics']

Low-level control

from gtfs_segments import ShapeRuler, project_stops_sequential, segment_route

ruler = ShapeRuler(shapes, shape_id=232864)

# Get stop visit order from stop_times
trip_id = trips[trips['shape_id'] == 232864]['trip_id'].iloc[0]
st = stop_times[stop_times['trip_id'] == trip_id].sort_values('stop_sequence')
stop_sequence = st['stop_id'].tolist()
route_stops = stops[stops['stop_id'].isin(stop_sequence)].copy()

# Project stops onto shape and build segments
projected = project_stops_sequential(ruler, route_stops, stop_sequence)
segments = segment_route(ruler, projected)

How It Works

The Problem

Shapely's .project() finds the globally nearest point on a LineString and returns its distance along the line. On a route that doubles back on itself, a stop at the base of a loop has two valid projection points: the entry and the exit. .project() picks whichever is geometrically closer, often the exit. Once the base stop sits at the exit, the stops inside the loop can no longer be placed in visit order, and they all collapse to the same distance.
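The failure is easy to reproduce on a toy out-and-back shape. The sketch below is a pure-Python stand-in for nearest-point projection, not Shapely's implementation; the helper names (project_on_segment, project_global) and the coordinates are made up for illustration. The stop is served on the outbound leg, about 50 m along the shape, but sits slightly closer to the return leg.

```python
import math

def project_on_segment(p, a, b):
    """Return (t, offset): the fraction along a->b of the point nearest p,
    and the distance from p to that point."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    len_sq = dx * dx + dy * dy
    if len_sq == 0:
        return 0.0, math.dist(p, a)
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / len_sq
    t = max(0.0, min(1.0, t))  # clamp to the segment
    return t, math.hypot(p[0] - (a[0] + t * dx), p[1] - (a[1] + t * dy))

def project_global(shape, stop):
    """Distance along `shape` of the globally nearest point, plus its offset."""
    best_along, best_offset = 0.0, math.inf
    start = 0.0  # distance along the shape at the start of the current segment
    for a, b in zip(shape, shape[1:]):
        seg_len = math.dist(a, b)
        t, offset = project_on_segment(stop, a, b)
        if offset < best_offset:
            best_along, best_offset = start + t * seg_len, offset
        start += seg_len
    return best_along, best_offset

# Outbound along y=0, a short turnaround, then back along y=10 (metres).
shape = [(0, 0), (100, 0), (100, 10), (0, 10)]
stop = (50, 6)  # served outbound, but 4 m from the return leg and 6 m from the outbound leg

along, offset = project_global(shape, stop)
print(along, offset)  # 160.0 4.0 -> snapped to the return leg, not to 50 m outbound
```

Global nearest-point projection has no notion of visit order, so it has no way to prefer the 50 m position here.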

The Solution

  1. ShapeRuler converts the shape to a metric CRS, builds a cumulative distance ruler, and indexes every segment for fast searching.

  2. find_candidates scans every segment and collects all positions where a stop projects within tolerance (default 50 m). A stop at a loop base gets candidates at both entry and exit.

  3. project_stops_sequential processes stops in visit order (from stop_times) and enforces monotonically non-decreasing distances. For each stop, it picks the earliest candidate with a reasonable offset, which biases selection toward forward progress along the route rather than jumping ahead to a later occurrence.

  4. segment_route slices the shape between consecutive stop distances, including all original vertices so segments follow the actual road geometry.
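The four steps above can be sketched end to end in plain Python. This is a simplified illustration under stated assumptions, not the library's code: the function names (cumulative_distances, candidate_positions, project_in_order, slice_between) are hypothetical, coordinates are assumed to be in metres already, there is no spatial index, and "earliest candidate with a reasonable offset" is reduced to "earliest candidate within tolerance at or beyond the previous stop".

```python
import math

def cumulative_distances(shape):
    """Step 1 (simplified): distance from the shape start to each vertex."""
    cum = [0.0]
    for a, b in zip(shape, shape[1:]):
        cum.append(cum[-1] + math.dist(a, b))
    return cum

def candidate_positions(shape, cum, stop, tolerance=50.0):
    """Step 2: every (along, offset) where the stop is within tolerance of a segment."""
    out = []
    for i, (a, b) in enumerate(zip(shape, shape[1:])):
        dx, dy = b[0] - a[0], b[1] - a[1]
        len_sq = dx * dx + dy * dy
        if len_sq == 0:
            continue  # skip zero-length segments
        t = ((stop[0] - a[0]) * dx + (stop[1] - a[1]) * dy) / len_sq
        t = max(0.0, min(1.0, t))
        offset = math.hypot(stop[0] - (a[0] + t * dx), stop[1] - (a[1] + t * dy))
        if offset <= tolerance:
            out.append((cum[i] + t * math.sqrt(len_sq), offset))
    return sorted(out)

def project_in_order(shape, cum, stops_in_order, tolerance=50.0):
    """Step 3: for each stop, take the earliest candidate that does not move backwards."""
    distances, previous = [], 0.0
    for stop in stops_in_order:
        ahead = [along for along, _ in candidate_positions(shape, cum, stop, tolerance)
                 if along >= previous]
        if not ahead:
            raise ValueError(f"no candidate at or after {previous:.1f} m for stop {stop}")
        previous = ahead[0]
        distances.append(previous)
    return distances

def slice_between(shape, cum, start, end):
    """Step 4: sub-polyline from `start` to `end`, keeping the original interior vertices."""
    def point_at(d):
        for i in range(len(shape) - 1):
            if cum[i + 1] > cum[i] and cum[i] <= d <= cum[i + 1]:
                f = (d - cum[i]) / (cum[i + 1] - cum[i])
                (ax, ay), (bx, by) = shape[i], shape[i + 1]
                return (ax + f * (bx - ax), ay + f * (by - ay))
        raise ValueError(f"distance {d} is outside the shape")
    interior = [p for p, d in zip(shape, cum) if start < d < end]
    return [point_at(start)] + interior + [point_at(end)]

# Out-and-back toy shape in metres: outbound on y=0, return on y=10.
shape = [(0, 0), (100, 0), (100, 10), (0, 10)]
stops_in_order = [(50, 6), (100, 5), (20, 9)]  # outbound stop, turnaround, return stop

cum = cumulative_distances(shape)
distances = project_in_order(shape, cum, stops_in_order)
segments = [slice_between(shape, cum, d0, d1) for d0, d1 in zip(distances, distances[1:])]
print(distances)  # [50.0, 100.0, 190.0]
print(segments)
```

The first stop has candidates on both legs, and visit order is what resolves it to the outbound one. The second segment keeps the corner vertex (100, 10), so it follows the shape around the turnaround instead of cutting straight across.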

CRS Auto-Detection

If no CRS is provided, the library auto-detects the UTM zone from the median longitude of the shape coordinates. Override with metric_crs="EPSG:32614" if needed.
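As a rough illustration of how a UTM zone follows from longitude, the sketch below uses the standard 6-degree zone formula. The function name utm_epsg_from_lonlat is hypothetical, and choosing the hemisphere from the median latitude is an assumption made here; the library's actual detection logic may differ. The special zone boundaries around Norway and Svalbard are ignored.

```python
from statistics import median

def utm_epsg_from_lonlat(coords):
    """Guess a WGS84 / UTM EPSG code from (lon, lat) pairs in degrees."""
    lons = [lon for lon, _ in coords]
    lats = [lat for _, lat in coords]
    # UTM zones are 6 degrees wide; zone 1 starts at 180 degrees west.
    zone = int((median(lons) + 180) // 6) + 1
    zone = min(max(zone, 1), 60)  # keep lon = 180 inside the valid range
    # EPSG 326xx is the northern hemisphere, 327xx the southern (assumption: median latitude decides).
    base = 32600 if median(lats) >= 0 else 32700
    return f"EPSG:{base + zone}"

# Shape points around longitude -97.7 fall in UTM zone 14N.
print(utm_epsg_from_lonlat([(-97.75, 30.27), (-97.70, 30.30), (-97.72, 30.25)]))  # EPSG:32614
```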

Diagnostics

The diagnostics DataFrame flags potential issues:

  • max_offset_m: Largest perpendicular distance between a stop and the shape. Over 30 m usually means the stop is set back from the road (transit centers, park-and-rides) or assigned to the wrong shape.
  • n_high_offset: Count of stops with offset > 30 m.
  • n_degenerate: Count of segments shorter than 5 m. Usually near-side/far-side stop pairs at intersections.
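To make the three fields concrete, here is a minimal sketch that computes them for one shape from a list of stop offsets and a list of segment lengths. The function summarize_shape and its parameters are hypothetical and the input numbers are invented; only the field names and the 30 m and 5 m thresholds come from the descriptions above.

```python
def summarize_shape(stop_offsets_m, segment_lengths_m,
                    offset_threshold_m=30.0, min_length_m=5.0):
    """Compute per-shape diagnostics from stop offsets and segment lengths (metres)."""
    return {
        "max_offset_m": max(stop_offsets_m, default=0.0),
        "n_high_offset": sum(1 for o in stop_offsets_m if o > offset_threshold_m),
        "n_degenerate": sum(1 for s in segment_lengths_m if s < min_length_m),
    }

# Invented values: two stops set well back from the shape, one very short segment.
stats = summarize_shape([3.2, 41.0, 12.5, 35.1], [250.0, 4.2, 310.5])
print(stats)  # {'max_offset_m': 41.0, 'n_high_offset': 2, 'n_degenerate': 1}
```

A shape with a nonzero n_high_offset or n_degenerate is a candidate for manual review rather than a definite error.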

Requirements

  • Python ≥ 3.9
  • geopandas ≥ 0.12
  • shapely ≥ 2.0
  • pandas ≥ 1.5
  • numpy ≥ 1.23

License

MIT
