Extract recurring transit DayTypes and compare GTFS schedules.
Minimal Python package for extracting recurring public transit DayTypes from standard GTFS feeds and comparing their schedule sets.
The package implements the core methodology from the DayType paper:
stable Route Pattern keys from ordered stop coordinates using H3 cells;
date-level DayType extraction from GTFS calendars, trips, and stop times;
exact schedule-set comparison;
time-tolerant one-to-one trip matching for small timetable shifts.
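The date-level extraction step can be pictured as grouping service dates by their exact set of trip specifications. The sketch below is a minimal illustration of that idea, not the package's implementation: the `TripSpec` tuple layout, the `group_dates_into_daytypes` helper, and the `DT<n>` numbering by descending date count are all assumptions made for the example.

```python
from collections import defaultdict
from datetime import date

# Hypothetical trip specification: (route_pattern_key, first_departure, last_arrival).
TripSpec = tuple[str, str, str]


def group_dates_into_daytypes(
    schedule_by_date: dict[date, set[TripSpec]],
) -> dict[str, list[date]]:
    """Group dates whose trip-specification sets are identical."""
    dates_by_schedule: dict[frozenset[TripSpec], list[date]] = defaultdict(list)
    for service_date, specs in schedule_by_date.items():
        # frozenset makes the whole schedule hashable, so it can act as a dict key.
        dates_by_schedule[frozenset(specs)].append(service_date)

    # Assumed labelling: most frequent schedule first, ties broken by earliest date.
    groups = sorted(
        (sorted(dates) for dates in dates_by_schedule.values()),
        key=lambda dates: (-len(dates), dates[0]),
    )
    return {f"DT{index}": dates for index, dates in enumerate(groups)}


# Made-up schedules: two dates share one schedule, a third date differs.
weekday = {("rp-a", "06:00:00", "06:40:00"), ("rp-a", "07:00:00", "07:40:00")}
weekend = {("rp-a", "08:00:00", "08:40:00")}
daytypes = group_dates_into_daytypes(
    {
        date(2024, 1, 12): weekday,
        date(2024, 1, 13): weekend,
        date(2024, 1, 15): set(weekday),
    }
)
print(daytypes)
```

Two dates land in the same DayType only when every trip specification matches exactly, which is why small timetable shifts call for the separate time-tolerant comparison.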
Install for Development
uv sync --extra dev
Example Data
The repository includes two small GTFS snapshots under Git LFS for examples and paper-method validation:
examples/data/mdb-3-202402080013.zip: Barrie Transit.
examples/data/mdb-734-202602180121.zip: Halifax Transit.
After cloning the repository, make sure Git LFS files are present:
git lfs pull
All commands below should be run from the package root:
cd packages/gtfs-daytype
Quick CLI Tour
The package installs a command named gtfs-daytype.
Show help:
uv run gtfs-daytype --help
Extract DayTypes from a GTFS feed:
uv run gtfs-daytype extract examples/data/mdb-3-202402080013.zip --out results
This writes:
results/daytypes.csv: one row per DayType;
results/calendar_daytypes.csv: date-to-DayType assignment;
results/daytype_trips.csv: Route-Pattern-time trip specifications;
results/route_patterns.csv: stable Route Pattern keys and stop sequences.
Compare DayTypes with exact and time-tolerant metrics:
uv run gtfs-daytype compare examples/data/mdb-3-202402080013.zip --epsilon 0 1 3 5 --out results
This additionally writes results/daytype_similarity.csv.
Command Line Examples
Use any GTFS .zip archive or extracted GTFS folder as input.
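To illustrate what accepting either input form involves, here is a small standard-library sketch that reads one GTFS table from a `.zip` archive or from an extracted folder. `read_table_sketch` is a hypothetical helper written for this example; it is not the package's `read_gtfs_table`, and it assumes the GTFS files sit at the top level of the archive.

```python
import csv
import io
import tempfile
import zipfile
from pathlib import Path
from typing import Union


def read_table_sketch(feed: Union[str, Path], name: str) -> list[dict[str, str]]:
    """Read one GTFS text file from a .zip archive or an extracted folder."""
    feed = Path(feed)
    if feed.is_dir():
        raw = (feed / name).read_bytes()
    else:
        with zipfile.ZipFile(feed) as archive:
            raw = archive.read(name)
    # utf-8-sig drops the byte-order mark that some feeds put at the start of a file.
    text = raw.decode("utf-8-sig")
    return list(csv.DictReader(io.StringIO(text, newline="")))


# Build a tiny throwaway feed in both forms and check that they read identically.
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp) / "feed"
    folder.mkdir()
    (folder / "trips.txt").write_text(
        "route_id,service_id,trip_id\nR1,WK,T1\n", encoding="utf-8"
    )
    archive_path = Path(tmp) / "feed.zip"
    with zipfile.ZipFile(archive_path, "w") as archive:
        archive.write(folder / "trips.txt", arcname="trips.txt")

    rows_from_folder = read_table_sketch(folder, "trips.txt")
    rows_from_zip = read_table_sketch(archive_path, "trips.txt")

print(rows_from_folder == rows_from_zip)
```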
Extract DayTypes and write output CSV files:
uv run gtfs-daytype extract examples/data/mdb-3-202402080013.zip --out results/barrie
Extract using direct stop-id keys instead of H3 keys. This is useful for debugging because the key is based on the GTFS stop sequence itself:
uv run gtfs-daytype extract examples/data/mdb-3-202402080013.zip \
--key-method stop_ids \
--out results/barrie-stop-ids
Compare DayTypes with exact matching only:
uv run gtfs-daytype compare examples/data/mdb-3-202402080013.zip --epsilon 0 --out results/barrie
Compare DayTypes with exact plus one-, three-, and five-minute tolerances:
uv run gtfs-daytype compare examples/data/mdb-3-202402080013.zip \
--epsilon 0 1 3 5 \
--out results/barrie
Print the full date-to-DayType calendar:
uv run gtfs-daytype calendar examples/data/mdb-3-202402080013.zip
Print only a date range:
uv run gtfs-daytype calendar examples/data/mdb-3-202402080013.zip \
--start-date 2024-01-01 \
--end-date 2024-01-31
Print and save the calendar table:
uv run gtfs-daytype calendar examples/data/mdb-3-202402080013.zip --out results/calendar.csv
Inspect Route Pattern keys for one route:
uv run gtfs-daytype inspect-route examples/data/mdb-3-202402080013.zip --route-id ROUTE_ID
Inspect only selected shape_id values on a route:
uv run gtfs-daytype inspect-route examples/data/mdb-3-202402080013.zip \
--route-id ROUTE_ID \
--shape-id SHAPE_ID_1 \
--shape-id SHAPE_ID_2 \
--out results/route-inspection.csv
Output Files
extract and compare write the files below; daytype_similarity.csv is produced only by compare:
- daytypes.csv
One row per extracted DayType, with representative date, number of dates, and number of trip specifications.
- calendar_daytypes.csv
Date-to-DayType assignment.
- daytype_trips.csv
Trip specifications defining each DayType: Route Pattern key, first departure time, last arrival time, and source trip_id for traceability.
- route_patterns.csv
Stable Route Pattern keys, ordered H3 cells, and ordered stop identifiers.
- daytype_similarity.csv
Pairwise exact or time-tolerant schedule similarity. Written by compare.
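A quick way to check what a run produced is to print the header and first rows of each output file. The sketch below uses only the standard library and makes no assumption about column names; `preview_csv` is a hypothetical helper, and the `results` directory matches the `--out results` examples above.

```python
import csv
from itertools import islice
from pathlib import Path


def preview_csv(path, limit=5):
    """Return the header and the first `limit` rows of a CSV file."""
    with Path(path).open(newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        rows = list(islice(reader, limit))
        return reader.fieldnames or [], rows


output_names = [
    "daytypes.csv",
    "calendar_daytypes.csv",
    "daytype_trips.csv",
    "route_patterns.csv",
    "daytype_similarity.csv",
]
for name in output_names:
    path = Path("results") / name
    if not path.exists():
        continue  # daytype_similarity.csv only exists after a compare run
    header, rows = preview_csv(path, limit=3)
    print(name, header)
    for row in rows:
        print("  ", row)
```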
Tutorial 1: Reproduce a Paper Case Study
Run extraction on the included Barrie Transit GTFS snapshot:
uv run gtfs-daytype extract \
examples/data/mdb-3-202402080013.zip \
--out results/mdb-3-202402080013
Expected terminal summary:
DayTypes: 3
Dates assigned: 84
DT0: representative_date=2024-01-12, dates=60, trips=658
DT1: representative_date=2024-01-13, dates=13, trips=571
DT2: representative_date=2024-01-14, dates=11, trips=307
Tutorial 2: Compare DayTypes
Compute exact and tolerant schedule similarity:
uv run gtfs-daytype compare \
examples/data/mdb-3-202402080013.zip \
--epsilon 0 1 3 5 \
--out results/mdb-3-202402080013
Inspect the output:
head results/mdb-3-202402080013/daytype_similarity.csv
Columns include:
distance: normalized exact or tolerant DayType distance;
matches: exact or time-tolerant one-to-one matches;
containment_a_in_b and containment_b_in_a: directional containment;
unmatched_a and unmatched_b: unmatched trips after tolerance;
trip_count_imbalance: absolute difference in schedule-set sizes.
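To make these columns concrete, the sketch below computes comparable quantities for two small schedule sets. It is an illustration under stated assumptions, not the package's algorithm: trips are reduced to `(pattern_key, first_departure)` pairs, containment is taken as matches divided by set size, and `distance` uses a Jaccard-style normalization. The package's exact formulas may differ.

```python
from collections import defaultdict


def gtfs_time_to_seconds(value: str) -> int:
    """Convert a GTFS HH:MM:SS string to seconds; hours may exceed 23."""
    hours, minutes, seconds = (int(part) for part in value.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def match_trips(trips_a, trips_b, epsilon_minutes=0):
    """Count one-to-one matches between two lists of (pattern_key, departure)."""
    tolerance = epsilon_minutes * 60
    by_key_a, by_key_b = defaultdict(list), defaultdict(list)
    for key, departure in trips_a:
        by_key_a[key].append(gtfs_time_to_seconds(departure))
    for key, departure in trips_b:
        by_key_b[key].append(gtfs_time_to_seconds(departure))

    matches = 0
    for key in by_key_a.keys() & by_key_b.keys():
        times_a, times_b = sorted(by_key_a[key]), sorted(by_key_b[key])
        i = j = 0
        # Two-pointer sweep over sorted times: each trip is paired at most once.
        while i < len(times_a) and j < len(times_b):
            if abs(times_a[i] - times_b[j]) <= tolerance:
                matches += 1
                i += 1
                j += 1
            elif times_a[i] < times_b[j]:
                i += 1
            else:
                j += 1
    return matches


def similarity_row(trips_a, trips_b, epsilon_minutes=0):
    """Build one row of similarity metrics (formulas are assumptions)."""
    matches = match_trips(trips_a, trips_b, epsilon_minutes)
    size_a, size_b = len(trips_a), len(trips_b)
    union = size_a + size_b - matches
    return {
        "matches": matches,
        "unmatched_a": size_a - matches,
        "unmatched_b": size_b - matches,
        "containment_a_in_b": matches / size_a if size_a else 0.0,
        "containment_b_in_a": matches / size_b if size_b else 0.0,
        "trip_count_imbalance": abs(size_a - size_b),
        # Assumed normalization: Jaccard-style distance over matched trips.
        "distance": 1 - matches / union if union else 0.0,
    }


# Made-up schedules: one trip is shifted by two minutes, one has no counterpart.
schedule_a = [("rp-a", "06:00:00"), ("rp-a", "07:00:00"), ("rp-b", "25:10:00")]
schedule_b = [("rp-a", "06:02:00"), ("rp-a", "07:00:00")]
exact = similarity_row(schedule_a, schedule_b, epsilon_minutes=0)
tolerant = similarity_row(schedule_a, schedule_b, epsilon_minutes=3)
print(exact["matches"], tolerant["matches"])
```

With a three-minute tolerance the shifted 06:00 trip finds its 06:02 counterpart, so the match count rises from 1 to 2 while the trip on `rp-b` stays unmatched.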
Tutorial 3: Print the Date Calendar
Print the date-to-DayType table:
uv run gtfs-daytype calendar \
examples/data/mdb-3-202402080013.zip
Limit to a date range and also save CSV:
uv run gtfs-daytype calendar \
examples/data/mdb-3-202402080013.zip \
--start-date 2024-01-12 \
--end-date 2024-01-21 \
--out results/barrie-calendar-sample.csv
Example output:
date       dow daytype representative trips dates
---------- --- ------- -------------- ----- -----
2024-01-12 Fri DT0     2024-01-12     658   60
2024-01-13 Sat DT1     2024-01-13     571   13
2024-01-14 Sun DT2     2024-01-14     307   11
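A calendar like this is often easier to read when summarized by weekday. The sketch below groups a date-to-DayType mapping by weekday name; it assumes the mapping uses `datetime.date` keys and DayType id values, which may not match the exact types behind `result.date_to_daytype`. The sample data repeats only the three rows shown above.

```python
from collections import defaultdict
from datetime import date


def weekdays_by_daytype(date_to_daytype):
    """Map each DayType id to the sorted weekday names of its dates."""
    seen = defaultdict(set)
    for current_date, daytype_id in date_to_daytype.items():
        seen[daytype_id].add(current_date.weekday())  # Monday is 0
    names = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]
    return {
        daytype_id: [names[index] for index in sorted(indexes)]
        for daytype_id, indexes in sorted(seen.items())
    }


sample = {
    date(2024, 1, 12): "DT0",
    date(2024, 1, 13): "DT1",
    date(2024, 1, 14): "DT2",
}
summary = weekdays_by_daytype(sample)
print(summary)
```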
Tutorial 4: Inspect Shape-ID Inconsistency
In the paper's Barrie case study, Route 2A uses two different shape_id values for the same passenger-facing stop sequence. Because the H3 Route Pattern key is built from ordered stop coordinates rather than from shape_id, it should collapse both shapes to one key.
uv run gtfs-daytype inspect-route \
examples/data/mdb-3-202402080013.zip \
--route-id c8bb5d6b-0a67-426c-8742-36d10b8c15b8 \
--shape-id 58ea5474-ef60-4dbc-bd74-5c122b012d32 \
--shape-id c0134c50-192b-4a6c-8d93-d0534d3b0fab \
--out results/barrie-route-2a-inspection.csv
Expected terminal summary:
Matched trips: 71
Trips by shape_id:
58ea5474-ef60-4dbc-bd74-5c122b012d32: 58
c0134c50-192b-4a6c-8d93-d0534d3b0fab: 13
Route Pattern keys: 1
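The collapse to a single key follows from building the key out of stop locations only. The sketch below shows the idea with a plain latitude/longitude grid as a stand-in for H3 cells, so it runs without third-party packages. The `grid_cell` and `route_pattern_key` helpers, the cell size, the hash truncation, and the sample coordinates are all hypothetical and do not reproduce the package's key format.

```python
import hashlib
import math


def grid_cell(lat: float, lon: float, cell_degrees: float = 0.001) -> tuple[int, int]:
    """Stand-in for an H3 cell: snap a coordinate to a square lat/lon grid."""
    return (math.floor(lat / cell_degrees), math.floor(lon / cell_degrees))


def route_pattern_key(stop_coords) -> str:
    """Hash the ordered sequence of cells visited by a trip's stops."""
    cells = [grid_cell(lat, lon) for lat, lon in stop_coords]
    payload = ";".join(f"{row},{col}" for row, col in cells)
    return hashlib.sha256(payload.encode("ascii")).hexdigest()[:16]


# Made-up stop coordinates shared by two trips that carry different shape_id values.
stops = [(44.3894, -79.6903), (44.3921, -79.6858), (44.3967, -79.6812)]
trips = [
    {"trip_id": "t1", "shape_id": "shape-1", "stops": stops},
    {"trip_id": "t2", "shape_id": "shape-2", "stops": list(stops)},
]

keys = {trip["shape_id"]: route_pattern_key(trip["stops"]) for trip in trips}
reversed_key = route_pattern_key(list(reversed(stops)))
print(len(set(keys.values())))          # both shapes share one key
print(reversed_key in keys.values())    # the opposite direction gets its own key
```

The shape_id never enters the key, so it cannot split a pattern. Any grid-based key has the same caveat: two nearby stops on opposite sides of a cell boundary fall into different cells.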
Tutorial 5: Validate Paper Feeds
Run validation against the example GTFS snapshots:
uv run python scripts/validate_paper_feeds.py
Expected output:
mdb-3-202402080013.zip: DayType counts match paper values
mdb-734-202602180121.zip: DayType counts match paper values
Barrie Route 2A: two shape_id values collapse to one H3 Route Pattern key
Python
from gtfs_daytype import compare_daytypes, extract_daytypes
result = extract_daytypes('feed.zip')
distances = compare_daytypes(result.daytypes, epsilon_minutes=1)
Python Examples
Basic extraction:
from gtfs_daytype import extract_daytypes
result = extract_daytypes('feed.zip')
for daytype in result.daytypes:
    print(daytype.id, daytype.representative_date, len(daytype.dates), len(daytype.trips))
Print date-to-DayType assignments:
from gtfs_daytype import extract_daytypes
result = extract_daytypes('feed.zip')
for current_date, daytype_id in sorted(result.date_to_daytype.items()):
    print(current_date, daytype_id)
Compare all DayTypes at a one-minute tolerance:
from gtfs_daytype import compare_daytypes, extract_daytypes
result = extract_daytypes('feed.zip')
rows = compare_daytypes(result.daytypes, epsilon_minutes=1)
for row in rows:
    print(row.daytype_a, row.daytype_b, row.distance, row.matches)
Use stop-id keys instead of H3 keys for debugging:
from gtfs_daytype import extract_daytypes
result = extract_daytypes('feed.zip', key_method='stop_ids')
Inspect Route Pattern keys from Python:
from collections import defaultdict
from gtfs_daytype.io import read_gtfs_table
from gtfs_daytype.patterns import build_route_patterns
feed = 'feed.zip'
trips = read_gtfs_table(feed, 'trips.txt')
stop_times = read_gtfs_table(feed, 'stop_times.txt')
stops = read_gtfs_table(feed, 'stops.txt')
patterns = build_route_patterns(trips, stop_times, stops)
by_pattern = defaultdict(list)
for trip in trips:
    if trip['route_id'] == 'ROUTE_ID':
        by_pattern[patterns[trip['trip_id']].key].append(trip['trip_id'])
for pattern, trip_ids in by_pattern.items():
    print(pattern, len(trip_ids))
Runnable example scripts are provided in examples/:
uv run python examples/extract_daytypes.py examples/data/mdb-3-202402080013.zip
uv run python examples/compare_daytypes.py examples/data/mdb-3-202402080013.zip --epsilon 0 1 3 5
uv run python examples/calendar_table.py examples/data/mdb-3-202402080013.zip
uv run python examples/inspect_route_patterns.py examples/data/mdb-3-202402080013.zip --route-id ROUTE_ID
Citation
If you use gtfs-daytype in academic work, please cite the software package. The repository includes CITATION.cff so GitHub and citation managers can generate citation metadata automatically.
Suggested BibTeX entry:
@software{makarov_gtfs_daytype_2026,
author = {Makarov, Evgeny},
title = {gtfs-daytype: GTFS-Based Transit DayType Extraction and Schedule Similarity},
year = {2026},
version = {0.1.0},
url = {https://github.com/emakarov/gtfs-daytype}
}
When citing the methodology rather than only the implementation, cite both this software package and the associated paper once the paper citation is available.
Credits
The initial idea for using DayTypes as recurring public-transit schedule classes was contributed by Georgy Taubkin.