Fast GTFS validator written in Rust

GTFS Guru Python

High-performance GTFS feed validator with Python bindings. Written in Rust, exposed via PyO3.

Installation

pip install gtfs-guru

From Source

pip install maturin
maturin build --release
pip install target/wheels/gtfs_guru-*.whl

Quick Start

import gtfs_guru

# Validate a GTFS feed
result = gtfs_guru.validate("/path/to/gtfs.zip")

print(f"Valid: {result.is_valid}")
print(f"Errors: {result.error_count}")
print(f"Warnings: {result.warning_count}")

# Print errors
for error in result.errors():
    print(f"{error.code}: {error.message}")

API Reference

Functions

validate(path, country_code=None, date=None) -> ValidationResult

Validate a GTFS feed.

Parameters:

  • path (str): Path to a GTFS zip file or directory
  • country_code (str, optional): Two-letter ISO country code (e.g., "US", "RU")
  • date (str, optional): Validation date in YYYY-MM-DD format

Returns: ValidationResult object

Example:

result = gtfs_guru.validate(
    "/path/to/gtfs.zip",
    country_code="US",
    date="2025-01-15"
)
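
The date parameter takes a YYYY-MM-DD string, which is the format Python's date.isoformat() produces. A minimal sketch, assuming you want to check a feed as of some number of days ahead; validation_date is a hypothetical helper and not part of gtfs_guru:

```python
from datetime import date, timedelta
from typing import Optional


def validation_date(days_ahead: int, today: Optional[date] = None) -> str:
    """Return the date `days_ahead` days after `today` as a YYYY-MM-DD string.

    Hypothetical helper for illustration; `today` defaults to the current date.
    """
    base = today if today is not None else date.today()
    return (base + timedelta(days=days_ahead)).isoformat()


# A fixed starting date keeps the example reproducible.
print(validation_date(30, today=date(2025, 1, 15)))  # 2025-02-14
```

The returned string can then be passed as gtfs_guru.validate("/path/to/gtfs.zip", date=validation_date(30)).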

async validate_async(path, country_code=None, date=None, on_progress=None) -> ValidationResult

Validate a GTFS feed asynchronously (non-blocking).

Parameters:

  • path (str): Path to GTFS zip file or directory
  • country_code (str, optional): ISO country code
  • date (str, optional): Validation date in YYYY-MM-DD format
  • on_progress (Callable[[ProgressInfo], None], optional): Callback for progress updates

Example:

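import asyncio

import gtfs_guru


async def main():
    # Called with a ProgressInfo object as validation advances
    def on_progress(info):
        print(f"{info.stage}: {info.current}/{info.total}")

    result = await gtfs_guru.validate_async(
        "/path/to/gtfs.zip",
        on_progress=on_progress,
    )
    print(f"Valid: {result.is_valid}")


asyncio.run(main())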
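
Because validate_async is awaitable, several feeds can be awaited together with asyncio.gather. The sketch below is one possible approach, not a documented feature: validate_many is a hypothetical helper, and the validator is passed in as an argument so the sketch can be tried without a real feed. Whether the validations actually overlap in time depends on how validate_async is implemented, which is not described here.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, Sequence


async def validate_many(
    paths: Sequence[str],
    validate_async: Callable[..., Awaitable[Any]],
    **options: Any,
) -> Dict[str, Any]:
    """Await one validation per path and map each path to its result.

    Hypothetical helper: `validate_async` is expected to behave like
    gtfs_guru.validate_async (path first, options as keyword arguments).
    """
    results = await asyncio.gather(
        *(validate_async(path, **options) for path in paths)
    )
    # gather() preserves input order, so zip pairs each path with its result.
    return dict(zip(paths, results))
```

With the real library this would be called as asyncio.run(validate_many(["a.zip", "b.zip"], gtfs_guru.validate_async, country_code="US")), where the file names are placeholders.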

version() -> str

Get validator version.

>>> gtfs_guru.version()
'0.9.0'

notice_codes() -> list[str]

Get list of all available notice codes.

>>> len(gtfs_guru.notice_codes())
164
>>> gtfs_guru.notice_codes()[:3]
['attribution_without_role', 'bidirectional_exit_gate', 'block_trips_with_overlapping_stop_times']

notice_schema() -> dict

Get schema for all notice types with descriptions and severity levels.

>>> schema = gtfs_guru.notice_schema()
>>> schema['missing_required_field']
{'severity': 'ERROR', 'description': '...'}
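
Since each schema entry carries a severity, the schema can be used to list which notice codes count as errors. A minimal sketch, assuming every entry has the 'severity' key shown above; codes_by_severity is a hypothetical helper, and the sample dictionary stands in for the real return value of notice_schema():

```python
from collections import defaultdict
from typing import Dict, List, Mapping


def codes_by_severity(schema: Mapping[str, Mapping[str, str]]) -> Dict[str, List[str]]:
    """Group notice codes by their 'severity' entry, each group sorted by code."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for code, entry in schema.items():
        grouped[entry["severity"]].append(code)
    return {severity: sorted(codes) for severity, codes in grouped.items()}


# Stand-in for gtfs_guru.notice_schema(). The two "example_*" codes are
# made up for illustration and are not real notice codes.
sample_schema = {
    "missing_required_field": {"severity": "ERROR", "description": "..."},
    "example_warning_code": {"severity": "WARNING", "description": "..."},
    "example_error_code": {"severity": "ERROR", "description": "..."},
}

grouped = codes_by_severity(sample_schema)
print(grouped["ERROR"])  # ['example_error_code', 'missing_required_field']
```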

Classes

ValidationResult

Result of GTFS validation.

Attributes:

Attribute               | Type         | Description
------------------------|--------------|--------------------------
is_valid                | bool         | True if no errors
error_count             | int          | Number of errors
warning_count           | int          | Number of warnings
info_count              | int          | Number of info notices
validation_time_seconds | float        | Validation time in seconds
notices                 | list[Notice] | All validation notices

Methods:

# Get notices by severity
errors = result.errors()      # list[Notice]
warnings = result.warnings()  # list[Notice]
infos = result.infos()        # list[Notice]

# Filter by notice code
notices = result.by_code("missing_required_field")  # list[Notice]

# Export
result.save_json("/path/to/report.json")
result.save_html("/path/to/report.html")
json_str = result.to_json()   # str
report = result.to_dict()     # dict
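
The attributes above are enough to gate a CI job on the validation outcome. A minimal sketch, assuming only is_valid and warning_count as documented; exit_code and its fail_on_warnings flag are hypothetical, and the SimpleNamespace objects stand in for real ValidationResult instances:

```python
from types import SimpleNamespace


def exit_code(result, fail_on_warnings: bool = False) -> int:
    """Return a process exit status: 0 if the feed passes, 1 otherwise.

    Hypothetical helper; reads only `is_valid` and `warning_count`.
    """
    if not result.is_valid:
        return 1
    if fail_on_warnings and result.warning_count > 0:
        return 1
    return 0


# Stand-ins for ValidationResult objects, for illustration only.
clean = SimpleNamespace(is_valid=True, warning_count=0)
noisy = SimpleNamespace(is_valid=True, warning_count=3)
broken = SimpleNamespace(is_valid=False, warning_count=0)

print(exit_code(clean), exit_code(noisy), exit_code(noisy, fail_on_warnings=True), exit_code(broken))
# 0 0 1 1
```

In a script this would typically end with sys.exit(exit_code(gtfs_guru.validate("/path/to/gtfs.zip"))).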

Notice

A single validation notice.

Attributes:

Attribute | Type        | Description
----------|-------------|--------------------------------------------
code      | str         | Notice code (e.g., "missing_required_field")
severity  | str         | "ERROR", "WARNING", or "INFO"
message   | str         | Human-readable message
file      | str | None  | GTFS filename
row       | int | None  | CSV row number
field     | str | None  | Field name

Methods:

# Get context field
value = notice.get("fieldName")  # Any | None

# Get all context
ctx = notice.context()  # dict[str, Any]
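
Because file, row, and field may each be None, code that prints notices needs to handle the missing parts. A minimal sketch of one way to do that; format_notice and the "<feed>" placeholder are hypothetical, and the SimpleNamespace objects stand in for real Notice instances:

```python
from types import SimpleNamespace


def format_notice(notice) -> str:
    """Render a notice on one line, skipping location parts that are None."""
    location = notice.file or "<feed>"  # placeholder when no file is attached
    if notice.row is not None:
        location += f":{notice.row}"
    if notice.field is not None:
        location += f" [{notice.field}]"
    return f"{notice.severity} {location} {notice.code}: {notice.message}"


# Stand-ins for Notice objects; the messages are made up for illustration.
with_location = SimpleNamespace(
    code="missing_required_field", severity="ERROR",
    message="required field is empty",
    file="stops.txt", row=5, field="stop_name",
)
without_location = SimpleNamespace(
    code="missing_required_field", severity="ERROR",
    message="required field is empty",
    file=None, row=None, field=None,
)

print(format_notice(with_location))
print(format_notice(without_location))
```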

Examples

Basic Validation

import gtfs_guru

result = gtfs_guru.validate("/path/to/gtfs.zip")

if result.is_valid:
    print("Feed is valid!")
else:
    print(f"Found {result.error_count} errors")
    for error in result.errors():
        print(f"  - {error.code}: {error.message}")

Detailed Analysis

from collections import Counter

import gtfs_guru

result = gtfs_guru.validate("/path/to/gtfs.zip")

# Count errors by notice code
error_counts = Counter(e.code for e in result.errors())
for code, count in error_counts.most_common(10):
    print(f"{code}: {count}")

# Find all missing required fields
for notice in result.by_code("missing_required_field"):
    file = notice.file
    field = notice.get("fieldName")
    row = notice.row
    print(f"{file}:{row} - missing {field}")

Save Reports

import gtfs_guru

result = gtfs_guru.validate("/path/to/gtfs.zip")

# Save JSON report (same format as Java validator)
result.save_json("report.json")

# Save HTML report
result.save_html("report.html")

# Get as Python dict
report = result.to_dict()
summary = report["summary"]
print(f"Agencies: {summary.get('agencies', [])}")
print(f"Routes: {summary.get('routes', {}).get('count', 0)}")

Validation with Options

import gtfs_guru

# Validate for a specific country (affects some rules)
result = gtfs_guru.validate(
    "/path/to/gtfs.zip",
    country_code="DE"
)

# Validate as of a specific date
result = gtfs_guru.validate(
    "/path/to/gtfs.zip",
    date="2025-06-01"
)

Supported Platforms

Platform | Architecture   | Python
---------|----------------|-------
macOS    | ARM64 (M1/M2)  | 3.8+
macOS    | x86_64 (Intel) | 3.8+
Windows  | x86_64         | 3.8+
Linux    | x86_64         | 3.8+

Performance

Typical validation times (compared to Java validator):

Feed Size      | Java | Rust/Python
---------------|------|------------
Small (<1MB)   | ~2s  | ~0.05s
Medium (10MB)  | ~10s | ~0.5s
Large (100MB)  | ~60s | ~3s

License

Apache-2.0

