Fast GTFS validator written in Rust

GTFS Guru Python

High-performance GTFS feed validator with Python bindings. Written in Rust, exposed via PyO3.

Installation

pip install gtfs-guru

From Source

pip install maturin
maturin build --release
pip install target/wheels/gtfs_guru-*.whl
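After either install route, a quick check that the module is importable can save confusion later. The sketch below relies only on the standard library and the `version()` function documented in this README; the helper name `installed_version` is hypothetical.

```python
import importlib
import importlib.util


def installed_version(module_name="gtfs_guru"):
    """Return the module's version() string, or None if unavailable."""
    # find_spec returns None when no top-level module of that name is installed.
    if importlib.util.find_spec(module_name) is None:
        return None
    module = importlib.import_module(module_name)
    # version() is documented for gtfs_guru; other modules may not have it.
    version_fn = getattr(module, "version", None)
    return version_fn() if callable(version_fn) else None


if __name__ == "__main__":
    print(installed_version() or "gtfs_guru is not installed")
```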

Quick Start

import gtfs_guru

# Validate a GTFS feed
result = gtfs_guru.validate("/path/to/gtfs.zip")

print(f"Valid: {result.is_valid}")
print(f"Errors: {result.error_count}")
print(f"Warnings: {result.warning_count}")

# Print errors
for error in result.errors():
    print(f"{error.code}: {error.message}")

API Reference

Functions

validate(path, country_code=None, date=None) -> ValidationResult

Validate a GTFS feed.

Parameters:

  • path (str): Path to GTFS zip file or directory
  • country_code (str, optional): ISO country code (e.g., "US", "RU")
  • date (str, optional): Validation date in YYYY-MM-DD format

Returns: ValidationResult object

Example:

result = gtfs_guru.validate(
    "/path/to/gtfs.zip",
    country_code="US",
    date="2025-01-15"
)
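Because `date` is passed as a string, callers holding a `datetime.date` need to format it first. A minimal sketch, assuming a hypothetical helper `validation_options` that builds the keyword arguments documented above; upper-casing the country code is a convenience here, not something the library is known to require.

```python
from datetime import date


def validation_options(country_code=None, on_date=None):
    """Build keyword arguments for validate() from typed Python values."""
    options = {}
    if country_code is not None:
        # The README's examples use upper-case codes such as "US" and "DE".
        options["country_code"] = country_code.strip().upper()
    if on_date is not None:
        # date.isoformat() yields YYYY-MM-DD, the documented format.
        options["date"] = on_date.isoformat()
    return options


options = validation_options("us", date(2025, 1, 15))
print(options)  # {'country_code': 'US', 'date': '2025-01-15'}
```

The resulting dict can be unpacked into the call, as in `gtfs_guru.validate("/path/to/gtfs.zip", **options)`.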

async validate_async(path, country_code=None, date=None, on_progress=None) -> ValidationResult

Validate a GTFS feed asynchronously (non-blocking).

Parameters:

  • path (str): Path to GTFS zip file or directory
  • country_code (str, optional): ISO country code
  • date (str, optional): Validation date in YYYY-MM-DD format
  • on_progress (Callable[[ProgressInfo], None], optional): Callback for progress updates

Returns: ValidationResult object

Example:

import asyncio

import gtfs_guru

async def main():
    # Called by the validator with progress updates
    def on_progress(info):
        print(f"{info.stage}: {info.current}/{info.total}")

    result = await gtfs_guru.validate_async(
        "/path/to/gtfs.zip",
        on_progress=on_progress
    )
    print(f"Valid: {result.is_valid}")

asyncio.run(main())
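Because `validate_async` is awaitable, several feeds can be submitted together with `asyncio.gather`. The sketch below is an illustration under assumptions: `validate_many` is a hypothetical helper, and it takes the coroutine function as a parameter so it can be exercised with a stand-in. In real use you would pass `gtfs_guru.validate_async`. Whether the validations actually overlap depends on how the library schedules its work, which this README does not specify.

```python
import asyncio


async def validate_many(paths, validate_fn, **options):
    """Validate several feeds concurrently and map each path to its result."""
    # gather() preserves input order, so results line up with paths.
    results = await asyncio.gather(
        *(validate_fn(path, **options) for path in paths)
    )
    return dict(zip(paths, results))


async def fake_validate(path, **options):
    """Stand-in for gtfs_guru.validate_async, for demonstration only."""
    await asyncio.sleep(0)
    return f"validated {path}"


demo = asyncio.run(validate_many(["a.zip", "b.zip"], fake_validate))
print(demo)  # {'a.zip': 'validated a.zip', 'b.zip': 'validated b.zip'}
```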

version() -> str

Get validator version.

>>> gtfs_guru.version()
'0.9.0'

notice_codes() -> list[str]

Get list of all available notice codes.

>>> len(gtfs_guru.notice_codes())
164
>>> gtfs_guru.notice_codes()[:3]
['attribution_without_role', 'bidirectional_exit_gate', 'block_trips_with_overlapping_stop_times']
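One use for this list is catching typos in a notice code before filtering results with it. A minimal sketch, assuming a hypothetical helper `unknown_codes`; the sample list is copied from the output above, and in real use you would pass `gtfs_guru.notice_codes()`.

```python
def unknown_codes(requested, available):
    """Return the requested codes that are not in the available list, in order."""
    known = set(available)
    return [code for code in requested if code not in known]


# Sample taken from the README output; not the full list of codes.
available = [
    "attribution_without_role",
    "bidirectional_exit_gate",
    "block_trips_with_overlapping_stop_times",
]
print(unknown_codes(["bidirectional_exit_gate", "bidirectional_exit_gat"], available))
# ['bidirectional_exit_gat']
```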

notice_schema() -> dict

Get schema for all notice types with descriptions and severity levels.

>>> schema = gtfs_guru.notice_schema()
>>> schema['missing_required_field']
{'severity': 'ERROR', 'description': '...'}
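The schema can be regrouped to list codes per severity level. The sketch below assumes every entry is a dict with a `'severity'` key, as in the output above; `codes_by_severity` is a hypothetical helper and the sample data is hand-written, not taken from the library.

```python
from collections import defaultdict


def codes_by_severity(schema):
    """Group notice codes by the 'severity' value of their schema entry."""
    groups = defaultdict(list)
    for code, entry in schema.items():
        # Entries without a severity are collected under "UNKNOWN".
        groups[entry.get("severity", "UNKNOWN")].append(code)
    return {severity: sorted(codes) for severity, codes in groups.items()}


# Illustrative sample in the documented shape; only the first code is from the README.
sample_schema = {
    "missing_required_field": {"severity": "ERROR", "description": "..."},
    "example_warning_code": {"severity": "WARNING", "description": "..."},
    "example_error_code": {"severity": "ERROR", "description": "..."},
}
print(codes_by_severity(sample_schema))
# {'ERROR': ['example_error_code', 'missing_required_field'], 'WARNING': ['example_warning_code']}
```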

Classes

ValidationResult

Result of GTFS validation.

Attributes:

| Attribute | Type | Description |
| --- | --- | --- |
| is_valid | bool | True if no errors |
| error_count | int | Number of errors |
| warning_count | int | Number of warnings |
| info_count | int | Number of info notices |
| validation_time_seconds | float | Validation time in seconds |
| notices | list[Notice] | All validation notices |

Methods:

# Get notices by severity
errors = result.errors()       # list[Notice]
warnings = result.warnings()   # list[Notice]
infos = result.infos()         # list[Notice]

# Filter by notice code
notices = result.by_code("missing_required_field")  # list[Notice]

# Export
result.save_json("/path/to/report.json")
result.save_html("/path/to/report.html")
json_str = result.to_json()   # str
report = result.to_dict()     # dict
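The counters on `ValidationResult` are enough to build a pass/fail gate for a CI job. A minimal sketch, assuming a hypothetical helper `exit_code`; it reads only the documented `is_valid` and `warning_count` attributes, and a `SimpleNamespace` stands in for a real result so the example runs without a feed.

```python
from types import SimpleNamespace


def exit_code(result, max_warnings=None):
    """Return 0 if the feed passes the gate, 1 otherwise."""
    if not result.is_valid:
        return 1
    # Optionally fail on too many warnings even when there are no errors.
    if max_warnings is not None and result.warning_count > max_warnings:
        return 1
    return 0


# Stand-in carrying the documented attribute names.
fake = SimpleNamespace(is_valid=True, error_count=0, warning_count=3)
print(exit_code(fake))                  # 0
print(exit_code(fake, max_warnings=2))  # 1
```

In a script, `sys.exit(exit_code(gtfs_guru.validate(path)))` would pass the outcome on to the CI runner.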

Notice

A single validation notice.

Attributes:

| Attribute | Type | Description |
| --- | --- | --- |
| code | str | Notice code (e.g., "missing_required_field") |
| severity | str | "ERROR", "WARNING", or "INFO" |
| message | str | Human-readable message |
| file | str \| None | GTFS filename |
| row | int \| None | CSV row number |
| field | str \| None | Field name |

Methods:

# Get context field
value = notice.get("fieldName")  # Any | None

# Get all context
ctx = notice.context()  # dict[str, Any]
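Since `file`, `row`, and `field` may each be `None`, code that prints notices needs to handle the missing parts. The sketch below shows one way to do that; `format_notice` and its output layout are hypothetical, and a `SimpleNamespace` stands in for a real `Notice`.

```python
from types import SimpleNamespace


def format_notice(notice):
    """Render a notice as 'SEVERITY file:row [field] code: message'."""
    # Fall back to a placeholder when the notice is not tied to a file.
    location = notice.file or "<feed>"
    if notice.row is not None:
        location += f":{notice.row}"
    field = f" [{notice.field}]" if notice.field else ""
    return f"{notice.severity} {location}{field} {notice.code}: {notice.message}"


demo = SimpleNamespace(
    code="missing_required_field", severity="ERROR",
    message="A required field is missing", file="stops.txt", row=12, field=None,
)
print(format_notice(demo))
# ERROR stops.txt:12 missing_required_field: A required field is missing
```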

Examples

Basic Validation

import gtfs_guru

result = gtfs_guru.validate("/path/to/gtfs.zip")

if result.is_valid:
    print("Feed is valid!")
else:
    print(f"Found {result.error_count} errors")
    for error in result.errors():
        print(f"  - {error.code}: {error.message}")

Detailed Analysis

from collections import Counter

import gtfs_guru

result = gtfs_guru.validate("/path/to/gtfs.zip")

# Count notices by code
error_counts = Counter(e.code for e in result.errors())
for code, count in error_counts.most_common(10):
    print(f"{code}: {count}")

# Find all missing required fields
for notice in result.by_code("missing_required_field"):
    file = notice.file
    field = notice.get("fieldName")
    row = notice.row
    print(f"{file}:{row} - missing {field}")
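The same counting approach extends to pairs of file and code, which shows where a given problem is concentrated. A minimal sketch, assuming a hypothetical helper `hotspots`; the sample objects carry only the documented `file` and `code` attributes, and in real use you would pass `result.notices`.

```python
from collections import Counter
from types import SimpleNamespace


def hotspots(notices, top=5):
    """Return the most common (file, code) pairs among the given notices."""
    pairs = Counter((notice.file, notice.code) for notice in notices)
    return pairs.most_common(top)


# Stand-ins for Notice objects, for demonstration only.
sample = [
    SimpleNamespace(file="stops.txt", code="missing_required_field"),
    SimpleNamespace(file="stops.txt", code="missing_required_field"),
    SimpleNamespace(file="trips.txt", code="missing_required_field"),
]
for (file, code), count in hotspots(sample):
    print(f"{file} {code}: {count}")
```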

Save Reports

result = gtfs_guru.validate("/path/to/gtfs.zip")

# Save JSON report (same format as Java validator)
result.save_json("report.json")

# Save HTML report
result.save_html("report.html")

# Get as Python dict
report = result.to_dict()
summary = report["summary"]
print(f"Agencies: {summary.get('agencies', [])}")
print(f"Routes: {summary.get('routes', {}).get('count', 0)}")

Validation with Options

# Validate for specific country (affects some rules)
result = gtfs_guru.validate(
    "/path/to/gtfs.zip",
    country_code="DE"
)

# Validate as of specific date
result = gtfs_guru.validate(
    "/path/to/gtfs.zip",
    date="2025-06-01"
)

Supported Platforms

| Platform | Architecture | Python |
| --- | --- | --- |
| macOS | ARM64 (M1/M2) | 3.8+ |
| macOS | x86_64 (Intel) | 3.8+ |
| Windows | x86_64 | 3.8+ |
| Linux | x86_64 | 3.8+ |

Performance

Typical validation times (compared to Java validator):

| Feed Size | Java | Rust/Python |
| --- | --- | --- |
| Small (<1MB) | ~2s | ~0.05s |
| Medium (10MB) | ~10s | ~0.5s |
| Large (100MB) | ~60s | ~3s |
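These figures will vary with hardware and feed contents, so it may be worth measuring on your own feeds. The sketch below times an arbitrary call with a monotonic clock; `timed` is a hypothetical helper, demonstrated here on `sum` so it runs without a feed.

```python
import time


def timed(fn, *args, **kwargs):
    """Call fn and return (result, elapsed_seconds) using a monotonic clock."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


total, elapsed = timed(sum, range(1_000_000))
print(f"sum={total}, took {elapsed:.4f}s")
```

With the package installed, `result, wall = timed(gtfs_guru.validate, "/path/to/gtfs.zip")` gives a wall-clock figure that can be compared with `result.validation_time_seconds`.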

License

Apache-2.0


