
Fast GTFS validator written in Rust


GTFS Validator Python

High-performance GTFS feed validator with Python bindings. Written in Rust, exposed via PyO3.

Installation

pip install gtfs-validator

From Source

pip install maturin
maturin build --release
pip install target/wheels/gtfs_validator-*.whl

Quick Start

import gtfs_validator

# Validate a GTFS feed
result = gtfs_validator.validate("/path/to/gtfs.zip")

print(f"Valid: {result.is_valid}")
print(f"Errors: {result.error_count}")
print(f"Warnings: {result.warning_count}")

# Print errors
for error in result.errors():
    print(f"{error.code}: {error.message}")

API Reference

Functions

validate(path, country_code=None, date=None) -> ValidationResult

Validate a GTFS feed.

Parameters:

  • path (str): Path to GTFS zip file or directory
  • country_code (str, optional): ISO 3166-1 alpha-2 country code (e.g., "US", "RU")
  • date (str, optional): Validation date in YYYY-MM-DD format

Returns: ValidationResult object

Example:

result = gtfs_validator.validate(
    "/path/to/gtfs.zip",
    country_code="US",
    date="2025-01-15"
)
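When the validation date comes from a datetime.date rather than a string literal, it has to be rendered in the YYYY-MM-DD form listed in the parameters. The sketch below shows one way to do that. build_options is a hypothetical helper, not part of gtfs_validator, and upper-casing the country code is an assumption based on the "US" and "RU" examples.

```python
from datetime import date
from typing import Dict, Optional


def build_options(country_code: Optional[str] = None,
                  on_date: Optional[date] = None) -> Dict[str, str]:
    """Build keyword arguments for validate(); hypothetical helper."""
    options: Dict[str, str] = {}
    if country_code is not None:
        # Assumption: codes are expected in upper case, as in "US" or "RU".
        options["country_code"] = country_code.strip().upper()
    if on_date is not None:
        # date.isoformat() produces YYYY-MM-DD for a plain date object.
        options["date"] = on_date.isoformat()
    return options
```

The returned dict could then be unpacked into the call, for example gtfs_validator.validate("/path/to/gtfs.zip", **build_options("us", date(2025, 1, 15))).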

async validate_async(path, country_code=None, date=None, on_progress=None) -> ValidationResult

Validate a GTFS feed asynchronously (non-blocking).

Parameters:

  • path (str): Path to GTFS zip file or directory
  • country_code (str, optional): ISO 3166-1 alpha-2 country code (e.g., "US")
  • date (str, optional): Validation date in YYYY-MM-DD format
  • on_progress (Callable[[ProgressInfo], None], optional): Callback for progress updates

Returns: ValidationResult object

Example:

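import asyncio

import gtfs_validator

async def main():
    # Called with a ProgressInfo object as validation proceeds
    def on_progress(info):
        print(f"{info.stage}: {info.current}/{info.total}")

    result = await gtfs_validator.validate_async(
        "/path/to/gtfs.zip",
        on_progress=on_progress
    )
    print(f"Valid: {result.is_valid}")

asyncio.run(main())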
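Because validate_async is awaitable, several feeds can in principle be validated concurrently. The sketch below is one possible pattern using asyncio.gather with a semaphore to cap the number of validations in flight. validate_many and max_concurrent are hypothetical names, and the validation coroutine is passed in as an argument so the helper does not depend on how gtfs_validator schedules its work internally.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, Iterable


async def validate_many(
    validate_async: Callable[[str], Awaitable[Any]],
    paths: Iterable[str],
    max_concurrent: int = 4,
) -> Dict[str, Any]:
    """Validate several feeds concurrently; hypothetical helper."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def run_one(path: str) -> Any:
        # The semaphore limits how many validations run at the same time.
        async with semaphore:
            return await validate_async(path)

    path_list = list(paths)
    # gather() returns results in the same order as its arguments.
    results = await asyncio.gather(*(run_one(p) for p in path_list))
    return dict(zip(path_list, results))
```

With the real package this might be called as asyncio.run(validate_many(gtfs_validator.validate_async, ["a.zip", "b.zip"])).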

version() -> str

Get validator version.

>>> gtfs_validator.version()
'0.1.0'

notice_codes() -> list[str]

Get list of all available notice codes.

>>> len(gtfs_validator.notice_codes())
164
>>> gtfs_validator.notice_codes()[:3]
['attribution_without_role', 'bidirectional_exit_gate', 'block_trips_with_overlapping_stop_times']

notice_schema() -> dict

Get schema for all notice types with descriptions and severity levels.

>>> schema = gtfs_validator.notice_schema()
>>> schema['missing_required_field']
{'severity': 'ERROR', 'description': '...'}
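The schema can be used to see which notice codes belong to each severity level. The sketch below assumes the schema is a mapping from code to a dict with a 'severity' key, as in the sample output above. codes_by_severity is a hypothetical helper, not part of the package.

```python
from collections import defaultdict
from typing import Dict, List, Mapping


def codes_by_severity(schema: Mapping[str, Mapping[str, str]]) -> Dict[str, List[str]]:
    """Group notice codes by their severity; hypothetical helper."""
    grouped = defaultdict(list)
    for code, entry in schema.items():
        # Assumption: each entry carries a "severity" key, as in the sample.
        grouped[entry.get("severity", "UNKNOWN")].append(code)
    # Sort the codes so the output is stable between runs.
    return {severity: sorted(codes) for severity, codes in grouped.items()}
```

For example, codes_by_severity(gtfs_validator.notice_schema()).get("ERROR", []) would list the codes that the schema marks as errors.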

Classes

ValidationResult

Result of GTFS validation.

Attributes:

Attribute                  Type           Description
is_valid                   bool           True if no errors
error_count                int            Number of errors
warning_count              int            Number of warnings
info_count                 int            Number of info notices
validation_time_seconds    float          Validation time in seconds
notices                    list[Notice]   All validation notices

Methods:

# Get notices by severity
errors = result.errors()       # list[Notice]
warnings = result.warnings()   # list[Notice]
infos = result.infos()         # list[Notice]

# Filter by notice code
notices = result.by_code("missing_required_field")  # list[Notice]

# Export
result.save_json("/path/to/report.json")
result.save_html("/path/to/report.html")
json_str = result.to_json()   # str
report = result.to_dict()     # dict
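The attributes listed above are enough to build a one-line summary for logs. A minimal sketch follows. summarize is a hypothetical helper that only reads the documented attributes, so it works with any object that exposes them.

```python
def summarize(result) -> str:
    """Return a one-line summary of a ValidationResult-like object."""
    status = "valid" if result.is_valid else "invalid"
    return (
        f"{status}: {result.error_count} errors, "
        f"{result.warning_count} warnings, "
        f"{result.info_count} infos "
        f"in {result.validation_time_seconds:.2f}s"
    )
```

A typical call would be print(summarize(gtfs_validator.validate("/path/to/gtfs.zip"))).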

Notice

A single validation notice.

Attributes:

Attribute   Type          Description
code        str           Notice code (e.g., "missing_required_field")
severity    str           "ERROR", "WARNING", or "INFO"
message     str           Human-readable message
file        str | None    GTFS filename
row         int | None    CSV row number
field       str | None    Field name

Methods:

# Get context field
value = notice.get("fieldName")  # Any | None

# Get all context
ctx = notice.context()  # dict[str, Any]
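Since file, row, and field may each be None, code that prints notices needs to handle missing values. The sketch below shows one way to do so. format_notice and the "<feed>" placeholder are illustrative choices, not part of the package.

```python
def format_notice(notice) -> str:
    """Render a Notice-like object on one line, tolerating None fields."""
    # Fall back to a placeholder when the notice is not tied to a file.
    location = notice.file or "<feed>"
    if notice.row is not None:
        location += f":{notice.row}"
    field = f" [{notice.field}]" if notice.field else ""
    return f"{notice.severity} {location}{field} {notice.code}: {notice.message}"
```

This could be applied to every notice with for notice in result.notices: print(format_notice(notice)).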

Examples

Basic Validation

import gtfs_validator

result = gtfs_validator.validate("/path/to/gtfs.zip")

if result.is_valid:
    print("Feed is valid!")
else:
    print(f"Found {result.error_count} errors")
    for error in result.errors():
        print(f"  - {error.code}: {error.message}")
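In a CI job, the same check is often turned into a process exit code. A minimal sketch, assuming a hypothetical exit_code helper and an arbitrary convention of 1 for errors and 2 for warnings when fail_on_warnings is set:

```python
def exit_code(result, fail_on_warnings: bool = False) -> int:
    """Map a ValidationResult-like object to a process exit code."""
    if not result.is_valid:
        return 1  # at least one error was reported
    if fail_on_warnings and result.warning_count > 0:
        return 2  # no errors, but warnings are treated as failures
    return 0
```

A script could then end with sys.exit(exit_code(result)) after importing sys.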

Detailed Analysis

from collections import Counter

import gtfs_validator

result = gtfs_validator.validate("/path/to/gtfs.zip")

# Count notices by code
error_counts = Counter(e.code for e in result.errors())
for code, count in error_counts.most_common(10):
    print(f"{code}: {count}")

# Find all missing required fields
for notice in result.by_code("missing_required_field"):
    file = notice.file
    field = notice.get("fieldName")
    row = notice.row
    print(f"{file}:{row} - missing {field}")
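The same Counter approach extends to a per-file breakdown, which can help locate the files that need the most attention. The sketch below uses only the documented file and code attributes. count_codes_per_file and the "<no file>" label are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable


def count_codes_per_file(notices: Iterable) -> Dict[str, Counter]:
    """Count notice codes separately for each GTFS file."""
    per_file = defaultdict(Counter)
    for notice in notices:
        # Notices without a file are collected under a placeholder key.
        per_file[notice.file or "<no file>"][notice.code] += 1
    return dict(per_file)
```

For example, count_codes_per_file(result.errors()) would map each filename to a Counter of its error codes.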

Save Reports

import gtfs_validator

result = gtfs_validator.validate("/path/to/gtfs.zip")

# Save JSON report (same format as Java validator)
result.save_json("report.json")

# Save HTML report
result.save_html("report.html")

# Get as Python dict
report = result.to_dict()
summary = report["summary"]
print(f"Agencies: {summary.get('agencies', [])}")
print(f"Routes: {summary.get('routes', {}).get('count', 0)}")
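For spreadsheet use, notices can also be written out as CSV with the standard library. This is a sketch rather than a built-in export. write_notices_csv and the column order are assumptions, and only the documented Notice attributes are read.

```python
import csv
from typing import Iterable, TextIO

# Documented Notice attributes, in an arbitrary column order.
COLUMNS = ("code", "severity", "file", "row", "field", "message")


def write_notices_csv(notices: Iterable, out: TextIO) -> int:
    """Write notices to an open text file as CSV; returns the row count."""
    writer = csv.writer(out)
    writer.writerow(COLUMNS)
    count = 0
    for notice in notices:
        # csv.writer writes None values as empty strings.
        writer.writerow([getattr(notice, column) for column in COLUMNS])
        count += 1
    return count
```

When writing to disk, the csv documentation recommends opening the file with newline="", as in open("notices.csv", "w", newline="").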

Validation with Options

import gtfs_validator

# Validate for specific country (affects some rules)
result = gtfs_validator.validate(
    "/path/to/gtfs.zip",
    country_code="DE"
)

# Validate as of specific date
result = gtfs_validator.validate(
    "/path/to/gtfs.zip",
    date="2025-06-01"
)
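One possible use of the date option is to validate the same feed as of two dates and compare the outcomes. The source does not say which rules depend on the date, so the sketch below is purely illustrative. new_error_codes is a hypothetical helper that relies only on the documented errors() method and code attribute.

```python
from typing import List


def new_error_codes(before, after) -> List[str]:
    """Return error codes present in `after` but not in `before`, sorted."""
    before_codes = {notice.code for notice in before.errors()}
    after_codes = {notice.code for notice in after.errors()}
    return sorted(after_codes - before_codes)
```

Here before and after would be the results of two validate() calls with different date arguments.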

Supported Platforms

Platform   Architecture      Python
macOS      ARM64 (M1/M2)     3.8+
macOS      x86_64 (Intel)    3.8+
Windows    x86_64            3.8+
Linux      x86_64            3.8+

Performance

Typical validation times (compared to Java validator):

Feed Size        Java    Rust/Python
Small (<1MB)     ~2s     ~0.05s
Medium (10MB)    ~10s    ~0.5s
Large (100MB)    ~60s    ~3s
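Timings like these depend on the feed and the machine, so it can be useful to measure a specific feed locally. The sketch below is a small timing harness. time_validation is a hypothetical helper, and the validation function is passed in as an argument so any callable can be measured.

```python
import time
from statistics import median
from typing import Any, Callable


def time_validation(validate: Callable[[str], Any], path: str,
                    repeats: int = 3) -> float:
    """Return the median wall-clock time in seconds over several runs."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        validate(path)
        durations.append(time.perf_counter() - start)
    # The median is less sensitive to a single slow run than the mean.
    return median(durations)
```

For example, time_validation(gtfs_validator.validate, "/path/to/gtfs.zip") could be compared against the validation_time_seconds attribute of the result.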

License

Apache-2.0

