Fast GTFS validator written in Rust

Project description

GTFS Validator Python

High-performance GTFS feed validator with Python bindings. Written in Rust, exposed via PyO3.

Installation

pip install gtfs-validator

From Source

pip install maturin
maturin build --release
pip install target/wheels/gtfs_validator-*.whl

Quick Start

import gtfs_validator

# Validate a GTFS feed
result = gtfs_validator.validate("/path/to/gtfs.zip")

print(f"Valid: {result.is_valid}")
print(f"Errors: {result.error_count}")
print(f"Warnings: {result.warning_count}")

# Print errors
for error in result.errors():
    print(f"{error.code}: {error.message}")

API Reference

Functions

validate(path, country_code=None, date=None) -> ValidationResult

Validate a GTFS feed.

Parameters:

  • path (str): Path to GTFS zip file or directory
  • country_code (str, optional): ISO country code (e.g., "US", "RU")
  • date (str, optional): Validation date in YYYY-MM-DD format

Returns: ValidationResult object

Example:

result = gtfs_validator.validate(
    "/path/to/gtfs.zip",
    country_code="US",
    date="2025-01-15"
)
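The date parameter is documented as a YYYY-MM-DD string. If your code holds dates as datetime.date objects or as loosely formatted strings, a small normalizer can help. This is a minimal sketch; to_validation_date is a hypothetical helper, not part of gtfs_validator.

```python
from datetime import date, datetime
from typing import Union


def to_validation_date(value: Union[date, str]) -> str:
    """Return value as a zero-padded YYYY-MM-DD string.

    Hypothetical helper, not part of gtfs_validator.
    Raises ValueError if a string is not a valid year-month-day date.
    """
    # datetime is a subclass of date, so check it first and drop the time part.
    if isinstance(value, datetime):
        return value.date().isoformat()
    if isinstance(value, date):
        return value.isoformat()
    # strptime checks the calendar date; isoformat() re-emits it zero-padded.
    return datetime.strptime(value, "%Y-%m-%d").date().isoformat()
```

The returned string can then be passed as date=to_validation_date(...) in the call shown above.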

async validate_async(path, country_code=None, date=None, on_progress=None) -> ValidationResult

Validate a GTFS feed asynchronously (non-blocking).

Parameters:

  • path (str): Path to GTFS zip file or directory
  • country_code (str, optional): ISO country code
  • date (str, optional): Validation date in YYYY-MM-DD format
  • on_progress (Callable[[ProgressInfo], None], optional): Callback for progress updates

Example:

import asyncio

import gtfs_validator

async def main():
    def on_progress(info):
        print(f"{info.stage}: {info.current}/{info.total}")

    result = await gtfs_validator.validate_async(
        "/path/to/gtfs.zip",
        on_progress=on_progress
    )
    # Use the result once validation completes
    print(f"Valid: {result.is_valid}")

asyncio.run(main())
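The on_progress callback can also render a percentage. The sketch below assumes only the stage, current, and total attributes used in the example above. Whether total can be zero or missing is not documented, so the helper guards against it. format_progress is a hypothetical name.

```python
def format_progress(info) -> str:
    """Render a progress update as 'stage: current/total (pct%)'.

    Assumes info exposes stage, current, and total attributes.
    """
    if info.total:
        pct = 100 * info.current / info.total
        return f"{info.stage}: {info.current}/{info.total} ({pct:.0f}%)"
    # Avoid dividing by zero when the total is zero or unknown.
    return f"{info.stage}: {info.current}/?"
```

It could be wired in as on_progress=lambda info: print(format_progress(info)).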

version() -> str

Return the validator version string.

>>> gtfs_validator.version()
'0.1.0'

notice_codes() -> list[str]

Return a list of all available notice codes.

>>> len(gtfs_validator.notice_codes())
164
>>> gtfs_validator.notice_codes()[:3]
['attribution_without_role', 'bidirectional_exit_gate', 'block_trips_with_overlapping_stop_times']
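One use for this list is catching typos before filtering results by code. A minimal sketch, where unknown_codes is a hypothetical helper that takes the codes you intend to use and the list returned by notice_codes():

```python
def unknown_codes(wanted, available):
    """Return the codes in wanted that do not appear in available, in order."""
    known = set(available)
    return [code for code in wanted if code not in known]
```

For example, unknown_codes(["mising_required_field"], gtfs_validator.notice_codes()) would flag the misspelled code, assuming it is not a real notice code.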

notice_schema() -> dict

Return a schema covering every notice type, keyed by notice code, with a description and severity level for each.

>>> schema = gtfs_validator.notice_schema()
>>> schema['missing_required_field']
{'severity': 'ERROR', 'description': '...'}
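The schema can be regrouped by severity, for instance to list every code that counts as an error. This sketch assumes each entry is a dict with a 'severity' key, as in the output above. codes_by_severity is a hypothetical helper.

```python
from collections import defaultdict


def codes_by_severity(schema):
    """Map each severity level to a sorted list of notice codes.

    Assumes schema maps code -> {'severity': ..., 'description': ...}.
    """
    grouped = defaultdict(list)
    for code, entry in schema.items():
        grouped[entry["severity"]].append(code)
    return {severity: sorted(codes) for severity, codes in grouped.items()}
```

Usage would look like codes_by_severity(gtfs_validator.notice_schema())["ERROR"].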

Classes

ValidationResult

Result of GTFS validation.

Attributes:

Attribute                 Type          Description
is_valid                  bool          True if no errors
error_count               int           Number of errors
warning_count             int           Number of warnings
info_count                int           Number of info notices
validation_time_seconds   float         Validation time in seconds
notices                   list[Notice]  All validation notices

Methods:

# Get notices by severity
errors = result.errors()       # list[Notice]
warnings = result.warnings()   # list[Notice]
infos = result.infos()         # list[Notice]

# Filter by notice code
notices = result.by_code("missing_required_field")  # list[Notice]

# Export
result.save_json("/path/to/report.json")
result.save_html("/path/to/report.html")
json_str = result.to_json()   # str
report = result.to_dict()     # dict
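In a CI job, these attributes can be turned into a process exit code. The following is a sketch under the assumption that is_valid and warning_count behave as described in the table above. exit_code and its fail_on_warnings flag are hypothetical, not part of the library.

```python
def exit_code(result, fail_on_warnings: bool = False) -> int:
    """Return 0 if the feed passes, 1 otherwise.

    Errors always fail; warnings fail only when fail_on_warnings is True.
    """
    if not result.is_valid:
        return 1
    if fail_on_warnings and result.warning_count > 0:
        return 1
    return 0
```

A script could end with sys.exit(exit_code(result, fail_on_warnings=True)) after saving a report.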

Notice

A single validation notice.

Attributes:

Attribute   Type         Description
code        str          Notice code (e.g., "missing_required_field")
severity    str          "ERROR", "WARNING", or "INFO"
message     str          Human-readable message
file        str | None   GTFS filename
row         int | None   CSV row number
field       str | None   Field name

Methods:

# Get context field
value = notice.get("fieldName")  # Any | None

# Get all context
ctx = notice.context()  # dict[str, Any]
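Because file, row, and field may be None, code that prints notices needs to handle missing values. A minimal sketch of a one-line formatter follows. format_notice and the "<feed>" placeholder are illustrative choices, not library behavior.

```python
def format_notice(notice) -> str:
    """Format a notice as '[SEVERITY] file:row code: message (field: name)'.

    Parts whose attribute is None are left out.
    """
    # Fall back to a placeholder for notices not tied to one file.
    location = notice.file or "<feed>"
    if notice.row is not None:
        location += f":{notice.row}"
    text = f"[{notice.severity}] {location} {notice.code}: {notice.message}"
    if notice.field is not None:
        text += f" (field: {notice.field})"
    return text
```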

Examples

Basic Validation

import gtfs_validator

result = gtfs_validator.validate("/path/to/gtfs.zip")

if result.is_valid:
    print("Feed is valid!")
else:
    print(f"Found {result.error_count} errors")
    for error in result.errors():
        print(f"  - {error.code}: {error.message}")

Detailed Analysis

from collections import Counter

result = gtfs_validator.validate("/path/to/gtfs.zip")

# Count notices by code
error_counts = Counter(e.code for e in result.errors())
for code, count in error_counts.most_common(10):
    print(f"{code}: {count}")

# Find all missing required fields
for notice in result.by_code("missing_required_field"):
    file = notice.file
    field = notice.get("fieldName")
    row = notice.row
    print(f"{file}:{row} - missing {field}")
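The same counting approach works per file, which shows where problems cluster. This sketch relies only on the file attribute of Notice. notices_per_file and the "<no file>" label are hypothetical.

```python
from collections import Counter


def notices_per_file(notices):
    """Count notices by GTFS filename, grouping file=None under '<no file>'."""
    return Counter(notice.file or "<no file>" for notice in notices)
```

Calling notices_per_file(result.errors()).most_common(5) would list the five files with the most errors.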

Save Reports

result = gtfs_validator.validate("/path/to/gtfs.zip")

# Save JSON report (same format as Java validator)
result.save_json("report.json")

# Save HTML report
result.save_html("report.html")

# Get as Python dict
report = result.to_dict()
summary = report["summary"]
print(f"Agencies: {summary.get('agencies', [])}")
print(f"Routes: {summary.get('routes', {}).get('count', 0)}")

Validation with Options

# Validate for specific country (affects some rules)
result = gtfs_validator.validate(
    "/path/to/gtfs.zip",
    country_code="DE"
)

# Validate as of specific date
result = gtfs_validator.validate(
    "/path/to/gtfs.zip",
    date="2025-06-01"
)

Supported Platforms

Platform   Architecture     Python
macOS      ARM64 (M1/M2)    3.8+
macOS      x86_64 (Intel)   3.8+
Windows    x86_64           3.8+
Linux      x86_64           3.8+

Performance

Typical validation times (compared to Java validator):

Feed Size        Java    Rust/Python
Small (<1MB)     ~2s     ~0.05s
Medium (10MB)    ~10s    ~0.5s
Large (100MB)    ~60s    ~3s
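Timings vary with hardware and feed contents, so it may be worth measuring on your own feeds. The sketch below times any callable that takes a path. time_validation is a hypothetical helper, and the median over a few runs is just one reasonable choice of statistic.

```python
import time
from statistics import median


def time_validation(validate_fn, path, repeats=3):
    """Return the median wall-clock seconds of validate_fn(path) over repeats runs."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        validate_fn(path)
        timings.append(time.perf_counter() - start)
    return median(timings)
```

For this package the call would be time_validation(gtfs_validator.validate, "/path/to/gtfs.zip"). The validation_time_seconds attribute on the result offers the validator's own measurement.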

License

Apache-2.0

Download files

Source Distribution

gtfs_guru-0.1.0.tar.gz (1.2 MB)

Uploaded Source

Built Distribution

gtfs_guru-0.1.0-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.0 MB)

Uploaded: CPython 3.8+, manylinux (glibc 2.17+), x86-64

File details

Details for the file gtfs_guru-0.1.0.tar.gz.

File metadata

  • Download URL: gtfs_guru-0.1.0.tar.gz
  • Size: 1.2 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: maturin/1.10.2

File hashes

Hashes for gtfs_guru-0.1.0.tar.gz
Algorithm Hash digest
SHA256 c7e04a3af5411f00cdffccdccebde2b6d95b957587648270a6f76dfc0698283c
MD5 9469caee6d2938ead37866beaae20692
BLAKE2b-256 ce87a8fcd4938ef8bdced3a84b709cf954b4981d2559f054b2613b87c47987e3

File details

Details for the file gtfs_guru-0.1.0-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for gtfs_guru-0.1.0-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 2f187dc7d3d994f560332103b690a76452e5a82c2dae36b60e222cdd5acade7f
MD5 62657c2ff4508459ed785f409c999d60
BLAKE2b-256 1c71bcf99d7e1dea5b683195fdaa45413a789b1017e65509006a5154f8dd268d
