A Python tool to convert TransXchange data into GTFS.

transx2gtfs

transx2gtfs is a library for converting public transport data from the TransXchange format (a data standard in the UK) into the widely used GTFS format, which can be used with various routing engines such as OpenTripPlanner.

Note!

This package is still in a beta phase, so use it at your own risk. If you find a problem, you can help solve it by raising an issue.

Features

  • Reads TransXchange XML files and converts them into a GTFS feed with all necessary information according to the General Transit Feed Specification.
  • Works with, and is tested against, different TransXchange schemas (TfL schema and TXC 2.1).
  • Combines multiple TransXchange files into a single GTFS feed if present in the same folder.
  • Finds and reads all XML files present in ZipFiles, nested ZipFiles and unpacked directories.
  • Uses multiprocessing to parallelize the conversion process.
  • Parses bank holidays (from gov.uk) that affect transit operations within the time span of the TransXchange feed and writes them to calendar_dates.txt.
  • Reads and updates stop information automatically from the NaPTAN website.
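
To illustrate what finding XML files in ZipFiles, nested ZipFiles and unpacked directories can involve, here is a minimal standard-library sketch. It is not the library's own implementation; the function names `iter_xml_files` and `iter_xml_in_zip` are hypothetical and chosen only for this example.

```python
import io
import zipfile
from pathlib import Path
from typing import Iterator, Tuple


def iter_xml_in_zip(zf: zipfile.ZipFile, prefix: str) -> Iterator[Tuple[str, bytes]]:
    """Yield (label, content) for each .xml member, descending into nested zips."""
    for name in zf.namelist():
        lower = name.lower()
        if lower.endswith(".xml"):
            yield f"{prefix}/{name}", zf.read(name)
        elif lower.endswith(".zip"):
            # Open the nested archive from memory instead of extracting it to disk.
            with zipfile.ZipFile(io.BytesIO(zf.read(name))) as nested:
                yield from iter_xml_in_zip(nested, f"{prefix}/{name}")


def iter_xml_files(path: str) -> Iterator[Tuple[str, bytes]]:
    """Yield (label, content) for XML files under a directory or inside a ZipFile."""
    root = Path(path)
    if root.is_dir():
        for entry in sorted(root.rglob("*")):
            if not entry.is_file():
                continue
            suffix = entry.suffix.lower()
            if suffix == ".xml":
                yield str(entry), entry.read_bytes()
            elif suffix == ".zip":
                with zipfile.ZipFile(entry) as zf:
                    yield from iter_xml_in_zip(zf, str(entry))
    elif zipfile.is_zipfile(str(root)):
        with zipfile.ZipFile(root) as zf:
            yield from iter_xml_in_zip(zf, str(root))
```

Calling `iter_xml_files("data/my_transxchange_files")` would then yield one `(label, content)` pair per XML file, whether the file sits directly in the directory or inside an archive.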

Why yet another converter?

There are numerous TransXChange to GTFS converters written in different programming languages. However, after testing many of them, it was hard to find a tool that would:

  1. work in general (without ad-hoc modifications)
  2. parse all important information from TransXChange according to the GTFS specification.
  3. work with different TransXChange schema versions
  4. be well maintained
  5. be easy to use in all operating systems
  6. include appropriate tests (crucial for maintenance).

Hence, this Python package was written with the aim of meeting the aforementioned requirements. It's not the fastest library out there (it is written in Python), but multiprocessing gives a bit of a boost on a computer with multiple cores.

Install

The package is available on PyPI and you can install it with:

$ pip install transx2gtfs

The library works with, and is tested against, Python versions 3.6, 3.7 and 3.8.

If you don't know how to install Python, you can take a look at, for example, these materials.

Requirements

transx2gtfs has the following dependencies (tested against the latest versions available for Python 3.6, 3.7 and 3.8):

  • untangle
  • pandas
  • pyproj

Basic usage

After you have installed the library, you can use it in the following manner:

>>> import transx2gtfs
>>> data_dir_for_transxchange_files = "data/my_transxchange_files"
>>> output_path = "data/my_converted_gtfs.zip"
>>> transx2gtfs.convert(data_dir_for_transxchange_files, output_path)

There are a few parameters that you can adjust:

input_filepath : str
    File path to a data directory or a ZipFile containing one or multiple TransXchange .xml files.
    Nested ZipFiles are also supported (i.e. a ZipFile with ZipFile(s) containing .xml files).

output_filepath : str
    Full filepath to the output GTFS zip-file, e.g. '/home/myuser/data/my_gtfs.zip'

append_to_existing : bool (default is False)
    Flag for appending to existing gtfs-database. This might be useful if you have
    TransXchange .xml files distributed into multiple directories (e.g. separate files for
    train data, tube data and bus data) and you want to merge all those datasets into a single
    GTFS feed.

worker_cnt : int
    Number of workers to distribute the conversion process. By default the number of CPUs is used.

file_size_limit : int
    File size limit (in megabytes). Can be used to skip larger-than-memory XML files (which should not normally occur).
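
As an illustration of the `append_to_existing` use case described above, the following sketch merges several input locations into one feed. The helper `convert_many` is hypothetical, and it assumes that the first call should create the feed while later calls append to it. The converter is passed in as an argument so that the sketch can be tried without the library installed.

```python
from typing import Callable, Sequence


def convert_many(
    input_paths: Sequence[str],
    output_path: str,
    convert: Callable[..., object],
) -> None:
    """Convert several TransXchange sources into a single GTFS feed.

    `convert` is expected to accept the parameters documented above:
    (input_filepath, output_filepath, append_to_existing=...).
    """
    for index, input_path in enumerate(input_paths):
        # Assumption: the first source starts a fresh feed and every
        # later source is appended to the one already written.
        convert(input_path, output_path, append_to_existing=index > 0)
```

With the library installed, a call might look like `convert_many(["data/train", "data/tube", "data/bus"], "data/merged_gtfs.zip", transx2gtfs.convert)`, where the directory names are placeholders.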

Output

After you have successfully converted the TransXchange data into GTFS, you can start doing multimodal routing with your favourite routing engine such as OpenTripPlanner:

[Image: OTP example in London]
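
Before loading a converted feed into a routing engine, it can be useful to check that the zip contains the files a fixed-route GTFS feed is expected to have. The sketch below uses only the standard library. The function name `missing_gtfs_files` is hypothetical, and it assumes the .txt files sit at the root of the archive.

```python
import zipfile
from typing import List

# Files a fixed-route GTFS feed is expected to contain.
EXPECTED_FILES = ("agency.txt", "stops.txt", "routes.txt", "trips.txt", "stop_times.txt")
# Service days must be defined by at least one of these files.
CALENDAR_FILES = ("calendar.txt", "calendar_dates.txt")


def missing_gtfs_files(gtfs_zip_path: str) -> List[str]:
    """Return the expected GTFS files that are absent from the zip archive."""
    with zipfile.ZipFile(gtfs_zip_path) as zf:
        names = set(zf.namelist())
    missing = [name for name in EXPECTED_FILES if name not in names]
    if not any(name in names for name in CALENDAR_FILES):
        missing.append("calendar.txt or calendar_dates.txt")
    return missing
```

An empty list from `missing_gtfs_files("data/my_converted_gtfs.zip")` means that all the expected files are present. It says nothing about whether their contents are valid.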

Citation

If you use this tool for research purposes, we encourage you to cite this work:

Developers

  • Henrikki Tenkanen, University College London
