
A Python tool to convert TransXChange data into GTFS.


transx2gtfs


transx2gtfs is a library for converting public transport data from the TransXChange format (a data standard in the UK) into the widely used GTFS format, which can be used with various routing engines such as OpenTripPlanner.

Note!

This package is still in beta, so use it at your own risk. If you find a problem, you can help solve it by raising an issue.

Features

  • Reads TransXChange XML files and converts them into a GTFS feed with all necessary information according to the General Transit Feed Specification.
  • Works with and is tested against different TransXChange schemas (TfL schema and TXC 2.1).
  • Combines multiple TransXChange files into a single GTFS feed if they are present in the same folder.
  • Finds and reads all XML files present in ZipFiles, nested ZipFiles and unpacked directories.
  • Uses multiprocessing to parallelize the conversion process.
  • Parses bank holidays (from gov.uk) that affect transit operations during the time span of the TransXChange feed and writes them to calendar_dates.txt.
  • Reads and updates stop information automatically from the NaPTAN website.
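
To illustrate what discovering XML files in ZipFiles, nested ZipFiles and unpacked directories can involve, here is a minimal sketch that uses only the Python standard library. It is not the implementation used by transx2gtfs; the function names `iter_xml_in_zip` and `iter_xml_files` are hypothetical and exist only for this example.

```python
import io
import zipfile
from pathlib import Path


def iter_xml_in_zip(zf, prefix=""):
    """Yield (name, bytes) for every .xml member, descending into nested zips."""
    for info in zf.infolist():
        if info.is_dir():
            continue
        name = prefix + info.filename
        lower = info.filename.lower()
        if lower.endswith(".xml"):
            yield name, zf.read(info)
        elif lower.endswith(".zip"):
            # Open the nested archive from memory and recurse into it.
            with zipfile.ZipFile(io.BytesIO(zf.read(info))) as nested:
                yield from iter_xml_in_zip(nested, prefix=name + "!/")


def iter_xml_files(path):
    """Yield (name, bytes) for XML files in a directory tree or a zip archive."""
    path = Path(path)
    if path.is_dir():
        for p in sorted(path.rglob("*")):
            if not p.is_file():
                continue
            if p.suffix.lower() == ".xml":
                yield str(p), p.read_bytes()
            elif p.suffix.lower() == ".zip":
                with zipfile.ZipFile(p) as zf:
                    yield from iter_xml_in_zip(zf, prefix=str(p) + "!/")
    elif zipfile.is_zipfile(path):
        with zipfile.ZipFile(path) as zf:
            yield from iter_xml_in_zip(zf, prefix=str(path) + "!/")
```

Nested archives are read into memory here, which keeps the sketch short but would need rethinking for very large inputs.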

Why yet another converter?

There are numerous TransXChange to GTFS converters written in different programming languages. However, after testing many of them, it was hard to find a tool that would:

  1. work in general (without ad hoc modifications),
  2. parse all important information from TransXChange according to the GTFS specification,
  3. work with different TransXChange schema versions,
  4. be well maintained,
  5. be easy to use on all operating systems, and
  6. include appropriate tests (crucial for maintenance).

Hence, this Python package was written to meet the requirements above. Being written in Python, it is not the fastest library available, but multiprocessing speeds up the conversion on machines with multiple cores.

Install

The package is available on PyPI and you can install it with:

$ pip install transx2gtfs

The library works with and is tested against Python versions 3.6, 3.7 and 3.8.

If you don't know how to install Python, you can take a look at, for example, these materials.

Requirements

transx2gtfs has the following dependencies (tested against the latest versions available for Python 3.6, 3.7 and 3.8):

  • untangle
  • pandas
  • pyproj

Basic usage

After you have installed the library, you can use it in the following manner:

>>> import transx2gtfs
>>> data_dir_for_transxchange_files = "data/my_transxchange_files"
>>> output_path = "data/my_converted_gtfs.zip"
>>> transx2gtfs.convert(data_dir_for_transxchange_files, output_path)

There are a few parameters that you can adjust:

input_filepath : str
    File path to a data directory or a ZipFile containing one or more TransXChange .xml files.
    Nested ZipFiles are also supported (i.e. a ZipFile with ZipFile(s) containing .xml files).

output_filepath : str
    Full filepath to the output GTFS zip-file, e.g. '/home/myuser/data/my_gtfs.zip'

append_to_existing : bool (default is False)
    Flag for appending to an existing GTFS database. This might be useful if you have
    TransXChange .xml files distributed across multiple directories (e.g. separate files for
    train data, tube data and bus data) and you want to merge all those datasets into a single
    GTFS feed.

worker_cnt : int
    Number of workers used to distribute the conversion process. By default, the number of CPUs is used.

file_size_limit : int
    File size limit (in megabytes) that can be used to skip larger-than-memory XML files (such files should not normally occur).
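
As a sketch of how the parameters above might be combined, the helper below merges several input directories into one GTFS file by setting append_to_existing on every call after the first. The helper name `convert_directories` is hypothetical and not part of transx2gtfs. The conversion function is passed in as an argument, and the sketch assumes it accepts the parameters listed above as keyword arguments. It also assumes that a call with append_to_existing=False starts a new feed.

```python
def convert_directories(input_dirs, output_path, convert, **options):
    """Convert several TransXChange inputs into one GTFS zip file.

    `convert` is expected to behave like `transx2gtfs.convert`; it is passed
    in so that this helper does not depend on the library being installed.
    """
    for index, input_dir in enumerate(input_dirs):
        # Assumption: the first call starts a new feed, later calls append to it.
        convert(
            input_dir,
            output_path,
            append_to_existing=index > 0,
            **options,
        )
```

With the library installed, a call could look like `convert_directories(["data/bus", "data/tube"], "data/my_converted_gtfs.zip", transx2gtfs.convert, worker_cnt=2)`, where the directory names are placeholders.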

Output

After you have successfully converted the TransXChange data into GTFS, you can start doing multimodal routing with your favourite routing engine such as OpenTripPlanner:

[Figure: OpenTripPlanner routing example in London]
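
Before loading the feed into a routing engine, it can help to check that the output zip contains the expected tables. The following is a minimal sketch that uses only the standard library. The function name `summarize_gtfs` is hypothetical, and the sketch assumes the output is a zip of UTF-8 encoded CSV files with a header row, as is usual for GTFS.

```python
import csv
import io
import zipfile


def summarize_gtfs(zip_path):
    """Return {file name: number of data rows} for each .txt file in a GTFS zip."""
    summary = {}
    with zipfile.ZipFile(zip_path) as zf:
        for name in zf.namelist():
            if not name.endswith(".txt"):
                continue
            with zf.open(name) as raw:
                # utf-8-sig tolerates an optional byte order mark.
                text = io.TextIOWrapper(raw, encoding="utf-8-sig", newline="")
                reader = csv.reader(text)
                next(reader, None)  # skip the header row
                # Count non-empty rows only.
                summary[name] = sum(1 for row in reader if row)
    return summary
```

Calling `summarize_gtfs("data/my_converted_gtfs.zip")` returns a dictionary that maps file names such as stops.txt to their row counts, which gives a quick view of what the conversion produced.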

Citation

If you use this tool for research purposes, we encourage you to cite this work:

Developers

  • Henrikki Tenkanen, University College London
