gps2gtfs: A Python package to process raw GPS data of public transit and transform to GTFS format

  • Project Lead(s) / Mentor(s)

    1. Dr. T. Uthayasanker
  • Contributor(s)

    1. R. Shiveswarran
    2. S. Gopinath
    3. S. Kajanan
    4. A. Kesavi

Description

The "gps2gtfs" Python package preprocesses raw GPS (Global Positioning System) data and converts it into the GTFS (General Transit Feed Specification) format. It uses DataFrame and GeoDataFrame operations with parallelization to extract essential trip details from raw GPS data: trip sequences, stop information, arrival times at stops, departure times from stops, dwell times at stops, travel durations, and running times between stops. These details are then transformed into the GTFS data structure. Currently, "gps2gtfs" handles static (schedule) trip data under heterogeneous traffic conditions; support for dynamic real-time trip data is a possible future extension. A visualization package could also be integrated with the existing packages in the future.

Keywords: GTFS, GPS, Travel Time, Public Transit, Heterogeneous Traffic Condition, ITS (Intelligent Transportation System)
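
To make the stop-level quantities in the description concrete, here is a minimal sketch under the usual transit definitions: dwell time is departure minus arrival at a stop, and running time is arrival at the next stop minus departure from the previous one. The stop IDs, timestamps, and function names are made up for illustration and are not part of the gps2gtfs API; the package may compute these values differently.

```python
from datetime import datetime

# Hypothetical stop events for one trip: (stop_id, arrival, departure).
# These values are illustrative only, not gps2gtfs output.
stop_events = [
    ("S1", datetime(2023, 1, 1, 8, 0, 0), datetime(2023, 1, 1, 8, 0, 30)),
    ("S2", datetime(2023, 1, 1, 8, 5, 0), datetime(2023, 1, 1, 8, 5, 45)),
    ("S3", datetime(2023, 1, 1, 8, 12, 0), datetime(2023, 1, 1, 8, 12, 20)),
]


def dwell_times(events):
    """Dwell time at each stop: departure minus arrival."""
    return {stop: departure - arrival for stop, arrival, departure in events}


def running_times(events):
    """Running time per segment: arrival at the next stop minus
    departure from the previous stop."""
    return {
        (prev[0], nxt[0]): nxt[1] - prev[2]
        for prev, nxt in zip(events, events[1:])
    }


dwell = dwell_times(stop_events)
running = running_times(stop_events)
```

With these sample events, the dwell time at S1 is 30 seconds and the running time from S1 to S2 is 4 minutes 30 seconds.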

Architecture

The "gps2gtfs" framework is developed in Python 3, with a package structure designed to keep interdependence among the main packages minimal. The core packages each serve a specific purpose: data_field, load_data, preprocessing, trip, stop, reporting, pipeline, and utility. Each pipeline run takes input for a single route; to process multiple routes, the user iterates over them and runs the pipeline once per route.

The framework is structured into eight primary packages:

  1. data_field: This package is responsible for managing column names for user input and a predefined set of output columns. The fields provided by the user should be a superset of the defined fields within this package.
  2. load_data: This package handles the loading of necessary data into the pipeline.
  3. preprocessing: The preprocessing package is designed to clean the data loaded from the previous step.
  4. trip: The trip package focuses on extracting trips and generating associated features.
  5. stop: Within the stop package, the identification of stops and the creation of related features take place.
  6. reporting: This package is responsible for generating outputs containing the extracted information.
  7. pipeline: This package contains functionality to execute the trip extraction pipeline and the trip & stop extraction pipeline.
  8. utility: This package provides utility functions, including input/output operations, data conversions, and logging.
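
The data_field item above says that user-provided fields should be a superset of the fields the package defines. A quick pre-flight check on a CSV header might look like the sketch below. The column names in REQUIRED_FIELDS are hypothetical placeholders; the actual names are defined in the data_field package and should be taken from there.

```python
import csv
import io

# Hypothetical required column names, for illustration only.
# The real set is defined in gps2gtfs's data_field package.
REQUIRED_FIELDS = {"device_id", "latitude", "longitude", "timestamp"}


def missing_fields(csv_file, required=REQUIRED_FIELDS):
    """Return the required column names that are absent from the CSV header."""
    header = next(csv.reader(csv_file), [])
    return set(required) - {name.strip() for name in header}


# An in-memory CSV with one extra column ("speed"); extra columns are fine
# because the header only needs to be a superset of the required fields.
sample = io.StringIO(
    "device_id,latitude,longitude,timestamp,speed\n"
    "1,6.9,79.8,2023-01-01 08:00:00,12\n"
)
missing = missing_fields(sample)
```

An empty result means the header covers every required field. For a file on disk, pass an open file handle instead of the in-memory sample.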

Architecture diagram of gps2gtfs

img1

How gps2gtfs works

img2

Quick Example

Input files must be provided in CSV format. The package must also be invoked from the main thread, so the examples below call run() under an if __name__ == "__main__": guard.

1. Pipeline to extract trip details

from gps2gtfs.pipeline.trip import run


if __name__ == "__main__":
    raw_gps_data_path = "path/to/raw_gps_data/csv"
    trip_terminals_data_path = "path/to/trip_terminals_data/csv"
    terminals_buffer_radius = 100

    run(
        raw_gps_data_path,
        trip_terminals_data_path,
        terminals_buffer_radius,
    )

2. Pipeline to extract trip and stop details

from gps2gtfs.pipeline.trip_stop import run


if __name__ == "__main__":
    raw_gps_data_path = "path/to/raw_gps_data/csv"
    trip_terminals_data_path = "path/to/trip_terminals_data/csv"
    stops_data_path = "path/to/stops_data/csv"
    terminals_buffer_radius = 100
    stops_buffer_radius = 50
    stops_extended_buffer_radius = 100

    run(
        raw_gps_data_path,
        trip_terminals_data_path,
        stops_data_path,
        terminals_buffer_radius,
        stops_buffer_radius,
        stops_extended_buffer_radius,
    )
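
The architecture section notes that each run takes a single route and that multiple routes are handled through user-driven iteration. One way to organize that loop is sketched below. The helper run_for_routes and the routes mapping are hypothetical and not part of gps2gtfs; the pipeline function is passed in as an argument, and the sketch assumes it accepts the same three positional arguments as the trip pipeline's run shown above.

```python
def run_for_routes(routes, run_pipeline, terminals_buffer_radius=100):
    """Run a single-route pipeline once per route.

    routes: mapping of route name -> (raw_gps_data_path, trip_terminals_data_path)
    run_pipeline: callable taking (raw_path, terminals_path, buffer_radius),
        for example gps2gtfs.pipeline.trip.run
    Returns the route names in the order they were processed.
    """
    processed = []
    for name, (raw_path, terminals_path) in routes.items():
        run_pipeline(raw_path, terminals_path, terminals_buffer_radius)
        processed.append(name)
    return processed


# Intended usage (hypothetical paths), kept under the main guard:
#
# if __name__ == "__main__":
#     from gps2gtfs.pipeline.trip import run
#     routes = {
#         "route_a": ("path/to/route_a/raw_gps.csv", "path/to/route_a/terminals.csv"),
#         "route_b": ("path/to/route_b/raw_gps.csv", "path/to/route_b/terminals.csv"),
#     }
#     run_for_routes(routes, run)
```

Passing the pipeline function as a parameter keeps the loop independent of which pipeline is used and makes it easy to try out with a stand-in function before running on real data.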

License

Released under the MIT License.

Code of Conduct

Please read our code of conduct document.

Built Distribution

gps2gtfs-0.1.0-py3-none-any.whl (22.7 kB view details)

Uploaded Python 3

File details

Details for the file gps2gtfs-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: gps2gtfs-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 22.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.9.13

File hashes

Hashes for gps2gtfs-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 997d4c80812f57e52e74373ec5b1d190e4de53d731228fd1e2079f746b679744
MD5 9354a4977e73882df3cd0dcfdd9b088c
BLAKE2b-256 3de0b63f58e2cb8eed0b61ea4fbab922abd7816f45f894cc839f9f7df0dde64c
