
Impuls

GitHub | Documentation | Issue Tracker | PyPI

Impuls is a framework for processing static public transportation data. The internal model used is very close to GTFS.

The core entity for processing is called a pipeline, which is composed of multiple tasks that do the actual processing work.
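The task-and-pipeline idea can be sketched in plain Python. This is a conceptual illustration only; the names and signatures below are made up for the example and do not match Impuls's actual API:

```python
from typing import Callable

# Toy model: the "database" is a plain dict, and each task is a callable that
# mutates it. Impuls's real Task and Pipeline classes are considerably richer.
Task = Callable[[dict], None]

def run_pipeline(db: dict, tasks: list[Task]) -> None:
    """Execute the tasks in order against the shared database."""
    for task in tasks:
        task(db)

def load_stops(db: dict) -> None:
    db["stops"] = [{"id": "1", "name": "central stn."}]

def fix_names(db: dict) -> None:
    for stop in db["stops"]:
        stop["name"] = stop["name"].replace("stn.", "Station").title()

db: dict = {}
run_pipeline(db, [load_stops, fix_names])
print(db["stops"][0]["name"])  # Central Station
```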

The data is stored in an sqlite3 database, with a very lightweight wrapper mapping Impuls's internal model to SQL and GTFS.
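As an illustration of such a lightweight mapping, here is a sketch using the standard library's sqlite3 module and a dataclass. The Stop fields and the schema are assumptions made for this example, not Impuls's real model or schema:

```python
import sqlite3
from dataclasses import dataclass, astuple

# Hypothetical entity; Impuls's actual model classes and table layout differ.
@dataclass
class Stop:
    stop_id: str
    name: str
    lat: float
    lon: float

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE stops (stop_id TEXT PRIMARY KEY, name TEXT, lat REAL, lon REAL)")

# Dataclass -> SQL: astuple() feeds the parameterized INSERT.
db.execute("INSERT INTO stops VALUES (?, ?, ?, ?)", astuple(Stop("1", "Main St", 52.23, 21.01)))

# SQL -> dataclass: unpack the row straight back into the model type.
row = db.execute("SELECT * FROM stops WHERE stop_id = ?", ("1",)).fetchone()
print(Stop(*row))  # Stop(stop_id='1', name='Main St', lat=52.23, lon=21.01)
```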

Impuls has first-class support for pulling in data from external sources through its resource mechanism. Resources are cached before the data is processed, which saves bandwidth when some of the input data has not changed, and even allows processing to stop early if none of the resources have been modified.
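The cache-and-skip idea can be sketched as follows. This is a toy model: the refresh function and the modification-stamp file are inventions for illustration, not Impuls's actual resource mechanism:

```python
import tempfile
from pathlib import Path

# Hypothetical sketch: refetch a resource only when its recorded modification
# stamp changed. Impuls's real resources have their own types and cache layout.
def refresh(cache_dir: Path, name: str, fetch, last_modified: str) -> bool:
    """Fetch a resource into the cache if stale; return True if it was fetched."""
    meta = cache_dir / (name + ".meta")
    if meta.exists() and meta.read_text() == last_modified:
        return False  # cached copy is still current - skip the download
    (cache_dir / name).write_bytes(fetch())
    meta.write_text(last_modified)
    return True

cache = Path(tempfile.mkdtemp())
print(refresh(cache, "feed.zip", lambda: b"v1", "2024-01-01"))  # True  (first fetch)
print(refresh(cache, "feed.zip", lambda: b"v1", "2024-01-01"))  # False (unchanged, skipped)
```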

A module for dealing with versioned (multi-file) sources is also provided. It allows easy and very flexible processing of schedules provided in discrete versions into a single coherent file.

Installation and compilation

Impuls is mainly written in Python, but a performance-critical part of the library is written in Zig and bundled as a shared library. To install the library, run the following, preferably inside a virtual environment:

pip install impuls

Pre-built binaries are available for most platforms; to build from source, Zig needs to be installed.

The LoadBusManMDB task additionally requires mdbtools to be installed. This package is available in most package managers.

Examples

See https://impuls.readthedocs.io/en/stable/example.html for a tutorial and a more detailed walkthrough over Impuls features.

The examples directory contains four example configurations, each processing data from a different source into a GTFS file. If you wish to run them, consult the Development section of the readme to set up the environment correctly.

Kraków

Kraków provides decent GTFS files on https://gtfs.ztp.krakow.pl. The example pipeline removes unnecessary, confusing trip data and fixes several user-facing strings.

Run with python -m examples.krakow tram or python -m examples.krakow bus. The resulting GTFS will be created in _workspace_krakow/krakow.tram.out.zip or _workspace_krakow/krakow.bus.out.zip, respectively.

PKP IC (PKP Intercity)

PKP Intercity provides their schedules in a single CSV table at ftp://ftps.intercity.pl. Unfortunately, the source data is not openly available. One needs to email PKP Intercity through the contact provided in the Polish MMTIS NAP in order to get the credentials.

The pipeline starts by manually creating an Agency, then loads the CSV data, pulls station data from https://github.com/MKuranowski/PLRailMap, and adjusts some user-facing data, most importantly extracting trip legs operated by buses.

Run with python -m examples.pkpic FTP_USERNAME FTP_PASSWORD. The resulting GTFS will be created at _workspace_pkpic/pkpic.zip.

Radom

MZDiK Radom provides schedules in an MDB database at http://mzdik.pl/index.php?id=145. It is the first example to use the multi-file pipeline support, as the source files are published in discrete versions.

Multi-file pipelines consist of four distinct parts:

  • an intermediate provider, which figures out the relevant input ("intermediate") feeds
  • an intermediate tasks factory, which returns the tasks necessary to load an intermediate feed into the SQLite database
  • a final tasks factory, which returns the tasks to perform after merging intermediate feeds
  • any additional resources required by the intermediate or final tasks
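The four parts above can be sketched like this; every name below is illustrative, not one of Impuls's real classes or signatures:

```python
from dataclasses import dataclass

# Hypothetical stand-in for an intermediate feed description.
@dataclass
class IntermediateFeed:
    name: str
    version: str

def intermediate_provider() -> list[IntermediateFeed]:
    """Figures out the relevant input ("intermediate") feeds."""
    return [IntermediateFeed("feed_2024_01.mdb", "2024-01"),
            IntermediateFeed("feed_2024_02.mdb", "2024-02")]

def intermediate_tasks(feed: IntermediateFeed) -> list[str]:
    """Returns the tasks that load one feed into its own SQLite database."""
    return [f"load:{feed.name}", f"clean:{feed.name}"]

def final_tasks() -> list[str]:
    """Returns the tasks to perform after the intermediate databases are merged."""
    return ["curate_stops", "save_gtfs"]

def additional_resources() -> dict[str, str]:
    """Extra resources required by the intermediate or final tasks (made-up URL)."""
    return {"stop_positions.json": "https://example.com/stop_positions.json"}
```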

Caching is even more involved: not only are the input feeds kept across runs, the databases resulting from running the intermediate pipelines are preserved as well. If 3 of the 4 feeds requested by the intermediate provider have already been processed, the intermediate pipeline will run only for the single new file, but the final (merging) pipeline will still run on all 4 feeds.
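The described caching rule boils down to something like this hypothetical planner:

```python
# Toy sketch of the caching rule: intermediate pipelines run only for feeds
# without a preserved database, while the final merge always sees every
# requested feed. Not Impuls's actual implementation.
def plan(requested: list[str], already_processed: set[str]) -> tuple[list[str], list[str]]:
    to_process = [feed for feed in requested if feed not in already_processed]
    return to_process, requested  # (intermediate runs, inputs to the merge)

runs, merge_inputs = plan(["a", "b", "c", "d"], {"a", "b", "c"})
print(runs)          # ['d'] - only the new feed is processed again
print(merge_inputs)  # ['a', 'b', 'c', 'd'] - the merge still uses all four
```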

The intermediate provider for Radom scrapes the aforementioned website to find available databases.

The pipeline for processing intermediate feeds is a bit more complex: it involves loading the MDB database, cleaning up the data (removing virtual stops, generating and cleaning calendars), and pulling stop positions from http://rkm.mzdik.radom.pl/.

The final pipeline simply dumps the merged dataset into a GTFS.

Run with python -m examples.radom. The resulting GTFS will be created at _workspace_radom/radom.zip.

Warsaw

Warsaw is another city which requires multi-file pipelines. ZTM Warsaw publishes a distinct input file roughly every other day at ftp://rozklady.ztm.waw.pl. The input datasets are in a completely custom text format, requiring quite involved parsing. More details are available at https://www.ztm.waw.pl/pliki-do-pobrania/dane-rozkladowe/ (in Polish).

The intermediate provider picks out relevant files from the aforementioned FTP server.

Processing of intermediate feeds starts with the import of the text file into the database. Rather uniquely, this step also prettifies stop names: doing so in a separate task would be hard, due to the presence of indicators (two-digit codes uniquely identifying a stop around an intersection) in the name field. The pipeline continues by adding version metadata, merging railway stations into a single stops.txt entry (ZTM separates railway departures into virtual stops), and prettifying attributes (namely trip_headsign and stop_lat/stop_lon, as not all stops have positions in the input file). The last steps clean unused entities from the database.
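To illustrate why indicators make name prettifying awkward to do separately, here is a toy splitter. The "NAME NN" shape and the title-casing are simplifying assumptions; the real ZTM format and Impuls's handling are considerably more involved:

```python
from __future__ import annotations
import re

# Illustrative only: separate a hypothetical "NAME NN" stop name into the
# proper name and the two-digit indicator before prettifying the name part.
def split_indicator(raw: str) -> tuple[str, str | None]:
    m = re.fullmatch(r"(.+?)\s+(\d{2})", raw)
    if m:
        return m.group(1).title(), m.group(2)
    return raw.title(), None

print(split_indicator("DW.CENTRALNY 03"))  # ('Dw.Centralny', '03')
```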

The final pipeline simply dumps the merged dataset into a GTFS, yet again.

Additional data for stop positions and edge cases for prettifying stop names come from https://github.com/MKuranowski/WarsawGTFS/blob/master/data_curated/stop_names.json.

Run with python -m examples.warsaw. The resulting GTFS will be created at _workspace_warsaw/warsaw.zip.

License

Impuls is distributed under GNU GPL v3 (or any later version).

© Copyright 2022-2024 Mikołaj Kuranowski

Impuls is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 3 of the License, or (at your option) any later version.

Impuls is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with Impuls. If not, see http://www.gnu.org/licenses/.

Impuls source code and pre-built binaries come with sqlite3, which is placed in the public domain.

Development

Impuls uses meson-python. The project layout is quite unorthodox, as Impuls is neither a pure-Python module nor a project with a bog-standard C/C++ extension. Instead, the Zig code is compiled into a shared library which is bundled alongside the Python module.

Zig makes cross-compilation very easy, while shipping a shared library allows a single wheel to be used across multiple Python versions and implementations.

Development requires python, zig and mdbtools (usually all 3 will be available in your package manager repositories) to be installed. To set up the environment on Linux, run:

$ python -m venv --upgrade-deps .venv
$ . .venv/bin/activate
$ pip install -Ur requirements.dev.txt
$ pip install --no-build-isolation -Cbuild-dir=builddir --editable .
$ ln -s ../../builddir/libextern.so impuls/extern

On macOS, change the shared library extension to .dylib; on Windows, to .dll.

To run python tests, simply execute pytest. To run zig tests, run meson test -C builddir.

To run the examples, install their dependencies first (pip install -Ur requirements.examples.txt), then execute the example module, e.g. python -m examples.krakow.

meson-python will automatically recompile the zig library whenever an editable impuls install is imported; set the MESONPY_EDITABLE_VERBOSE environment variable to 1 to see meson logs for build details.

By default, the extern zig library will be built in debug mode. To change that, run meson configure --buildtype=debugoptimized builddir (buildtype can also be set to debug or release). To recompile the library, run meson compile -C builddir.

Unfortunately, meson-python requires all Python and Zig source files to be listed in meson.build. Python files need to be listed for packaging to work, while Zig source files need to be listed for the build backend to properly detect whether libextern needs to be recompiled.

Building wheels

Zig has been chosen for its excellent cross-compilation support. Thanks to this, building all wheels for a release does not require tools like cibuildwheel, virtual machines, or even containers. As long as Zig is installed, all wheels can be built on a single machine.

Before building wheels, install a few extra dependencies in the virtual environment: pip install -U build wheel.

To build the wheels, simply run python build_wheels.py.

See python build_wheels.py --help for all available options. To debug failed builds, run python build_wheels.py --verbose --jobs 1 FAILED_CONFIG_NAME.

See CONFIGURATION in build_wheels.py for available configurations.

To build the source distribution, run python -m build -so dist.

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

  • impuls-1.1.2.tar.gz (4.6 MB, Source)

Built Distributions

  • impuls-1.1.2-py3-none-win_arm64.whl (772.2 kB, Python 3, Windows ARM64)
  • impuls-1.1.2-py3-none-win_amd64.whl (806.6 kB, Python 3, Windows x86-64)
  • impuls-1.1.2-py3-none-musllinux_1_1_x86_64.whl (821.6 kB, Python 3, musllinux: musl 1.1+ x86-64)
  • impuls-1.1.2-py3-none-musllinux_1_1_aarch64.whl (855.1 kB, Python 3, musllinux: musl 1.1+ ARM64)
  • impuls-1.1.2-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (822.0 kB, Python 3, manylinux: glibc 2.17+ x86-64)
  • impuls-1.1.2-py3-none-manylinux2014_aarch64.manylinux_2_17_aarch64.whl (855.5 kB, Python 3, manylinux: glibc 2.17+ ARM64)
  • impuls-1.1.2-py3-none-macosx_11_0_x86_64.whl (820.9 kB, Python 3, macOS 11.0+ x86-64)
  • impuls-1.1.2-py3-none-macosx_11_0_arm64.whl (802.4 kB, Python 3, macOS 11.0+ ARM64)

File details

Details for impuls-1.1.2.tar.gz (Source, 4.6 MB, uploaded via twine/5.1.1 CPython/3.12.6, Trusted Publishing: no):

  • SHA256: 3328e0682f0d1cce506403afffaa584ba37650e89a271afd0b6016816ab468d1
  • MD5: 9150b91b566ebff9cc39880d31950c98
  • BLAKE2b-256: ecb0a6cff455ec37133468c91703db00c6643fb0d75f68f721e06793e8714c2b

Details for impuls-1.1.2-py3-none-win_arm64.whl (Python 3, Windows ARM64, 772.2 kB, uploaded via twine/5.1.1 CPython/3.12.6, Trusted Publishing: no):

  • SHA256: dd95b836cd58fc0b48e2e3bbeaeeb7b3002356b69c890e3db5f1ef35a75275b8
  • MD5: e8c13a9768a15bf7ad9f8bed1e249ce5
  • BLAKE2b-256: b41df6f49878766556aada99e83fd2b65adb46f9094b8a11a6147788d8883fb1

Details for impuls-1.1.2-py3-none-win_amd64.whl (Python 3, Windows x86-64, 806.6 kB, uploaded via twine/5.1.1 CPython/3.12.6, Trusted Publishing: no):

  • SHA256: 032e4a4e1197e5662ddb26e67657c685d4b4759aa91f3275a01c09b103ee96a5
  • MD5: 05bc024fe54fce460fa94d389a785765
  • BLAKE2b-256: 9cea6c98107f93f5322a3fe7758d1860b31dbbe826846f467750381a0e84f8d5

Details for impuls-1.1.2-py3-none-musllinux_1_1_x86_64.whl:

  • SHA256: ec81d2afed8bf2bb61dbda761b2bd86c83a3922741f73eec5742e67f7c8b4341
  • MD5: 704e25a0ac79838b4b7e8269ac004406
  • BLAKE2b-256: 079f540045580b483040a29d5d95fd4a71311e2b6e722964d600f0caff73a216

Details for impuls-1.1.2-py3-none-musllinux_1_1_aarch64.whl:

  • SHA256: e48a19415c6a2488b23410ffcfc2e37e8d789c9a41856370d66a0e11ec1b71cb
  • MD5: 80096f07c49e402dc180ffdc38261e93
  • BLAKE2b-256: 01649aebc29a012ede6558d19810b796c4cf40077dd9d7e104bdf26d57409a15

Details for impuls-1.1.2-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl:

  • SHA256: 6364d48d3a2ba6a9a847efe1a0646d95c47d2b519ea739391bd83b53eef9454c
  • MD5: cf9891721e3a0695b8cfd4cda2e6b271
  • BLAKE2b-256: 89b812545091a255541a9edffb66b438a512fe5e8c37b0d5e97bd0f89dd0c9ac

Details for impuls-1.1.2-py3-none-manylinux2014_aarch64.manylinux_2_17_aarch64.whl:

  • SHA256: 878f2056580c5ece0a2ad4555069047f53b60ed1e74e071d0140163f9bbb4ae7
  • MD5: 1ed50a421db0bed6bab0aa9123bf849d
  • BLAKE2b-256: 4fe5ed0faf6d81de814166ba09f00e2ea600bed4d10a246741fb7a6cc10bf634

Details for impuls-1.1.2-py3-none-macosx_11_0_x86_64.whl:

  • SHA256: 6a45b424e9ac02f1c1104b44d8c22d10e20d608a7aaeab64df78831c06b1456a
  • MD5: 355a95f15b9215d3eec5cd1fe2f23bd7
  • BLAKE2b-256: 18665e2cc17727fd210bda4695d76063bfa2c9e8c27ba38fd4aa2aed2526619a

Details for impuls-1.1.2-py3-none-macosx_11_0_arm64.whl:

  • SHA256: ff23dbc6340c01727b98a041854895d97379331de1ca151b6fe8d2a58d15a944
  • MD5: f60aa1c6bcb7464ce0b5d4b1c9deb8dc
  • BLAKE2b-256: b714005ddd1d8721f8120e26da39c89e0f608979b398860bdcd0447539b6df38
