=========
Partridge
=========

.. image:: https://img.shields.io/pypi/v/partridge.svg
    :target: https://pypi.python.org/pypi/partridge

.. image:: https://img.shields.io/travis/remix/partridge.svg
    :target: https://travis-ci.org/remix/partridge

Partridge is a Python library for working with
`GTFS <https://developers.google.com/transit/gtfs/>`__ feeds using
`pandas <https://pandas.pydata.org/>`__ DataFrames.
The implementation of Partridge is heavily influenced by our experience
at `Remix <https://www.remix.com/>`__ ingesting, analyzing, and
debugging thousands of GTFS feeds from hundreds of agencies.
At the core of Partridge is a dependency graph rooted at ``trips.txt``.
When reading the contents of a feed, disconnected data is pruned away
according to this graph. The root node can optionally be filtered to
create a view of the feed specific to your needs. It's most common to
filter a feed down to specific dates (``service_id``), routes
(``route_id``), or both.

.. figure:: dependency-graph.png
    :alt: dependency graph

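
The pruning idea can be illustrated with plain pandas. The following is a
minimal sketch, not Partridge's actual implementation: the tiny in-memory
tables and the helper name ``prune_to_trips`` are hypothetical, and only a
few edges of the graph are followed (``trips.txt`` to ``routes.txt`` and
``stop_times.txt``, then ``stop_times.txt`` to ``stops.txt``).

.. code:: python

    import pandas as pd

    # Hypothetical miniature feed; real feeds are read from GTFS .txt files.
    trips = pd.DataFrame({
        "trip_id": ["t1", "t2", "t3"],
        "route_id": ["r1", "r1", "r2"],
        "service_id": ["weekday", "weekend", "weekday"],
    })
    routes = pd.DataFrame({"route_id": ["r1", "r2"]})
    stop_times = pd.DataFrame({
        "trip_id": ["t1", "t1", "t2", "t3"],
        "stop_id": ["s1", "s2", "s3", "s4"],
    })
    stops = pd.DataFrame({"stop_id": ["s1", "s2", "s3", "s4"]})


    def prune_to_trips(trips, routes, stop_times, stops, view):
        """Filter trips by `view`, then drop rows no longer reachable from them."""
        # Apply every column filter to the root table.
        for column, wanted in view.items():
            trips = trips[trips[column].isin(wanted)]
        # Walk outward from the root, keeping only rows that the
        # surviving trips still reference.
        routes = routes[routes["route_id"].isin(trips["route_id"])]
        stop_times = stop_times[stop_times["trip_id"].isin(trips["trip_id"])]
        stops = stops[stops["stop_id"].isin(stop_times["stop_id"])]
        return trips, routes, stop_times, stops


    pruned_trips, pruned_routes, pruned_stop_times, pruned_stops = prune_to_trips(
        trips, routes, stop_times, stops,
        view={"service_id": {"weekday"}, "route_id": {"r1"}},
    )

Here only trip ``t1`` matches both filters, so the route, stop times, and
stops belonging to the other trips drop out of the result.
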
Usage
-----

.. code:: python

    import datetime
    import partridge as ptg

    path = 'path/to/sfmta-2017-08-22.zip'

    service_ids_by_date = ptg.read_service_ids_by_date(path)

    feed = ptg.feed(path, view={
        'trips.txt': {
            'service_id': service_ids_by_date[datetime.date(2017, 9, 25)],
            'route_id': '12300', # 18-46TH AVENUE
        },
    })

    assert set(feed.trips.service_id) == service_ids_by_date[datetime.date(2017, 9, 25)]
    assert list(feed.routes.route_id) == ['12300']

    # Buses running the 18 - 46th Ave line use 88 stops (on September 25, 2017, at least).
    assert len(feed.stops) == 88

Features
--------
- Surprisingly fast :)
- Load only what you need into memory
- Built-in support for resolving calendar days
- Built on pandas DataFrames
- Easily extended to support fields and files outside the official spec
(TODO: document this)
- Handle nested folders and bad data in zips
- Predictable type conversions, by default
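
Resolving calendar days means turning ``calendar.txt`` and
``calendar_dates.txt`` into the set of ``service_id`` values active on each
date. The following is a minimal sketch under stated assumptions, not
Partridge's own code: the helper ``service_ids_by_date`` and the sample rows
are hypothetical, and dates are assumed to be already parsed into
``datetime.date`` objects. It honors the 0/1 weekday flags and the GTFS
``exception_type`` values (1 adds service, 2 removes it).

.. code:: python

    import datetime
    from collections import defaultdict

    import pandas as pd

    WEEKDAYS = ["monday", "tuesday", "wednesday", "thursday",
                "friday", "saturday", "sunday"]

    # Hypothetical sample data standing in for calendar.txt.
    calendar = pd.DataFrame([{
        "service_id": "weekday",
        "monday": 1, "tuesday": 1, "wednesday": 1, "thursday": 1, "friday": 1,
        "saturday": 0, "sunday": 0,
        "start_date": datetime.date(2017, 9, 25),
        "end_date": datetime.date(2017, 10, 1),
    }])

    # Hypothetical sample data standing in for calendar_dates.txt.
    calendar_dates = pd.DataFrame([
        {"service_id": "weekday", "date": datetime.date(2017, 9, 27),
         "exception_type": 2},
        {"service_id": "holiday", "date": datetime.date(2017, 9, 27),
         "exception_type": 1},
    ])


    def service_ids_by_date(calendar, calendar_dates):
        """Map each date to the frozenset of service_ids active on it."""
        results = defaultdict(set)
        # Regular service: every day in the range whose weekday flag is 1.
        for row in calendar.to_dict("records"):
            day = row["start_date"]
            while day <= row["end_date"]:
                if row[WEEKDAYS[day.weekday()]] == 1:
                    results[day].add(row["service_id"])
                day += datetime.timedelta(days=1)
        # Exceptions: type 1 adds a service on a date, type 2 removes it.
        for row in calendar_dates.to_dict("records"):
            if row["exception_type"] == 1:
                results[row["date"]].add(row["service_id"])
            elif row["exception_type"] == 2:
                results[row["date"]].discard(row["service_id"])
        # Drop dates left with no active service.
        return {day: frozenset(ids) for day, ids in results.items() if ids}


    resolved = service_ids_by_date(calendar, calendar_dates)

In this sample, the weekend days have no entry because their flags are 0, and
on 2017-09-27 the ``weekday`` service is replaced by ``holiday``.
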
Installation
------------

.. code:: console

    pip install partridge

Thank You
---------
I hope you find this library useful. If you have suggestions for
improving Partridge, please open an `issue on
GitHub <https://github.com/remix/partridge/issues>`__.
=======
History
=======
0.3.0 (2017-10-12)
------------------
* Fix service date resolution for ``raw_feed``. Previously ``raw_feed`` considered all days of the week from ``calendar.txt`` to be active regardless of 0/1 value.

0.2.0 (2017-09-30)
------------------
* Add missing edge from ``fare_rules.txt`` to ``routes.txt`` in default dependency graph.

0.1.0 (2017-09-23)
------------------
* First release on PyPI.