A Python library for scraping bike sharing data

pybikes

pybikes provides a set of tools to scrape bike sharing data from different websites and APIs, and exposes that data through a coherent, generalized set of classes and methods.

The library is intended mainly for statistics and data sharing projects. More importantly, it powers the CityBikes project. It is composed of a set of classes and a pack of data files that provide instances for all the supported systems.

Installation

Install from PyPI:

pip install pybikes

Install directly from GitHub:

pip install git+https://github.com/eskerda/pybikes.git

Or after downloading/cloning the source:

pip install .

Usage

>>> import pybikes

# Capital BikeShare instantiation data is in bixi.json file
>>> capital_bikeshare = pybikes.get('capital-bikeshare')

# The instance contains all possible metadata regarding this system
>>> print(capital_bikeshare.meta)
{
    'name': 'Capital BikeShare',
    'city': 'Washington, DC - Arlington, VA',
    'longitude': -77.0363658,
    'system': 'Bixi',
    'company': ['PBSC'],
    'country': 'USA',
    'latitude': 38.8951118
}
# The update method retrieves the list of stations
>>> print(len(capital_bikeshare.stations))
0
>>> capital_bikeshare.update()
>>> print(len(capital_bikeshare.stations))
191
>>> print(capital_bikeshare.stations[0])
--- 31000 - 20th & Bell St ---
bikes: 7
free: 4
latlng: 38.8561,-77.0512
extra: {
    'installed': True,
    'uid': 1,
    'locked': False,
    'removalDate': '',
    'installDate': '1316059200000',
    'terminalName': '31000',
    'temporary': False,
    'name': '20th & Bell St',
    'latestUpdateTime': '1353454305589'
}
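
As a rough sketch of what can be done with the station list once update() has run, the helper below totals bikes and free slots. It assumes each station exposes bikes and free as attributes (as the printed output above suggests) and that a count may be missing (None). The summarize name and the stand-in stations are illustrative only and are not part of pybikes.

```python
from types import SimpleNamespace


def summarize(stations):
    """Return station count, total bikes, and total free slots.

    Stations whose counts are missing (None) contribute zero.
    """
    bikes = sum(s.bikes or 0 for s in stations)
    free = sum(s.free or 0 for s in stations)
    return {"stations": len(stations), "bikes": bikes, "free": free}


# Stand-in objects so the sketch runs without network access. With pybikes,
# pass capital_bikeshare.stations after calling update() instead.
demo = [
    SimpleNamespace(name="20th & Bell St", bikes=7, free=4),
    SimpleNamespace(name="hypothetical station", bikes=None, free=10),
]
print(summarize(demo))
```

With a real instance this would be called as summarize(capital_bikeshare.stations).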

Some systems might require an API key to work (for instance, Digitransit). In these cases, the instance factory can take an extra API key parameter.

>>> key = "This is not an API key"
>>> helsinki = pybikes.get('citybikes-helsinki', key=key)
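
To keep keys out of source code, one option is to read them from the environment. The sketch below is an assumption, not something pybikes does for you: it reuses the PYBIKES_DIGITRANSIT variable name that the test suite expects, and api_key_from_env and get_helsinki are hypothetical helper names.

```python
import os


def api_key_from_env(var_name="PYBIKES_DIGITRANSIT"):
    """Return the API key stored in var_name, or raise a clear error."""
    key = os.environ.get(var_name)
    if not key:
        raise RuntimeError(f"Set {var_name} before requesting this system")
    return key


def get_helsinki():
    """Build the Helsinki instance; needs pybikes installed and a real key."""
    import pybikes  # imported lazily so the helper above works on its own

    return pybikes.get("citybikes-helsinki", key=api_key_from_env())
```

Failing early with a named variable tends to be easier to debug than an authorization error from the remote API.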

Note that pybikes works as an instance factory. Alternatively, instances can be created by passing the right arguments to the desired class:

>>> from pybikes.bixi import BixiSystem
>>> capital_bikeshare = BixiSystem(
        tag = 'foo_tag',
        root_url = 'http://capitalbikeshare.com/data/stations/',
        meta = {'foo':'bar'}
    )

The way information is retrieved can be tweaked using the PyBikesScraper class included in the utils module, which allows session reuse and niceties such as using a proxy. This class uses the Requests module internally.

>>> scraper = pybikes.utils.PyBikesScraper()
>>> scraper.enableProxy()
>>> scraper.setProxies({
        "http" : "127.0.0.1:8118",
        "https": "127.0.0.1:8118"
    })
>>> scraper.setUserAgent("Walrus™ v3.0")
>>> scraper.headers['Foo'] = 'bar'
>>> capital_bikeshare.update(scraper)
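
Since update() accepts a scraper, one scraper can be shared across several systems. The helper below is a minimal sketch of that idea: update_all is a hypothetical name, and it assumes each instance exposes the tag it was created with as a tag attribute. Failures are collected instead of aborting the loop, because any single remote source may be down.

```python
def update_all(systems, scraper=None):
    """Update each system with a shared scraper.

    Returns a dict mapping the tag of each failed system to its exception.
    """
    failures = {}
    for system in systems:
        try:
            if scraper is None:
                system.update()
            else:
                system.update(scraper)
        except Exception as exc:  # scraping can fail in many different ways
            failures[system.tag] = exc
    return failures


# Usage with pybikes (requires network access), reusing the scraper above:
#   failures = update_all([capital_bikeshare, helsinki], scraper)
```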

Adding a new system

You can scaffold a new bike share system by running

$ python -m utils.scaffold example

================================================
Here is your 'Example' implementation

System: pybikes/example.py
Data: pybikes/data/example.json

Run tests by:
$ pytest -k Example

Visualize result:
$ make map! T_FLAGS+='-k Example'

Happy hacking :)
================================================

Check docs for more information

Tests

Tests are separated between unit tests and integration tests with the different sources supported.

To run unit tests simply

make test

To run integration tests

make test-update

Note that some systems require authorization keys. Tests expect these to be set as environment variables, for example:

PYBIKES_DIGITRANSIT='some-api-key'
PYBIKES_DEUTSCHEBAHN_CLIENT_ID='some-client-id'
PYBIKES_DEUTSCHEBAHN_CLIENT_SECRET='some-client-secret'

# or if using an .env file
# source .env

make test-update

This project uses pytest for tests. Test a particular network by passing a filter expression:

pytest -k bicing
pytest -k gbfs

To speed up test execution, install pytest-xdist and specify the number of CPUs to use:

pytest -k gbfs -n auto

To use Makefile steps and pass along pytest arguments, append to the T_FLAGS variable

make test-update T_FLAGS+='-n 10 -k gbfs'

Integration tests can generate a JSON report file with all extracted data stored as GeoJSON. From this report file, further reports can be generated, such as a summary of the overall health of the library or a map visualization of all the information.

For more information on reports see utils/README.md

Development

We welcome contributions from the community! The best place to get started is by diving into the codebase or checking the issues list.

Join our developer community on Matrix: #citybikes:matrix.org
