Skip to main content

bicidata is a framework to work with the General Bikeshare Feed Specification (GBFS)

Project description

bicidata

bicidata from bici the equivalent of bike in Spanish, Catalan, Portuguese or Galician.

bici is pronounced in two syllables bi-ci, which can be pronounced as something similar to be-cy or be-thy for a English speaker. The first one approximates better to Catalan, Portuguese and Latin Spanish, the second one to Spanish and Galician.

bicidata is a framework to work with the General Bikeshare Feed Specification (GBFS) data and aims to develop several services to collect, process and publish data from GBFS feeds different front-ends, such as, social media bots, etc.

It is based upon Python 3, with a very unstable package API until v1.0.0 is reached. You may use the code here, but I must warn you we are in the early stage of development.

Installation

You can use a deployed PyPI version of bicidata:

pip install bicidata

However, that version may be outdated, so I recommend to install directly from GitHub:

pip install git+https://github.com/bicidata/bicidata.git

Or, if you want to have access to the code to develop new features (PRs are welcome!):

git clone https://github.com/bicidata/bicidata.git & cd bicidata
pip install -e .

Services

bicidata it is thought as a framework to provide services to work with GBFS data. These services could run together as a Python app or be launched alone (i.e. dockerized). Thus, scalability will be one of the main goals of bicidata. Some services have already profiled:

  • Snapshots: as GBFS data is updated in "real-time" and it is not stored we need to do it for ourselves.
  • Archivers: assuming GBFS data is stored in raw JSON format, and a snapshot is taken every minute, the amount of data per day reaches around ~200MB for the city of Barcelona. So, some kind of data preprocessing is needed, it could be something as zipper, or something more powerful, as data bases.
  • Reporters: we want data with swag, so we will create reports with the available data.
  • Publishers: and finally we want the data to become available to the public.

Snapshots

This is the first service that is being implemented, it creates snapshots of a given GBFS API at the current timestamp.

Run

python -m bicidata.services.snapshot

and it will create a snapshot of a live GBFS API in you filesystem to acquire its data. If you want to loop it, you perfectly can do something like this from inside Python:

import time
from bicidata.services.snapshot import Snapshot, GBFSOnlineResource, FileStorageSaver

num_snapshots = 60
snapshot_sample_time = 60  # time in seconds

snapshot = Snapshot(
    GBFSOnlineResource("https://barcelona.publicbikesystem.net/ube/gbfs/v1/gbfs.json"),
    FileStorageSaver(),
)

for _ in range(60):
    snapshot.run()   
    time.sleep(snapshot_sample_time)

To consume the acquired data, you will find some examples at the scripts/ folder. To check for an advanced snap-shooter, consider run the server app.

Archivers

For the moment, there are any services to compress the data to more advanced structures than JSON, but I'm playing with pandas and xarray at scripts/create_dataset.py, so take a look if you want!

Reporters

Same here, there is a compiled dataset in the repo, so, if you want to play with it, feel free, at scripts/scripts.py you will find this to start playing with:

import pandas as pd
import xarray as xr

dataset = xr.open_dataset("data/gbfs_bcn_dump_20200925.dat")

capacity = int(dataset.capacity.sum())
print(f"'Bicing' total capacity: {capacity}")

max_bikes_available = int(dataset.num_bikes_available.sum("station_id").max())
print(f"'Bicing' max bikes available: {max_bikes_available}")

when_max_bikes_available = pd.to_datetime(
    dataset.times[dataset.num_bikes_available.sum("station_id").argmax()].values
)
print(f"When max bikes are available in UTC+0: {when_max_bikes_available}")

With should produce something like:

'Bicing' total capacity: 13328
'Bicing' max bikes available: 4116
When max bikes are available in UTC+0: 2020-09-25 01:35:09

Server

All services are planed to run in a server instance, this server first will be implemented in python, but the idea is to run it using docker-compose or kubernetes, at some point...

Now, it is implemented as a 24h snap-shooter,this will change soon with a more elegant way to run it. For the moment:

python -m bicidata.apps.server

You can configure the server changing the .env.template to .env and placing there your desired configuration.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Files for bicidata, version 0.0.4
Filename, size File type Python version Upload date Hashes
Filename, size bicidata-0.0.4.tar.gz (4.9 kB) File type Source Python version None Upload date Hashes View

Supported by

AWS AWS Cloud computing Datadog Datadog Monitoring DigiCert DigiCert EV certificate Facebook / Instagram Facebook / Instagram PSF Sponsor Fastly Fastly CDN Google Google Object Storage and Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Salesforce Salesforce PSF Sponsor Sentry Sentry Error logging StatusPage StatusPage Status page