


Note: This package is still in alpha status.

dp_mobility_report: A Python package to create a standardized mobility report with differential privacy guarantees, especially for urban human mobility data.

Install

pip install dp-mobility-report

or from GitHub:

pip install git+https://github.com/FreeMoveProject/dp_mobility_report

Data preparation:

  • df: a pandas DataFrame. Expected columns: user ID (uid), trip ID (tid), timestamp (datetime), and latitude and longitude in CRS EPSG:4326 (lat and lng).

  • tessellation: a geopandas GeoDataFrame with polygons. Expected column: tile_id. The tessellation is used for spatial aggregations of the data.
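
Before building a report, it can help to confirm that the inputs carry the expected column names. The following is a minimal sketch, not part of dp_mobility_report: the helper names missing_columns and in_wgs84_range are hypothetical, and the range check only catches coordinates that are obviously not in EPSG:4326 (for example, projected coordinates in metres).

```python
# Hypothetical pre-flight checks; these helpers are NOT part of dp_mobility_report.
REQUIRED_DF_COLUMNS = ("uid", "tid", "datetime", "lat", "lng")
REQUIRED_TESSELLATION_COLUMNS = ("tile_id",)


def missing_columns(columns, required):
    """Return the required column names that are absent, in the order of `required`."""
    present = set(columns)
    return [name for name in required if name not in present]


def in_wgs84_range(lat, lng):
    """Rough plausibility check for EPSG:4326 coordinates given in degrees."""
    return -90 <= lat <= 90 and -180 <= lng <= 180
```

With pandas and geopandas objects, the checks could be called as missing_columns(df.columns, REQUIRED_DF_COLUMNS) and missing_columns(tessellation.columns, REQUIRED_TESSELLATION_COLUMNS); an empty list means all expected columns are present.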

Create a mobility report as HTML:

import pandas as pd
import geopandas as gpd
from dp_mobility_report import md_report

# -- insert paths --
df = pd.read_csv("mobility_dataset.csv")
tessellation = gpd.read_file("tessellation.gpkg")

report = md_report.MobilityDataReport(
    df, tessellation, privacy_budget=1, max_trips_per_user=4
)

report.to_file("my_mobility_report.html")

The parameter privacy_budget (in terms of epsilon-differential privacy) determines how much noise is added to the data: the smaller the budget, the more noise. The budget is split between all analyses of the report. If the value is set to None, no noise (i.e., no privacy guarantee) is applied to the report.
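
To illustrate how a privacy budget translates into noise, here is a minimal sketch of the standard Laplace mechanism. It is an illustration under stated assumptions, not the package's implementation: the even split across analyses, the function names, and the use of Laplace noise for a count are all assumptions. The sensitivity is the largest change a single user can cause in the statistic; for a trip count, that would be the maximum number of trips one user may contribute.

```python
import random


def split_budget(privacy_budget, n_analyses):
    """Assumption: the budget is divided evenly between the analyses."""
    if privacy_budget is None:
        return [None] * n_analyses
    return [privacy_budget / n_analyses] * n_analyses


def laplace_noise(sensitivity, epsilon, rng=random):
    """Draw one sample from Laplace(0, sensitivity / epsilon)."""
    scale = sensitivity / epsilon
    # The difference of two i.i.d. exponentials with mean `scale`
    # follows a Laplace distribution with location 0 and that scale.
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def noisy_count(true_count, sensitivity, epsilon, rng=random):
    """Return the count with Laplace noise, or unchanged if epsilon is None."""
    if epsilon is None:
        return true_count  # no noise, hence no privacy guarantee
    return true_count + laplace_noise(sensitivity, epsilon, rng)
```

For example, split_budget(1, 4) gives each of four analyses an epsilon of 0.25, and noisy_count(100, sensitivity=4, epsilon=0.25) then adds noise with scale 4 / 0.25 = 16. Halving the budget doubles the noise scale.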

The parameter max_trips_per_user specifies the maximum number of trips a single user can contribute to the dataset. If a user has more trips than that, a random sample of max_trips_per_user trips is drawn. If the value is set to None, the full dataset is used. Note that deriving the maximum trips per user from the data violates the differential privacy guarantee. Thus, None should only be used in combination with privacy_budget=None.
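
The per-user trip cap can be pictured with a short sketch. This is an illustration in plain Python, assuming trips are given as (uid, tid) pairs; the function name sample_trips_per_user and the seed parameter are hypothetical, and the package's own sampling may differ in detail.

```python
import random
from collections import defaultdict


def sample_trips_per_user(trips, max_trips_per_user, seed=None):
    """Keep at most `max_trips_per_user` trips per user.

    `trips` is an iterable of (uid, tid) pairs. Returns the set of kept pairs.
    """
    trips = set(trips)
    if max_trips_per_user is None:
        return trips  # full dataset; only sensible without a privacy budget

    rng = random.Random(seed)
    trips_by_user = defaultdict(list)
    for uid, tid in sorted(trips):
        trips_by_user[uid].append(tid)

    kept = set()
    for uid, tids in trips_by_user.items():
        if len(tids) > max_trips_per_user:
            # Draw a uniform random sample without replacement.
            tids = rng.sample(tids, max_trips_per_user)
        kept.update((uid, tid) for tid in tids)
    return kept
```

Users with fewer trips than the cap keep all of them; only users above the cap are sampled down. Fixing the cap in advance, rather than reading it off the data, is what bounds each user's influence on the report.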

Example HTML reports can be found in the examples folder.

Credits

This package was heavily inspired by the pandas-profiling and scikit-mobility packages.

This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.

History

0.0.1 (2021-12-16)

0.0.2 (2022-07-22)
