
Python package for working with open road traffic casualty data from Great Britain

Project description

🚸 py-stats19

Authors:

Xiaowei Gao [📩 Email: ucesxwg@ucl.ac.uk] (SpacetimeLab, UCL, UK)

Jinshuai Ma [📩 Email: j.ma23@lse.ac.uk] (LSE Data Science Institute, UK)

Supervisors:

Dr. James Haworth, Associate Professor in Spatio-temporal Analytics, SpaceTimeLab, Department of Civil, Environmental and Geomatic Engineering, UCL

Prof. Tao Cheng, Professor in GeoInformatics, SpaceTimeLab, Department of Civil, Environmental and Geomatic Engineering, UCL

🚸 py-stats19 is a Python package developed to support digital twin applications for spatio-temporal urban crash analysis. Inspired by the R stats19 package, it provides a more efficient tool for downloading and formatting the Road Safety Data from the official Road Safety Database published by the Department for Transport, UK, which covers 1979 onwards. py-stats19 also enriches the data with extra temporal information and geometric details.

The full data set contains three tables: casualty, collision, and vehicle. It is updated annually and contains detailed information about road traffic accidents in Great Britain.

🧰 py-stats19 is currently under development and testing. It is available as a beta version for early access.

Installation

Download the latest release, e.g. pystats19-0.0.1-py3-none-any.whl.

$ pip install pystats19-0.0.1-py3-none-any.whl

list_files()

list_files() lists all available stats19 dataset files. The list can be filtered by passing the year and table arguments.

The table argument accepts casualty, collision, or vehicle. The three tables can be merged on the accident_index key.

from pystats19.read import list_files

# List all files that contain data for 2021
list_files(year=2021) 
# ['dft-road-casualty-statistics-casualty-1979-latest-published-year.csv',
#  'dft-road-casualty-statistics-casualty-2021.csv',
#  'dft-road-casualty-statistics-casualty-last-5-years.csv',
#  'dft-road-casualty-statistics-collision-1979-latest-published-year.csv',
#  'dft-road-casualty-statistics-collision-2021.csv',
#  'dft-road-casualty-statistics-collision-last-5-years.csv',
#  'dft-road-casualty-statistics-vehicle-1979-latest-published-year.csv',
#  'dft-road-casualty-statistics-vehicle-2021.csv',
#  'dft-road-casualty-statistics-vehicle-e-scooter-2020-Latest-Published-Year.csv',
#  'dft-road-casualty-statistics-vehicle-last-5-years.csv']

# List all vehicle-table files that contain data for 2021
list_files(year=2021, table="vehicle")
# ['dft-road-casualty-statistics-vehicle-1979-latest-published-year.csv',
#  'dft-road-casualty-statistics-vehicle-2021.csv',
#  'dft-road-casualty-statistics-vehicle-last-5-years.csv']
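Because the casualty, collision, and vehicle tables share the accident_index key, they can be joined with pandas once loaded. The sketch below is only an illustration: the two small DataFrames and their values are made up, and the column names other than accident_index are assumed to follow the usual STATS19 naming.

```python
import pandas as pd

# Tiny made-up stand-ins for the collision and vehicle tables.
collision = pd.DataFrame({
    "accident_index": ["A1", "A2", "A3"],
    "accident_severity": [3, 2, 3],
})
vehicle = pd.DataFrame({
    "accident_index": ["A1", "A1", "A2"],
    "vehicle_reference": [1, 2, 1],
    "vehicle_type": [9, 1, 9],
})

# One collision can involve several vehicles, so this is a one-to-many join.
# validate="one_to_many" raises if accident_index is duplicated on the left side.
merged = collision.merge(
    vehicle,
    on="accident_index",
    how="left",
    validate="one_to_many",
)
print(merged)
```

A left join keeps collisions that have no matching vehicle row (A3 here), with the vehicle columns left empty for them.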

pull_file()

pull_file() downloads a data file and requires the filename parameter. Obtain the filename from list_files().

Optionally, data_dir can specify the location where the file will be stored.

from pystats19.source import pull_file

pull_file('dft-road-casualty-statistics-vehicle-2019.csv', data_dir="./data")
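As the list_files() output shows, filtering by year also returns multi-year files such as the 1979-onwards and last-5-years archives. If only the single-year file is wanted before downloading, the names can be narrowed with plain string matching. This is a minimal sketch: single_year_files is a hypothetical helper, not part of pystats19, and it assumes single-year files end in "-<year>.csv" as in the listing.

```python
import re

def single_year_files(filenames, year):
    """Keep only names that end in '-<year>.csv' (hypothetical helper)."""
    pattern = re.compile(rf"-{year}\.csv$")
    return [name for name in filenames if pattern.search(name)]

# Names copied from the list_files(year=2021, table="vehicle") output.
listed = [
    "dft-road-casualty-statistics-vehicle-1979-latest-published-year.csv",
    "dft-road-casualty-statistics-vehicle-2021.csv",
    "dft-road-casualty-statistics-vehicle-last-5-years.csv",
]

selected = single_year_files(listed, 2021)
print(selected)
```

Each name in selected could then be passed to pull_file(), for example pull_file(selected[0], data_dir="./data").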

Data directory

The data directory can also be configured globally by setting the environment variable PYSTATS19_DOWNLOAD_DIRECTORY.

$ export PYSTATS19_DOWNLOAD_DIRECTORY=~/my_pystats19_data
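The same variable can be set from Python, which is convenient in notebooks. The sketch below shows one way a script might set and resolve it. It makes several assumptions: resolve_data_dir and the "./data" fallback are hypothetical and not part of pystats19, and the source does not say when the package reads the variable, so setting it before importing pystats19 is the cautious choice.

```python
import os
from pathlib import Path

ENV_VAR = "PYSTATS19_DOWNLOAD_DIRECTORY"

def resolve_data_dir(default="./data"):
    """Return the directory named by the environment variable, else `default`.

    Hypothetical helper for illustration; the fallback value is an assumption.
    """
    raw = os.environ.get(ENV_VAR, default)
    # Unlike the shell, Python does not expand "~" automatically.
    return Path(raw).expanduser()

# Set the variable for this process only (it does not persist after exit).
os.environ[ENV_VAR] = "~/my_pystats19_data"

data_dir = resolve_data_dir()
print(data_dir)
```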

load()

load() loads a data file as a pandas.DataFrame or geopandas.GeoDataFrame. Set auto_download=True to download the file automatically if it does not already exist.

Optionally,

set convert_code_to_label=True to convert categorical data codes to text labels.

set add_temporal_info=True to format datetime and time and add additional time information.

set add_geo_info=True to add geo information. This will return a geopandas.GeoDataFrame.

from pystats19.read import load

load(
    'dft-road-casualty-statistics-collision-2021.csv',
    auto_download=True,
    convert_code_to_label=True,
    add_temporal_info=True,
    add_geo_info=True
)

# Removed 17 records due to missing Latitude or Longitude.
# 
#        accident_index  ...                       geometry
# 0       2021010287148  ...   POINT (521509.659 193079.41)
# 1       2021010287149  ...  POINT (535380.824 180783.228)
# 2       2021010287151  ...  POINT (529702.828 170398.085)
# 3       2021010287155  ...  POINT (525313.658 178385.183)
# 4       2021010287157  ...  POINT (512145.497 171526.072)
# ...               ...  ...                            ...
# 101082  2021991196247  ...  POINT (325545.894 674547.399)
# 101083  2021991196607  ...  POINT (271195.339 558271.954)
# 101084  2021991197944  ...   POINT (357296.909 860766.24)
# 101085  2021991200639  ...  POINT (326935.908 675924.391)
# 101086  2021991201030  ...  POINT (270574.351 556367.939)
# [101070 rows x 39 columns]
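To give a feel for what options such as convert_code_to_label and add_temporal_info produce, the sketch below applies comparable transformations with plain pandas. It is not the package's implementation. The rows are made up, the dd/mm/yyyy date and HH:MM time formats are assumptions about the raw files, and the derived column names (datetime, hour, day_of_week, is_weekend) are chosen for illustration and may differ from what load() adds.

```python
import pandas as pd

# Made-up rows using STATS19-style columns; the values are illustrative only.
df = pd.DataFrame({
    "accident_index": ["A1", "A2", "A3"],
    "accident_severity": [3, 1, 2],
    "date": ["01/01/2021", "19/06/2021", "31/12/2021"],
    "time": ["08:30", "17:45", "23:10"],
})

# Code-to-label lookup for collision severity (1 = Fatal, 2 = Serious, 3 = Slight).
severity_labels = {1: "Fatal", 2: "Serious", 3: "Slight"}
df["accident_severity"] = df["accident_severity"].map(severity_labels)

# Combine the day-first date and the time into one timestamp,
# then derive a few extra temporal columns from it.
df["datetime"] = pd.to_datetime(
    df["date"] + " " + df["time"], format="%d/%m/%Y %H:%M"
)
df["hour"] = df["datetime"].dt.hour
df["day_of_week"] = df["datetime"].dt.day_name()
df["is_weekend"] = df["datetime"].dt.dayofweek >= 5  # Saturday = 5, Sunday = 6

print(df[["accident_index", "accident_severity", "datetime", "day_of_week", "is_weekend"]])
```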


Download files


Source Distribution

pystats19-0.1.0.tar.gz (34.5 kB)

Uploaded Source

Built Distribution

pystats19-0.1.0-py3-none-any.whl (33.1 kB)

Uploaded Python 3

File details

Details for the file pystats19-0.1.0.tar.gz.

File metadata

  • Download URL: pystats19-0.1.0.tar.gz
  • Upload date:
  • Size: 34.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.10.11

File hashes

Hashes for pystats19-0.1.0.tar.gz
Algorithm Hash digest
SHA256 789e06d768b0bed6e0bf3fbc26d4bee19f9396d8a2812e6f5708e8ce5a962211
MD5 e24fde120ad53757d03d0251907165e9
BLAKE2b-256 a463da22a9de6ecebeb1792ff49253c5e336945ad8c1ebd8e3dcc4e0ad84b614


File details

Details for the file pystats19-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: pystats19-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 33.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.10.11

File hashes

Hashes for pystats19-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 fd9b917d9a5bc53bece5fc213df43cfa838196f4b0e178598c2e7866aa8aa5be
MD5 f1e21f2a157a05a37a6e9a7f9e242581
BLAKE2b-256 c40f64dfabda1fc6ec8e514203a64af2fb4ccf900ea3c8baedbf0f9ae18042ea

