A Python package for accessing police data such as traffic stops, arrests, and use of force.

Project description

OpenPoliceData

OpenPoliceData is a Python package for police data analysis that provides easy access to incident-level data from police departments around the United States for traffic stops, pedestrian stops, use of force, and other types of police interactions.

Installation

The source code is available at https://github.com/openpolicedata/openpolicedata.

OpenPoliceData can be installed from the Python Package Index (PyPI):

pip install openpolicedata

Additionally, geopandas can be installed so that downloaded tables containing geographic data are returned as geopandas GeoDataFrames instead of pandas DataFrames. Using conda to install geopandas is recommended.
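Because geopandas is optional, code that relies on geographic columns may want to check for it before loading data. The following is a minimal sketch using only the standard library; the helper name geopandas_available is hypothetical and is not part of OpenPoliceData.

```python
import importlib.util


def geopandas_available() -> bool:
    """Return True if the optional geopandas package is installed."""
    # find_spec looks the package up without importing it, so the
    # check is cheap and has no side effects.
    return importlib.util.find_spec("geopandas") is not None


if geopandas_available():
    print("geopandas found: tables with geographic data can be GeoDataFrames")
else:
    print("geopandas not found: tables will be plain pandas DataFrames")
```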

Examples

Jupyter notebooks demonstrating example usage of OpenPoliceData can be found in the notebooks folder.

Contributing

If you're interested in helping out, see our Contributing Guide.

Documentation

datasets_query(source_name=None, state=None, agency=None, table_type=None)

Query the datasets that are available. Results can be filtered by source name, state, agency, and table type. By default, all datasets are returned.

> import openpolicedata as opd
> datasets = opd.datasets_query(state="California")
> datasets.head()
State       SourceName   Agency       TableType      Year
California  Anaheim      Anaheim      TRAFFIC STOPS  MULTI
California  Bakersfield  Bakersfield  TRAFFIC STOPS  MULTI
California  California   MULTI        STOPS          2018
California  California   MULTI        STOPS          2019
California  California   MULTI        STOPS          2020

(only 1st 5 columns shown above)

datasets is a pandas DataFrame. The first 5 datasets available from California include traffic stops data covering multiple years from Anaheim and Bakersfield, and data from every agency in California for all types of police stops for the years 2018, 2019, and 2020.
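To compare what is available across several states, the same query can be repeated with different filters. The sketch below is one way to do that; query_each_state is a hypothetical helper, not part of the package, and it only assumes that the function passed in accepts the state and table_type keyword arguments listed in the datasets_query signature.

```python
from typing import Any, Callable, Dict, Iterable, Optional


def query_each_state(
    query: Callable[..., Any],
    states: Iterable[str],
    table_type: Optional[str] = None,
) -> Dict[str, Any]:
    """Run one filtered query per state and key the results by state name."""
    # `query` is called once per state with the same table_type filter.
    return {state: query(state=state, table_type=table_type) for state in states}
```

With the package installed, query_each_state(opd.datasets_query, ["California", "Virginia"], table_type="STOPS") would be expected to return one DataFrame per state.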

Source(source_name, state=None)

Create a data source. A data source allows the user to easily import or export police data and provides access to all datasets available from a source. source_name should match a value of SourceName for an available dataset. The optional state parameter resolves ambiguities when the same source name is used in multiple states (for example, several states have a source named State Police).

> src = opd.Source(source_name="Virginia")
> src.datasets
State     SourceName  Agency  TableType  Year
Virginia  Virginia    MULTI   STOPS      MULTI

(only 1st 5 columns shown above)

There is 1 dataset available from the state of Virginia that contains data from every agency in Virginia for all types of police stops for multiple years.

get_tables_types()

Show all types of data available from a source.

> src.get_tables_types()
['STOPS']

get_years(table_type=None)

Show years available for one or more datasets. Results can be filtered to only show years for a specific type of data.

> src.get_years(table_type="STOPS")
[2020, 2021, 2022]
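The two calls above can be combined to summarize everything a source offers. This is a minimal sketch; years_by_table_type is a hypothetical helper that uses only get_tables_types and get_years as documented here, and src is assumed to be a Source object.

```python
from typing import Any, Dict, List


def years_by_table_type(src: Any) -> Dict[str, List[int]]:
    """Map each table type offered by a source to its available years."""
    # One get_years call per table type reported by the source.
    return {
        table_type: src.get_years(table_type=table_type)
        for table_type in src.get_tables_types()
    }
```

Given the outputs shown above, the Virginia source would be expected to yield a single entry mapping 'STOPS' to [2020, 2021, 2022].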

get_agencies(table_type=None, year=None, partial_name=None)

Show agencies (police departments) that have data available. This is typically a single agency unless the data is from a state. Results can be filtered to only show agencies for a specific type of data and/or year. partial_name can be used to find only agencies containing a substring. This is useful for finding the exact name of a police department.

> agencies = src.get_agencies(partial_name="Arlington")
> print(agencies)
['Arlington County Police Department', "Arlington County Sheriff's Office"]
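Since load_from_url takes an exact agency name, it can help to fail early when a partial name is ambiguous. The sketch below shows one way to do this; resolve_agency is a hypothetical helper, not part of the package, and it relies only on get_agencies as documented above.

```python
from typing import Any, Optional


def resolve_agency(src: Any, partial_name: str, table_type: Optional[str] = None) -> str:
    """Return the one agency name matching partial_name, or raise ValueError."""
    matches = list(src.get_agencies(table_type=table_type, partial_name=partial_name))
    # Zero matches and multiple matches are both treated as errors so the
    # caller never loads data for the wrong agency by accident.
    if len(matches) != 1:
        raise ValueError(f"{partial_name!r} matched {len(matches)} agencies: {matches}")
    return matches[0]
```

With the output shown above, "Arlington" matches two agencies and would raise, while a longer substring such as "Arlington County Police" should narrow the result to one.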

load_from_url(year, table_type=None, agency=None, pbar=True)

Import data from the source. Data for a single year (e.g. 2020) or a range of years (e.g. [2020, 2022]) can be requested. If more than one data type is available, table_type must be specified. Optionally, for datasets containing data from multiple agencies (police departments), agency can be used to request data for a single agency. pbar can be set to False to hide the progress bar while loading.

> agency = "Arlington County Police Department"
> tbl = src.load_from_url(year=2021, table_type="STOPS", agency=agency)
> tbl.table.head(n=3)
incident_date  agency_name                         agency        reason_for_stop      race                       ethnicity
2021-01-01     Arlington County Police Department  ARLINGTON CO  OTHER                WHITE                      HISPANIC
2021-01-01     Arlington County Police Department  ARLINGTON CO  EQUIPMENT VIOLATION  WHITE                      NON-HISPANIC
2021-01-01     Arlington County Police Department  ARLINGTON CO  TRAFFIC VIOLATION    BLACK OR AFRICAN AMERICAN  NON-HISPANIC

(only 1st 6 columns shown above)

The result of load_from_url is a Table object. The table contained in the Table object is either a geopandas or pandas DataFrame depending on whether the returned data contains geographic data or not.
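When each year should be kept as a separate table rather than requested as one range, the call can be repeated per year. The following is a minimal sketch under that assumption; load_years is a hypothetical helper, not part of the package, and it uses only the load_from_url parameters documented above.

```python
from typing import Any, Dict, Iterable, Optional


def load_years(
    src: Any,
    years: Iterable[int],
    table_type: Optional[str] = None,
    agency: Optional[str] = None,
) -> Dict[int, Any]:
    """Load each year separately and return the Table objects keyed by year."""
    tables = {}
    for year in years:
        # pbar=False hides the progress bar, which keeps batch output quiet.
        tables[year] = src.load_from_url(
            year=year, table_type=table_type, agency=agency, pbar=False
        )
    return tables
```

For example, load_years(src, [2020, 2021], table_type="STOPS", agency=agency) would issue two requests and return a dictionary with one Table per year.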

to_csv(output_dir=None, filename=None)

Export the table to CSV. The default output directory is the current directory. The default filename is generated automatically, which lets the user easily re-import the table into a new Table object.

> tbl.to_csv()

load_from_csv(year, output_dir=None, table_type=None, agency=None)

Import table from previously exported CSV. The directory to look in defaults to the current directory. The CSV file must have been automatically generated (see to_csv). year, table_type, and agency are defined the same as for load_from_url.

> new_src = opd.Source(source_name="Virginia")
> new_t = new_src.load_from_csv(year=2021, agency=agency)
> new_t.table.head(n=3)
incident_date  agency_name                         agency        reason_for_stop      race                       ethnicity
2021-01-01     Arlington County Police Department  ARLINGTON CO  OTHER                WHITE                      HISPANIC
2021-01-01     Arlington County Police Department  ARLINGTON CO  EQUIPMENT VIOLATION  WHITE                      NON-HISPANIC
2021-01-01     Arlington County Police Department  ARLINGTON CO  TRAFFIC VIOLATION    BLACK OR AFRICAN AMERICAN  NON-HISPANIC

(only 1st 6 columns shown above)
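The export and re-import steps can be wrapped together when tables should be stored in a directory other than the current one. This is a sketch under stated assumptions; round_trip is a hypothetical helper, src is assumed to be a Source and tbl a Table, and only the to_csv and load_from_csv parameters documented above are used.

```python
from pathlib import Path
from typing import Any, Optional


def round_trip(
    src: Any,
    tbl: Any,
    year: Any,
    output_dir: str,
    table_type: Optional[str] = None,
    agency: Optional[str] = None,
) -> Any:
    """Export tbl to output_dir, then re-import it through src."""
    # Create the directory ourselves; whether to_csv would create it is
    # not stated above, so this sketch does not rely on that.
    Path(output_dir).mkdir(parents=True, exist_ok=True)
    # Keep the automatically generated filename so load_from_csv can find it.
    tbl.to_csv(output_dir=output_dir)
    return src.load_from_csv(
        year=year, output_dir=output_dir, table_type=table_type, agency=agency
    )
```

The year, table_type, and agency passed here should match the values used in the original load_from_url call, since load_from_csv defines them the same way.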

See the OpenPoliceData wiki for further documentation.
