Extract data from APD news site

ScrAPD is a small utility designed to help organizations retrieve traffic fatality data in a friendly manner.

Installation

ScrAPD requires Python 3.7+ to work.

pip install scrapd

Quickstart

Collect all the data as CSV:

scrapd retrieve --format csv

By default, scrapd displays nothing until it has finished collecting the data. To get feedback on the progress, enable logging by adding -v BEFORE the command you want to run. Repeating -v increases the verbosity, up to a maximum of 3 (-vvv):

scrapd -v retrieve --format csv

To save the results to a file, use shell redirection:

scrapd -v retrieve --format csv > results.csv
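
When these invocations are scripted from Python rather than typed in a shell, the same rules apply: the -v option goes before the command, and the output is captured from standard output. The following is a minimal sketch, not part of scrapd itself; `build_command` and `save_results` are hypothetical helper names, and only the options shown in this document (-v, retrieve, --from, --to, --format) are used.

```python
import subprocess
from typing import List, Optional


def build_command(fmt: str = "csv",
                  date_from: Optional[str] = None,
                  date_to: Optional[str] = None,
                  verbosity: int = 0) -> List[str]:
    """Build a scrapd argument list (hypothetical helper, not a scrapd API)."""
    if not 0 <= verbosity <= 3:
        raise ValueError("verbosity must be between 0 and 3")
    cmd = ["scrapd"]
    if verbosity:
        # The -v option must come BEFORE the command; -vvv is the maximum.
        cmd.append("-" + "v" * verbosity)
    cmd.append("retrieve")
    if date_from:
        cmd += ["--from", date_from]
    if date_to:
        cmd += ["--to", date_to]
    cmd += ["--format", fmt]
    return cmd


def save_results(path: str, **options) -> None:
    """Run scrapd and write its standard output to `path`, like `> path`."""
    with open(path, "w", encoding="utf-8") as out:
        subprocess.run(build_command(**options), stdout=out, check=True)


print(build_command(verbosity=1))
# ['scrapd', '-v', 'retrieve', '--format', 'csv']
```

Calling `save_results("results.csv", verbosity=1)` would then mirror the shell redirection shown above, assuming scrapd is installed and on the PATH.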

Examples

Retrieve the traffic fatalities that happened between January 15th 2019 and January 18th 2019, and output the results as JSON:

scrapd retrieve --from "Jan 15 2019" --to "Jan 18 2019" --format json

[
  {
    "Age": 31,
    "Case": "19-0150158",
    "DOB": "07/09/1987",
    "Date": "January 15, 2019",
    "Ethnicity": "White",
    "Fatal crashes this year": "1",
    "First Name": "Hilburn",
    "Gender": "male",
    "Last Name": "Sell",
    "Link": "http://austintexas.gov/news/traffic-fatality-1-4",
    "Location": "10500 block of N IH 35 SB",
    "Time": "6:20 a.m."
  },
  {
    "Age": 58,
    "Case": "19-0161105",
    "DOB": "02/15/1960",
    "Date": "January 16, 2019",
    "Ethnicity": "White",
    "Fatal crashes this year": "2",
    "First Name": "Ann",
    "Gender": "female",
    "Last Name": "Bottenfield-Seago",
    "Link": "http://austintexas.gov/news/traffic-fatality-2-3",
    "Location": "West William Cannon Drive and Ridge Oak Road",
    "Time": "3:42 p.m."
  }
]
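
The JSON output can be loaded directly with Python's standard library. The sketch below is illustrative only: it embeds a trimmed copy of the two sample records, and `load_fatalities` is a hypothetical helper name. It assumes the "Date" field always follows the "January 15, 2019" pattern seen in the sample, and that "Fatal crashes this year" is a numeric string.

```python
import json
from datetime import date, datetime

# Trimmed copy of the sample output (only some fields kept).
SAMPLE_JSON = """
[
  {"Age": 31, "Case": "19-0150158", "Date": "January 15, 2019",
   "Fatal crashes this year": "1", "Gender": "male"},
  {"Age": 58, "Case": "19-0161105", "Date": "January 16, 2019",
   "Fatal crashes this year": "2", "Gender": "female"}
]
"""


def load_fatalities(text: str) -> list:
    """Parse scrapd JSON output and add typed copies of two fields."""
    records = json.loads(text)
    for record in records:
        # Assumption: dates look like "January 15, 2019", as in the sample.
        record["parsed_date"] = datetime.strptime(record["Date"], "%B %d, %Y").date()
        # The yearly counter is a string in the sample, so convert it.
        record["crash_number"] = int(record["Fatal crashes this year"])
    return records


records = load_fatalities(SAMPLE_JSON)
on_or_after = [r["Case"] for r in records if r["parsed_date"] >= date(2019, 1, 16)]
print(on_or_after)  # ['19-0161105']
```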

Run the same query, but output the results as CSV:

scrapd retrieve --from "Jan 15 2019" --to "Jan 18 2019" --format csv


Fatal crashes this year,Case,Date,Time,Location,First Name,Last Name,Ethnicity,Gender,DOB,Age,Link
1,19-0150158,"January 15, 2019",6:20 a.m.,10500 block of N IH 35 SB,Hilburn,Sell,White,male,07/09/1987,31,http://austintexas.gov/news/traffic-fatality-1-4
2,19-0161105,"January 16, 2019",3:42 p.m.,West William Cannon Drive and Ridge Oak Road,Ann,Bottenfield-Seago,White,female,02/15/1960,58,http://austintexas.gov/news/traffic-fatality-2-3
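
Because the "Date" column contains a comma and is therefore quoted, splitting lines on commas by hand would break the rows; a CSV parser handles the quoting. A minimal sketch using Python's csv module on the sample output above (`read_fatalities` is a hypothetical helper name):

```python
import csv
import io
from collections import Counter

# The sample CSV output, copied verbatim.
SAMPLE_CSV = """Fatal crashes this year,Case,Date,Time,Location,First Name,Last Name,Ethnicity,Gender,DOB,Age,Link
1,19-0150158,"January 15, 2019",6:20 a.m.,10500 block of N IH 35 SB,Hilburn,Sell,White,male,07/09/1987,31,http://austintexas.gov/news/traffic-fatality-1-4
2,19-0161105,"January 16, 2019",3:42 p.m.,West William Cannon Drive and Ridge Oak Road,Ann,Bottenfield-Seago,White,female,02/15/1960,58,http://austintexas.gov/news/traffic-fatality-2-3
"""


def read_fatalities(text: str) -> list:
    """Return one dict per CSV row, keyed by the header line."""
    return list(csv.DictReader(io.StringIO(text)))


rows = read_fatalities(SAMPLE_CSV)
# Every CSV value is a string, so numeric fields need explicit conversion.
ages = [int(row["Age"]) for row in rows]
by_gender = Counter(row["Gender"] for row in rows)

print(rows[0]["Date"])  # January 15, 2019
print(ages)             # [31, 58]
```

To read a saved file instead of the embedded sample, pass the file's contents to `read_fatalities`, for example the `results.csv` produced in the Quickstart.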

Retrieve all the traffic fatalities from 2019 (as of Jan 20th 2019) as JSON, with logging enabled to follow the progress:

scrapd -v retrieve --from 2019 --format json

Fetching page 1...
Fetching page 2...
Total: 2
[
  {
    "Age": 31,
    "Case": "19-0150158",
    "DOB": "07/09/1987",
    "Date": "January 15, 2019",
    "Ethnicity": "White",
    "Fatal crashes this year": "1",
    "First Name": "Hilburn",
    "Gender": "male",
    "Last Name": "Sell",
    "Link": "http://austintexas.gov/news/traffic-fatality-1-4",
    "Location": "10500 block of N IH 35 SB",
    "Time": "6:20 a.m."
  },
  {
    "Age": 58,
    "Case": "19-0161105",
    "DOB": "02/15/1960",
    "Date": "January 16, 2019",
    "Ethnicity": "White",
    "Fatal crashes this year": "2",
    "First Name": "Ann",
    "Gender": "female",
    "Last Name": "Bottenfield-Seago",
    "Link": "http://austintexas.gov/news/traffic-fatality-2-3",
    "Location": "West William Cannon Drive and Ridge Oak Road",
    "Time": "3:42 p.m."
  }
]
