Python package to scrape NRC Event Reports.

Project description

A set of modules for scraping Event Reports from the NRC.gov website.

Tests

pytest 

Usage

Download all event reports into yearly CSV files under .\data\:

python bulk_downloader --start_year 2004 --end_year 2020 --threads 16

Or go one by one:

    import logging
    from random import sample

    import pandas as pd

    # import a Session that has the correct properties to keep a connection alive
    from nrc_scrape import session
    # EventNotificationReport, generate_nrc_event_report_urls, extract_nested_values
    # and fetch_enrs also come from the nrc_scrape package; import them from the
    # modules that define them (the module paths are not shown here)

    # build a single event notification report from its URL
    url3 = 'https://www.nrc.gov/reading-rm/doc-collections/event-status/event/2019/20190612en.html'

    h = EventNotificationReport.from_url(url3, session)

    # get event notification report URLs from 2019 to 2020
    er_urls = generate_nrc_event_report_urls(2019, 2020)

    # take a random subsample of 10 URLs
    urls = sample(list(extract_nested_values(er_urls)), 10)

    # named loggers used while fetching
    sl = logging.getLogger('success_log')
    el = logging.getLogger('error_log')
    fl = logging.getLogger('fof_log')

    # fetch the subsample of URLs, logging each request attempt
    enrs, errors, fofs = fetch_enrs(urls, session)

    # convert a single event to a dataframe
    df = enrs[0].events[0].to_dataframe()

    # convert one event notification report to a dataframe
    enr_df = enrs[1].to_dataframe()

    # convert all event notification reports to one dataframe
    enrs_df = pd.concat([enr.to_dataframe() for enr in enrs])

    # dump to csv
    enrs_df.to_csv('enrs.csv')
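
The example URL above suggests that each daily report lives at a path built from the year and an eight-digit date followed by `en.html`. The following is a minimal sketch of that pattern using only the standard library. The helper names `daily_report_url` and `daily_report_urls` are hypothetical and are not part of nrc_scrape; the package's own `generate_nrc_event_report_urls` may build its URLs differently, and a page may not exist for every calendar day.

```python
from datetime import date, timedelta

# Base path copied from the example URL above; the file-name pattern is inferred from it.
BASE = "https://www.nrc.gov/reading-rm/doc-collections/event-status/event"


def daily_report_url(day):
    """Build the report URL for one day (hypothetical helper, not the package API)."""
    # date supports strftime-style format specs, so %Y%m%d yields e.g. 20190612
    return f"{BASE}/{day.year}/{day:%Y%m%d}en.html"


def daily_report_urls(start, end):
    """Return URLs for every calendar day from start to end, inclusive."""
    n_days = (end - start).days + 1
    return [daily_report_url(start + timedelta(days=i)) for i in range(n_days)]


print(daily_report_url(date(2019, 6, 12)))
```

Calling `daily_report_url(date(2019, 6, 12))` reproduces the `url3` value used in the example above.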

Download files

Source Distribution

nrc-scrape-0.0.5.tar.gz (5.7 kB)

Uploaded Source

Built Distribution

nrc_scrape-0.0.5-py3-none-any.whl (6.9 kB)

Uploaded Python 3

File details

Details for the file nrc-scrape-0.0.5.tar.gz.

File metadata

  • Download URL: nrc-scrape-0.0.5.tar.gz
  • Upload date:
  • Size: 5.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.23.0 setuptools/39.0.1 requests-toolbelt/0.9.1 tqdm/4.43.0 CPython/3.8.0

File hashes

Hashes for nrc-scrape-0.0.5.tar.gz
Algorithm    Hash digest
SHA256       5ca352d2c420d5f7bf2b6c534eaba21c06d874015bd6152d07b45cbb750b9763
MD5          a2d18ec4e08dcec41b1ad124b0e11b0b
BLAKE2b-256  4587a9e1240e37c1228a113f6f03e2ebaa02763e2e21415904f869ccc1dd307a

File details

Details for the file nrc_scrape-0.0.5-py3-none-any.whl.

File metadata

  • Download URL: nrc_scrape-0.0.5-py3-none-any.whl
  • Upload date:
  • Size: 6.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.23.0 setuptools/39.0.1 requests-toolbelt/0.9.1 tqdm/4.43.0 CPython/3.8.0

File hashes

Hashes for nrc_scrape-0.0.5-py3-none-any.whl
Algorithm    Hash digest
SHA256       dd04984171ba6401d3688392a4ddb7abcf0fe4e385a9112388437d5b66532240
MD5          c0781c9ec6f05f6cb4e12d77bd6358c0
BLAKE2b-256  d0a95f51fdf49e7599f750949b2f5d196717770d50174078c855e627f4bf96cf
