Script to parse MOM's website for report and stats updates

mom_scrape

example usage:

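from selenium import webdriver

from mom_scrape.scrapers import ReportScraper, StatsScraper

# any Selenium WebDriver should work; Chrome is used here only as an example
driver = webdriver.Chrome()

# ReportScraper retrieves the MOM reports
# it takes the Selenium WebDriver as input
reportscraper = ReportScraper(driver=driver,
                              filter_by='Reports',
                              save_dir='stats/reports')

results = reportscraper.get_info()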

# StatsScraper retrieves the stats
# `url` and `INTEREST` are not defined in this snippet: `url` is the stats page
# link and `INTEREST[url]` the dataset(s) of interest for that page
# set save_as_zip=True to save the downloads as a zip, e.g. to email out later
statsscraper = StatsScraper(driver=driver,
                            dataset_of_interest=INTEREST[url],
                            website_link=url,
                            save_as_zip=True,
                            save_dir='stats')

results = statsscraper.get_info()
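
The save_as_zip option is described as saving the downloads as a zip for emailing later. As a rough illustration of that idea, and not the package's own implementation, the sketch below bundles a download folder using only the standard library. The helper name `zip_directory` and the archive name are assumptions made for this example.

```python
import shutil
from pathlib import Path


def zip_directory(save_dir, archive_stem):
    """Pack everything under save_dir into <archive_stem>.zip and return its path.

    Hypothetical helper for illustration; mom_scrape may do this differently.
    """
    source = Path(save_dir)
    if not source.is_dir():
        raise FileNotFoundError(f"{source} is not a directory")
    # make_archive appends ".zip" to the stem and returns the archive's file name
    return shutil.make_archive(str(archive_stem), "zip", root_dir=str(source))
```

Calling `zip_directory('stats', 'stats_bundle')` would produce `stats_bundle.zip` next to the `stats` folder. Writing the archive outside the folder being zipped avoids the archive picking up a copy of itself.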

ReportScraper

ReportScraper first creates a dates folder and dumps the dates of the various reports into the mom_reports.json file. It then downloads the new PDFs (those not already recorded in mom_reports.json) under save_dir.
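
The bookkeeping described above can be sketched as follows. This is a minimal illustration, not the package's actual code: the helper names `load_seen`, `find_new` and `record_seen`, and the layout of the JSON file (a flat mapping from report title to date string), are assumptions made for the example.

```python
import json
from pathlib import Path


def load_seen(record_path):
    """Return the previously recorded reports, or an empty dict on first run."""
    path = Path(record_path)
    if not path.exists():
        return {}
    return json.loads(path.read_text(encoding="utf-8"))


def find_new(scraped, seen):
    """Keep only scraped entries whose title is not yet in the record."""
    return {title: date for title, date in scraped.items() if title not in seen}


def record_seen(record_path, seen, new_items):
    """Merge the new entries into the record and write it back to disk."""
    path = Path(record_path)
    path.parent.mkdir(parents=True, exist_ok=True)  # creates the dates folder
    merged = {**seen, **new_items}
    path.write_text(json.dumps(merged, indent=2), encoding="utf-8")
    return merged
```

With this pattern, only the entries returned by `find_new` need to be downloaded, and a second run over the same page finds nothing new.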

StatsScraper

Likewise, StatsScraper dumps the dates it has already seen into the dates folder. Any new stats are then downloaded and saved under the self.save_dir folder.

Download files

Source Distribution

mom_scrape-0.0.7.tar.gz (4.5 kB view details)

Uploaded Source

Built Distribution

mom_scrape-0.0.7-py3-none-any.whl (4.8 kB view details)

Uploaded Python 3

File details

Details for the file mom_scrape-0.0.7.tar.gz.

File metadata

  • Download URL: mom_scrape-0.0.7.tar.gz
  • Upload date:
  • Size: 4.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.2

File hashes

Hashes for mom_scrape-0.0.7.tar.gz
Algorithm Hash digest
SHA256 6cebe634767b024aacf910b626088d086d24a21a466a7ba6fbfe4ca77d40eb5f
MD5 df87e2d477cd0eb0fd88bf27429430e5
BLAKE2b-256 0996001c5aa466e52a3a27a87cfa4a5889bc307ea8ba914fb0a5ae0a77d17ac7

File details

Details for the file mom_scrape-0.0.7-py3-none-any.whl.

File metadata

  • Download URL: mom_scrape-0.0.7-py3-none-any.whl
  • Upload date:
  • Size: 4.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.2

File hashes

Hashes for mom_scrape-0.0.7-py3-none-any.whl
Algorithm Hash digest
SHA256 34bedf177c58fba0d0aeb4aa32884010b85b15e6fc7ff7dc265998fa877f2ff4
MD5 74580223a6d7c6b546c78b6b178d8291
BLAKE2b-256 621c4a287259a42640d516f811fdc0bd98e9678e459cb25651d06e11b26a943b
