
Script to parse MOM's website for report and stats updates

Project description

mom_scrape

Example usage:

from mom_scrape.scrapers import ReportScraper, StatsScraper

# Retrieve the MOM reports.
# `driver` is a Selenium WebDriver instance created by the caller.
reportscraper = ReportScraper(driver=driver,
                              filter_by='Reports',
                              save_dir='stats/reports')

results = reportscraper.get_info()

# Retrieve the MOM stats.
# `url` and `INTEREST` (indexed by `url`) are defined by the caller.
statsscraper = StatsScraper(driver=driver,
                            dataset_of_interest=INTEREST[url],
                            website_link=url,
                            save_as_zip=True,
                            save_dir='stats')
# Set save_as_zip=True to save the downloads as a zip, e.g. to send them out by email later.

results = statsscraper.get_info()

ReportScraper

This first creates a dates folder, then dumps the dates of the various reports into the mom_reports.json file. Then the new PDFs (those not already recorded in mom_reports.json) are downloaded under save_dir.
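The bookkeeping described above can be sketched roughly as follows. This is a minimal illustration, not the package's actual code: the function name `filter_new_reports`, the shape of `found` (report date mapped to PDF URL), and the placement of mom_reports.json inside the dates folder are all assumptions made for the example.

```python
import json
from pathlib import Path


def filter_new_reports(found, dates_dir="dates", filename="mom_reports.json"):
    """Return only the reports whose date has not been recorded yet.

    `found` maps a report date string to its PDF URL (hypothetical shape).
    """
    dates_path = Path(dates_dir)
    dates_path.mkdir(parents=True, exist_ok=True)  # create the dates folder
    record_file = dates_path / filename  # assumed location of the record

    # Load previously seen dates, or start empty on the first run.
    seen = set()
    if record_file.exists():
        seen = set(json.loads(record_file.read_text(encoding="utf-8")))

    # Keep only the reports that were not seen before.
    new_reports = {date: url for date, url in found.items() if date not in seen}

    # Persist the union so the next run skips these reports.
    record_file.write_text(
        json.dumps(sorted(seen | set(found)), indent=2), encoding="utf-8"
    )
    return new_reports
```

Calling this twice with the same input returns an empty dict the second time, which is the behaviour that keeps already-downloaded PDFs from being fetched again.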

StatsScraper

Likewise, this also dumps the seen dates in the dates folder. Then any new stats are downloaded and saved under the self.save_dir folder.
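The save_as_zip option in the usage example suggests that newly downloaded stats files can be bundled into one archive. A rough sketch of that idea using the standard library is below. The helper name `zip_new_files` and the archive name `new_stats.zip` are hypothetical, and how the package itself builds the zip is not documented here.

```python
import zipfile
from pathlib import Path


def zip_new_files(file_paths, save_dir="stats", archive_name="new_stats.zip"):
    """Bundle the given files into a single zip under save_dir.

    Hypothetical helper: the names and layout are assumptions for illustration.
    """
    save_path = Path(save_dir)
    save_path.mkdir(parents=True, exist_ok=True)
    archive = save_path / archive_name

    with zipfile.ZipFile(archive, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for path in map(Path, file_paths):
            # Store each file under its bare name so the archive stays flat.
            zf.write(path, arcname=path.name)
    return archive
```

The returned path can then be attached to an email or moved elsewhere in a later step.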

Download files


Source Distribution

mom_scrape-0.0.8.tar.gz (4.6 kB)

Uploaded Source

Built Distribution


mom_scrape-0.0.8-py3-none-any.whl (4.8 kB)

Uploaded Python 3

File details

Details for the file mom_scrape-0.0.8.tar.gz.

File metadata

  • Download URL: mom_scrape-0.0.8.tar.gz
  • Upload date:
  • Size: 4.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.2

File hashes

Hashes for mom_scrape-0.0.8.tar.gz
Algorithm Hash digest
SHA256 2fa375573e0f10a174df5cf76d25575303f031d46d6e8b39d649b4e8b4d28a45
MD5 565fb737fca12a580d7914b97e992bdc
BLAKE2b-256 8fffdccede5ce078f05da30f2f06ad5c9087111ad69913d51a1b46bdb2032993


File details

Details for the file mom_scrape-0.0.8-py3-none-any.whl.

File metadata

  • Download URL: mom_scrape-0.0.8-py3-none-any.whl
  • Upload date:
  • Size: 4.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.2

File hashes

Hashes for mom_scrape-0.0.8-py3-none-any.whl
Algorithm Hash digest
SHA256 ff1f00308c89f8b6dd774a79f52f21ea6a516ea6aec259a1225040c71bce3a92
MD5 2899ee68108f4e4ff21dd9af9f03b7aa
BLAKE2b-256 3595923e90e62c66d38ab47b302c6c6b7b0945b29bca99a19b8060c2718b7dad

