
Package for collecting ESG ratings from Yahoo Finance, MSCI, CSR Hub, S&P Global, and SustainAnalytics. In addition, financial information is scraped from Yahoo Finance.

Project description

Companies ESG Scraper


This repository builds a dataset of companies' financial metrics and their corresponding ESG and CSR metrics. The ESG information is collected from five publicly available data analytics firms: S&P Global, MSCI, Yahoo Finance, CSR Hub, and SustainAnalytics. The financial information is collected from Yahoo Finance.

Project Motivation

This package is for investors interested in sustainable finance. It gathers publicly available ESG and financial metrics, helping investors choose a company based on its sustainability as well as its financial performance. The data can further be used to investigate whether a company's financial status is related to its sustainability performance.


Package structure:

WebScraping

- ESGmetrics

    -- esgscraper

        --- __init__.py
        --- __main__.py
        --- scraper.py 
        --- csrhub.py
        --- snp_global.py
        --- msci.py
        --- sustainanalytics.py
        --- yahoo.py

    -- rds_uploader
        --- __init__.py
        --- rds_module.py
        
- tests
- Forbes.csv
- .gitignore
- LICENSE
- README.md
- setup.cfg
- setup.py

How to use

  1. Run the command "python -m esgmetrics.esgscraper".
  2. Enter the requested inputs: website name (type the corresponding number), input file path, header name of the column in the input file that contains the company names, output file path (including the output file name), and Chromedriver path.
  3. Wait for the output! You will notice a .csv file created that is appended to as new information is scraped.
  4. To access the different functions used in the package, check out the help documentation of the scraper module.

Package details
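The input file expected in step 2 is a .csv with a column of company names. A minimal sketch of how that column could be read, using only the standard library (the file name "Forbes.csv" and header "Company" are illustrative assumptions, not names fixed by the package):

```python
import csv

def read_company_names(path, header):
    """Read the company-name column from the input .csv file.

    `path` and `header` correspond to the "input file path" and
    "header name" prompts described in step 2 above.
    """
    with open(path, newline="", encoding="utf-8") as f:
        return [row[header] for row in csv.DictReader(f)]

# Hypothetical usage, assuming a "Company" header:
# companies = read_company_names("Forbes.csv", "Company")
```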


Selenium is used to scrape all the websites. Below is the information for each of the .py files:

1. __main__.py: To begin, run this file. It will ask for these inputs:
    a) Path of the .csv file that contains the company names
    b) Header name of the company-name column
    c) Which website to scrape the data from: SustainAnalytics, S&P Global, CSR Hub, MSCI, Yahoo
    d) Chromedriver path

2. scraper.py: This Python file serves as the foundation for all the other .py files. It has a class that
   includes the methods used in common by all the site-specific scrapers.
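The actual class and method names in scraper.py may differ, but the shared-base pattern it describes can be sketched as follows. The helper shown here reflects the behavior noted in "How to use": each scraped record is appended to the output .csv as it arrives, so partial results survive an interruption. All names (BaseScraper, append_row, SnpGlobalScraper) are hypothetical:

```python
import csv

class BaseScraper:
    """Hedged sketch of the common base class in scraper.py.

    Site-specific scrapers (snp_global, msci, ...) inherit the shared
    helpers, e.g. appending each scraped row to the output .csv.
    """

    def __init__(self, output_path, fieldnames):
        self.output_path = output_path
        self.fieldnames = fieldnames
        # Write the header row once, up front
        with open(output_path, "w", newline="", encoding="utf-8") as f:
            csv.DictWriter(f, fieldnames=fieldnames).writeheader()

    def append_row(self, row):
        # Append one scraped record; the file grows as scraping proceeds
        with open(self.output_path, "a", newline="", encoding="utf-8") as f:
            csv.DictWriter(f, fieldnames=self.fieldnames).writerow(row)

class SnpGlobalScraper(BaseScraper):
    def scrape(self, company):
        # The real scraper drives Selenium here; a stub result for illustration
        return {"Company": company, "ESG Score": None}
```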

3. snp_global.py: This file collects the ESG score and supporting information such as Industry, Ticker, Country,
    and Company Name from the S&P Global website: https://www.spglobal.com/esg/scores/

4. msci.py : This file collects ESG rating and Company name from the MSCI website. 
   https://www.msci.com/our-solutions/esg-investing/esg-ratings/

5. csrhub.py : This file collects CSR rating and Company name from the CSR Hub website.
    https://www.csrhub.com/search/name/

6. sustainanalytics.py : This file collects ESG risk rating, Company and Industry name
    from the SustainAnalytics website. https://www.sustainalytics.com/esg-ratings

7. yahoo.py: This file collects the following information from the Yahoo Finance website
    (https://finance.yahoo.com/lookup):
    ESG rating, company name, and financial metrics such as Market Cap, Trailing P/E, Return on Assets,
    Total Debt/Equity (mrq), Operating Cash Flow, Stock Price, Price/Book (mrq), Most Recent Quarter (mrq),
    Profit Margin, Operating Margin, Return on Equity, Diluted EPS, and Payout Ratio.

8. rds_module.py: Use the RdsUploader class to upload the data to a SQL database. Inputs needed: DATABASE_TYPE,
    DBAPI, ENDPOINT, PORT, DATABASE, USERNAME, PASSWORD.
    Functions: create_table, send_query, read_table, add_rows, delete_row
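RdsUploader targets a remote RDS database via the credentials listed above, so its exact signatures are not shown here. As a stand-in, the same set of functions can be sketched against an in-memory SQLite database using only the standard library; the class and method bodies below are illustrative, not the package's implementation:

```python
import sqlite3

class SqlUploader:
    """Illustrative stand-in for RdsUploader, using in-memory SQLite
    instead of an RDS endpoint. Method names mirror the functions
    listed above: create_table, send_query, read_table, add_rows,
    delete_row."""

    def __init__(self):
        self.conn = sqlite3.connect(":memory:")

    def create_table(self, name, columns):
        # Create a table with TEXT columns (types simplified for the sketch)
        cols = ", ".join(f"{c} TEXT" for c in columns)
        self.conn.execute(f"CREATE TABLE {name} ({cols})")

    def add_rows(self, name, rows):
        placeholders = ", ".join("?" for _ in rows[0])
        self.conn.executemany(f"INSERT INTO {name} VALUES ({placeholders})", rows)

    def read_table(self, name):
        return self.conn.execute(f"SELECT * FROM {name}").fetchall()

    def send_query(self, sql):
        return self.conn.execute(sql).fetchall()

    def delete_row(self, name, column, value):
        self.conn.execute(f"DELETE FROM {name} WHERE {column} = ?", (value,))
```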

Example dataset

An example dataset of the companies in the 2021 Forbes Global 2000 list is provided with this package.


Things to note:

1. Enter a file path that includes the file name with the .csv extension.
2. Company names are extracted from each website so that the user can tally them against the company
    names in the input dataset.
3. Check the documentation of each script for more information.
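The tallying described in note 2 can be automated with fuzzy string matching, since the name a website returns rarely matches the input list exactly. A hedged sketch using the standard library's difflib (the function name and cutoff are illustrative choices, not part of the package):

```python
from difflib import get_close_matches

def tally_names(input_names, scraped_names, cutoff=0.6):
    """Match each scraped company name back to the closest input name,
    so discrepancies between the input list and what each website
    returned are easy to spot (None = no close match found)."""
    return {
        scraped: (get_close_matches(scraped, input_names, n=1, cutoff=cutoff) or [None])[0]
        for scraped in scraped_names
    }
```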

License: MIT © Shweta Yadav

Support: For any questions and suggestions, connect with me on LinkedIn: http://www.linkedin.com/in/shweta-yadav1

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

ESGScraper-1.0.0.tar.gz (11.6 kB)

Uploaded Source

Built Distribution

ESGScraper-1.0.0-py3-none-any.whl (65.0 kB)

Uploaded Python 3

File details

Details for the file ESGScraper-1.0.0.tar.gz.

File metadata

  • Download URL: ESGScraper-1.0.0.tar.gz
  • Upload date:
  • Size: 11.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.2 importlib_metadata/4.8.1 pkginfo/1.7.1 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.62.2 CPython/3.8.11

File hashes

Hashes for ESGScraper-1.0.0.tar.gz
Algorithm Hash digest
SHA256 d4cdaf8bc8dfa79b8d1cf7740f3d5c8ad30ae513a1507434b7d4de43c065d46b
MD5 76327bb404ddd93587cf4db302da0ed9
BLAKE2b-256 3f982101b450d6dd3e6c457f9194d3a9cd169b6cce6315ed874b2f496ac376de

See more details on using hashes here.

File details

Details for the file ESGScraper-1.0.0-py3-none-any.whl.

File metadata

  • Download URL: ESGScraper-1.0.0-py3-none-any.whl
  • Upload date:
  • Size: 65.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.2 importlib_metadata/4.8.1 pkginfo/1.7.1 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.62.2 CPython/3.8.11

File hashes

Hashes for ESGScraper-1.0.0-py3-none-any.whl
Algorithm Hash digest
SHA256 fd6d21426c11e70b6c1486db7ddb06c05a21767f612028a5a3770acfaf167590
MD5 998bd4f4be1a8c202fa95fe5ea4bc5e2
BLAKE2b-256 5121cdbfc6d949fc2bbf7228fe4970835b94db62ea98b6d272507f0116fa95fe

See more details on using hashes here.
