
A package for finding ESG ratings from Yahoo Finance, MSCI, CSR Hub, S&P Global, and SustainAnalytics. In addition, financial information is scraped from Yahoo Finance.

Project description

Companies ESG Scraper


This repository builds a dataset of companies' financial metrics and their corresponding ESG and CSR metrics. The ESG information is collected from five publicly available data analytics firms: S&P Global, MSCI, Yahoo Finance, CSR Hub, and SustainAnalytics. The financial information is collected from Yahoo Finance.

Project Motivation

This package is for investors interested in sustainable finance. It gathers publicly available ESG and financial metrics, which can help investors choose a company based on its sustainability as well as its financial performance. The data can also be used to explore whether there is a relationship between a company's financial status and its sustainability performance.


Package structure:

WebScraping

- ESGmetrics

    -- esgscraper

        --- __init__.py
        --- __main__.py
        --- scraper.py 
        --- csrhub.py
        --- snp_global.py
        --- msci.py
        --- sustainanalytics.py
        --- yahoo.py

    -- rds_uploader
        --- __init__.py
        --- rds_module.py
        
- tests
- Forbes.csv
- .gitignore
- LICENSE
- README.md
- setup.cfg
- setup.py

How to use

  1. Run the command "python -m esgmetrics.esgscraper".
  2. It asks for the following inputs: website name (type the corresponding number), input file path, header name of the column in the input file that contains the company names, output file path including the output file name, and Chromedriver path. Enter the inputs (a minimal sketch of the expected input file follows this list).
  3. Wait for the output! A .csv file is created and appended to as new information is scraped.
  4. To access the different functions used in the package, check out the help documentation of the scraper module.
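
A minimal sketch, assuming pandas, of preparing the input file from step 2. The column header "Company" and the file name companies.csv are examples only; when the scraper prompts you, pass whatever header and path your own file uses:

    import pandas as pd

    # Hypothetical input file: a single column of company names.
    # "Company" is an example header, not a name the package requires.
    companies = pd.DataFrame({"Company": ["Apple", "Microsoft", "JPMorgan Chase"]})
    companies.to_csv("companies.csv", index=False)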

Package details

Selenium is used to scrape all the websites; a generic sketch of a Selenium setup is shown after the list. Below is the information for each of the .py files:

1. __main__.py: To begin, run this file. It will ask for these inputs:
    a) Path of the .csv file that contains the company names
    b) Header name of the company-name column
    c) Which website to scrape the data from: SustainAnalytics, S&P Global, CSR Hub, MSCI, Yahoo
    d) Chromedriver path

2. scraper.py : This file is the base for all the other .py files. It contains a class with the methods that
   are used in common across them.

3. snp_global.py : This file collects ESG score and supporting information such as Industry, Ticker, Country,
    Company Name from the S&P Global website. https://www.spglobal.com/esg/scores/

4. msci.py : This file collects ESG rating and Company name from the MSCI website. 
   https://www.msci.com/our-solutions/esg-investing/esg-ratings/

5. csrhub.py : This file collects CSR rating and Company name from the CSR Hub website.
    https://www.csrhub.com/search/name/

6. sustainanalytics.py : This file collects ESG risk rating, Company and Industry name
    from the SustainAnalytics website. https://www.sustainalytics.com/esg-ratings

7. yahoo.py : This file collects the following information from the Yahoo Finance website
    (https://finance.yahoo.com/lookup):
    ESG rating, company name, and financial metrics such as Market Cap, Trailing P/E, Return on Assets,
    Total Debt/Equity (mrq), Operating Cash Flow, Stock Price, Price/Book (mrq), Most Recent Quarter (mrq),
    Profit Margin, Operating Margin, Return on Equity, Diluted EPS, and Payout Ratio

8. rds_module.py : Use the RdsUploader class to upload the data to a SQL database. Inputs needed: DATABASE_TYPE,
    DBAPI, ENDPOINT, PORT, DATABASE, USERNAME, PASSWORD.
    Functions: create_table, send_query, read_table, add_rows, delete_row (a connection sketch follows the list)
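
The DATABASE_TYPE, DBAPI, ENDPOINT, PORT, DATABASE, USERNAME and PASSWORD inputs of RdsUploader are the usual
pieces of a SQLAlchemy connection URL. The sketch below shows one way such pieces can be combined to push a
scraped .csv into a table; it assumes SQLAlchemy and pandas are installed and is illustrative, not the
package's own implementation:

    import pandas as pd
    from sqlalchemy import create_engine

    # Example credentials only; substitute your own RDS details.
    DATABASE_TYPE, DBAPI = "postgresql", "psycopg2"
    ENDPOINT, PORT, DATABASE = "my-instance.rds.amazonaws.com", 5432, "esg"
    USERNAME, PASSWORD = "admin", "secret"

    engine = create_engine(
        f"{DATABASE_TYPE}+{DBAPI}://{USERNAME}:{PASSWORD}@{ENDPOINT}:{PORT}/{DATABASE}"
    )

    # Push the scraped output into a table, roughly what an add_rows step would do.
    df = pd.read_csv("esg_output.csv")
    df.to_sql("esg_scores", engine, if_exists="append", index=False)

All of the scraper modules above drive a Chrome browser through Selenium, which is why the Chromedriver path
is one of the inputs. The loop below is a generic, illustrative sketch rather than the package's actual code:
the URL and CSS selector are placeholders, and each real module has its own site-specific logic.

    import csv
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service
    from selenium.webdriver.common.by import By

    CHROMEDRIVER_PATH = "/path/to/chromedriver"       # the same path the scraper prompts for
    driver = webdriver.Chrome(service=Service(CHROMEDRIVER_PATH))

    companies = ["Apple", "Microsoft"]                # normally read from the input .csv
    with open("esg_output.csv", "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["company", "esg_score"])
        for name in companies:
            driver.get("https://example.com/search?q=" + name)           # placeholder URL
            score = driver.find_element(By.CSS_SELECTOR, ".score").text  # placeholder selector
            writer.writerow([name, score])            # the output .csv grows as results arrive

    driver.quit()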

Example dataset

An example dataset of the Forbes Global 2000 (2021) companies is provided with this package as Forbes.csv.
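
To use it as scraper input, point the input file prompt at Forbes.csv and give the header of its company-name
column. A quick way to check which header that is (assuming only that pandas is installed and the file is a
regular .csv):

    import pandas as pd

    forbes = pd.read_csv("Forbes.csv")
    print(forbes.columns.tolist())   # find the company-name header to pass to the scraper
    print(forbes.head())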


Things to note:

1. Enter a file path that includes the file name and the .csv extension.
2. Company names are extracted from each website so that the user can tally them against the company names in
    the input dataset (see the sketch after this list).
3. Check the documentation of each script for more information
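
A hedged sketch of such a tally using pandas; the file names and the column headers "Company" and "company"
are illustrative, not fixed by the package:

    import pandas as pd

    inp = pd.read_csv("companies.csv")     # input list; "Company" header is an example
    out = pd.read_csv("esg_output.csv")    # scraped output; "company" header is an example

    # Case-insensitive comparison of the names requested vs. the names found.
    requested = set(inp["Company"].str.lower())
    found = set(out["company"].str.lower())
    print("Missing from output:", sorted(requested - found))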

License: MIT © Shweta Yadav

Support: For any questions and suggestions, connect with me on LinkedIn: http://www.linkedin.com/in/shweta-yadav1

