A Python package to scrape data from GhanaWeb

GhanaWeb Scraper

A simple, unofficial Python package for scraping data from GhanaWeb. Affiliated with bank-of-ghana-fx-rates.

How to install

pip install ghanaweb-scraper

Warning: Do NOT run this package in online Jupyter notebooks, e.g. Google Colab.

GhanaWeb URLs:

urls = [
    "https://www.ghanaweb.com/GhanaHomePage/regional/",
    "https://www.ghanaweb.com/GhanaHomePage/editorial/",
    "https://www.ghanaweb.com/GhanaHomePage/health/",
    "https://www.ghanaweb.com/GhanaHomePage/diaspora/",
    "https://www.ghanaweb.com/GhanaHomePage/tabloid/",
    "https://www.ghanaweb.com/GhanaHomePage/africa/",
    "https://www.ghanaweb.com/GhanaHomePage/religion/",
    "https://www.ghanaweb.com/GhanaHomePage/NewsArchive/",
    "https://www.ghanaweb.com/GhanaHomePage/business/",
    "https://www.ghanaweb.com/GhanaHomePage/SportsArchive/",
    "https://www.ghanaweb.com/GhanaHomePage/entertainment/",
    "https://www.ghanaweb.com/GhanaHomePage/television/",
]
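
A URL list like this is easy to mistype or to repeat an entry in. The following is a minimal sketch, using only the standard library, of how such a list could be de-duplicated and how the category name could be read from each URL. The helper names `category_of` and `unique_urls` are hypothetical and are not part of ghanaweb-scraper.

```python
from urllib.parse import urlparse


def category_of(url: str) -> str:
    """Return the last non-empty path segment of a URL, e.g. 'health'."""
    segments = [s for s in urlparse(url).path.split("/") if s]
    return segments[-1] if segments else ""


def unique_urls(urls):
    """Drop repeated URLs while keeping the original order."""
    return list(dict.fromkeys(urls))


# Hypothetical helpers applied to a short sample list with one repeat.
sample = [
    "https://www.ghanaweb.com/GhanaHomePage/africa/",
    "https://www.ghanaweb.com/GhanaHomePage/health/",
    "https://www.ghanaweb.com/GhanaHomePage/africa/",
]
cleaned = unique_urls(sample)
categories = [category_of(u) for u in cleaned]
print(categories)
```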

Usage

from ghanaweb.scraper import GhanaWeb

url = 'https://www.ghanaweb.com/GhanaHomePage/politics/'
# url = 'https://www.ghanaweb.com/GhanaHomePage/health/'
# url = 'https://www.ghanaweb.com/GhanaHomePage/crime/'
# url = 'https://www.ghanaweb.com/GhanaHomePage/regional/'
# url = 'https://www.ghanaweb.com/GhanaHomePage/year-in-review/'

# web = GhanaWeb(url='https://www.ghanaweb.com/GhanaHomePage/politics/')
web = GhanaWeb(url=url)
# scrape data and save to `current working dir`
web.download(output_dir=None)
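
The usage above saves to the current working directory when `output_dir` is `None`. If you prefer a dedicated folder, one option is to prepare it yourself before calling `download`. The sketch below uses only `pathlib`; `prepare_output_dir` is a hypothetical helper, and whether the package creates missing directories on its own is not stated here, so the helper creates them up front.

```python
import tempfile
from pathlib import Path


def prepare_output_dir(output_dir=None) -> Path:
    """Return a directory that exists; fall back to the current working dir."""
    target = Path(output_dir) if output_dir is not None else Path.cwd()
    target.mkdir(parents=True, exist_ok=True)  # no error if it already exists
    return target


# Demonstration inside a temporary folder so nothing is left behind elsewhere.
base = Path(tempfile.mkdtemp())
out = prepare_output_dir(base / "ghanaweb" / "politics")
print(out)
```

The result could then be passed on as `web.download(output_dir=str(out))`. Converting to `str` is a cautious assumption, since the examples here only show string paths.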

Scrape a list of category pages from GhanaWeb

from ghanaweb.scraper import GhanaWeb

urls = [
        'https://www.ghanaweb.com/GhanaHomePage/politics/',
        'https://www.ghanaweb.com/GhanaHomePage/health/',
        'https://www.ghanaweb.com/GhanaHomePage/crime/',
        'https://www.ghanaweb.com/GhanaHomePage/regional/',
        'https://www.ghanaweb.com/GhanaHomePage/year-in-review/'
    ]

for url in urls:
    print(f"Downloading: {url}")
    web = GhanaWeb(url=url)
    # download to current working directory
    # if no location is specified
    # web.download(output_dir="/Users/tsiameh/Desktop/")
    web.download(output_dir=None)
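
In a loop like the one above, a single failing page stops the whole run. A possible variation, sketched below, records failures and continues, with a short pause between requests. `download_all` and its `make_scraper` and `pause_seconds` parameters are hypothetical and not part of the package; `FakeScraper` is a stand-in used only so the sketch runs without network access.

```python
import time


def download_all(urls, make_scraper, output_dir=None, pause_seconds=1.0):
    """Call .download() for each URL and return a list of (url, error) pairs."""
    failures = []
    for index, url in enumerate(urls):
        try:
            make_scraper(url).download(output_dir=output_dir)
        except Exception as exc:  # record the failure and keep going
            failures.append((url, repr(exc)))
        if index < len(urls) - 1:
            time.sleep(pause_seconds)  # be polite between requests
    return failures


class FakeScraper:
    """Stand-in with the same constructor/download shape as the examples."""

    def __init__(self, url):
        self.url = url

    def download(self, output_dir=None):
        if "crime" in self.url:
            raise RuntimeError("simulated network error")


failed = download_all(
    [
        "https://www.ghanaweb.com/GhanaHomePage/politics/",
        "https://www.ghanaweb.com/GhanaHomePage/crime/",
    ],
    FakeScraper,
    pause_seconds=0,
)
print(failed)
```

With the real class, the call would presumably look like `download_all(urls, lambda u: GhanaWeb(url=u))`.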

Scrape data from MyJoyOnline

from myjoyonline.scraper import MyJoyOnline

url = 'https://www.myjoyonline.com/news/'

print(f"Downloading data from: {url}")
joy = MyJoyOnline(url=url)
# download to current working directory
# if no location is specified
# joy.download(output_dir="/Users/tsiameh/Desktop/")
joy.download()

Scrape a list of pages from MyJoyOnline

from myjoyonline.scraper import MyJoyOnline

urls = [
        'https://www.myjoyonline.com/news/',
        'https://www.myjoyonline.com/entertainment/',
        'https://www.myjoyonline.com/business/',
        'https://www.myjoyonline.com/sports/',
        'https://www.myjoyonline.com/opinion/'
    ]

for url in urls:
    print(f"Downloading data from: {url}")
    joy = MyJoyOnline(url=url)
    # download to current working directory
    # if no location is specified
    # joy.download(output_dir="/Users/tsiameh/Desktop/")
    joy.download()
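
When one script handles both GhanaWeb and MyJoyOnline URLs, each URL has to go to the matching class. A minimal sketch of choosing by hostname follows. The mapping and the `scraper_name_for` function are hypothetical; it returns class names as strings so that the sketch runs without the package installed.

```python
from urllib.parse import urlparse

# Hypothetical lookup table: hostname -> name of the class used in the examples.
SCRAPER_FOR_HOST = {
    "www.ghanaweb.com": "GhanaWeb",
    "www.myjoyonline.com": "MyJoyOnline",
}


def scraper_name_for(url: str) -> str:
    """Return the scraper class name for a URL, or raise ValueError."""
    host = urlparse(url).hostname or ""
    try:
        return SCRAPER_FOR_HOST[host]
    except KeyError:
        raise ValueError(f"No scraper known for host: {host!r}") from None


for link in (
    "https://www.ghanaweb.com/GhanaHomePage/business/",
    "https://www.myjoyonline.com/sports/",
):
    print(link, "->", scraper_name_for(link))
```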

Credits

  • Theophilus Siameh (Twitter: tsiameh)
