A Python package to scrape data from GhanaWeb

Project description

GhanaWeb Scraper

A simple, unofficial Python package for scraping data from GhanaWeb. Affiliated with bank-of-ghana-fx-rates.

How to install

pip install ghanaweb-scraper

Warning: DO NOT RUN IN ONLINE JUPYTER NOTEBOOKS, e.g. Google Colab.
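The README does not say why hosted notebooks should be avoided, but a script can refuse to start in one. The following is a minimal sketch, not part of ghanaweb-scraper: the helper names `running_in_colab` and `ensure_local_run` are hypothetical, and checking `sys.modules` for `google.colab` is a common heuristic for detecting Colab rather than a guaranteed test. It does not detect other hosted notebook services.

```python
import sys


def running_in_colab() -> bool:
    """Return True when the Google Colab runtime package has been loaded."""
    # Heuristic: Colab sessions normally have `google.colab` imported,
    # so its presence in sys.modules is a reasonable (not guaranteed) signal.
    return "google.colab" in sys.modules


def ensure_local_run() -> None:
    """Raise before scraping if the script appears to run inside Colab."""
    if running_in_colab():
        raise RuntimeError("Run this scraper locally, not in Google Colab.")
```

Calling `ensure_local_run()` at the top of a scraping script stops it early instead of failing partway through a download.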

GhanaWeb URLs:

# Each entry needs a trailing comma; without it Python silently
# concatenates adjacent string literals into one long string.
urls = [
    "https://www.ghanaweb.com/GhanaHomePage/regional/",
    "https://www.ghanaweb.com/GhanaHomePage/editorial/",
    "https://www.ghanaweb.com/GhanaHomePage/health/",
    "https://www.ghanaweb.com/GhanaHomePage/diaspora/",
    "https://www.ghanaweb.com/GhanaHomePage/tabloid/",
    "https://www.ghanaweb.com/GhanaHomePage/africa/",
    "https://www.ghanaweb.com/GhanaHomePage/religion/",
    "https://www.ghanaweb.com/GhanaHomePage/NewsArchive/",
    "https://www.ghanaweb.com/GhanaHomePage/business/",
    "https://www.ghanaweb.com/GhanaHomePage/SportsArchive/",
    "https://www.ghanaweb.com/GhanaHomePage/entertainment/",
    "https://www.ghanaweb.com/GhanaHomePage/television/",
]
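All the category pages listed above share one base path, so the list can also be built from category names. This is a small sketch under that assumption; `BASE_URL`, `category_url`, and `unique_urls` are hypothetical helpers, not part of the package.

```python
from typing import Iterable, List

# Assumption: every category page lives directly under this base path,
# as is the case for the URLs listed in this README.
BASE_URL = "https://www.ghanaweb.com/GhanaHomePage/"


def category_url(category: str) -> str:
    """Build a category page URL, e.g. 'health' -> '.../GhanaHomePage/health/'."""
    return f"{BASE_URL}{category.strip('/')}/"


def unique_urls(categories: Iterable[str]) -> List[str]:
    """Build URLs for all categories, dropping repeats but keeping order."""
    # dict.fromkeys keeps the first occurrence of each key in insertion order.
    return list(dict.fromkeys(category_url(c) for c in categories))
```

For example, `unique_urls(["africa", "health", "africa"])` yields the `africa/` URL followed by the `health/` URL, so a category entered twice is only scraped once.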

Usage

from ghanaweb.scraper import GhanaWeb

url = 'https://www.ghanaweb.com/GhanaHomePage/politics/'
# url = 'https://www.ghanaweb.com/GhanaHomePage/health/'
# url = 'https://www.ghanaweb.com/GhanaHomePage/crime/'
# url = 'https://www.ghanaweb.com/GhanaHomePage/regional/'
# url = 'https://www.ghanaweb.com/GhanaHomePage/year-in-review/'

# web = GhanaWeb(url='https://www.ghanaweb.com/GhanaHomePage/politics/')
web = GhanaWeb(url=url)
# scrape data and save to `current working dir`
web.download(output_dir=None)
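To save somewhere other than the current working directory, pass a path as `output_dir`. The README does not say whether the package creates a missing directory, so the sketch below creates it first. `prepare_output_dir` is a hypothetical helper, and the folder name in the usage comment is only an illustration.

```python
from pathlib import Path
from typing import Union


def prepare_output_dir(path: Union[str, Path]) -> str:
    """Create the directory (and any parents) if needed and return it as a string."""
    directory = Path(path).expanduser()
    # exist_ok=True makes repeated runs safe; parents=True allows nested paths.
    directory.mkdir(parents=True, exist_ok=True)
    return str(directory)


# Usage with the scraper shown above (requires ghanaweb-scraper to be installed):
# from ghanaweb.scraper import GhanaWeb
# web = GhanaWeb(url="https://www.ghanaweb.com/GhanaHomePage/politics/")
# web.download(output_dir=prepare_output_dir("data/politics"))
```

The helper returns a string because the README's own examples pass `output_dir` as a string path.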

Scrape a list of category pages from GhanaWeb

from ghanaweb.scraper import GhanaWeb

urls = [
    'https://www.ghanaweb.com/GhanaHomePage/politics/',
    'https://www.ghanaweb.com/GhanaHomePage/health/',
    'https://www.ghanaweb.com/GhanaHomePage/crime/',
    'https://www.ghanaweb.com/GhanaHomePage/regional/',
    'https://www.ghanaweb.com/GhanaHomePage/year-in-review/',
]

for url in urls:
    print(f"Downloading: {url}")
    web = GhanaWeb(url=url)
    # download to current working directory
    # if no location is specified
    # web.download(output_dir="/Users/tsiameh/Desktop/")
    web.download(output_dir=None)
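In the loop above, an exception raised for one URL stops the remaining downloads. A possible variation, sketched here, records failures and carries on. `download_all` is a hypothetical helper, not part of the package; it relies only on the constructor keyword `url=` and the `download(output_dir=...)` call shown in this README, and takes the scraper class as an argument so it does not import the package itself.

```python
from typing import Callable, Dict, Iterable, Optional


def download_all(
    urls: Iterable[str],
    scraper_cls: Callable[..., object],
    output_dir: Optional[str] = None,
) -> Dict[str, Optional[Exception]]:
    """Download every URL; map each URL to None on success or to its exception."""
    results: Dict[str, Optional[Exception]] = {}
    for url in urls:
        print(f"Downloading: {url}")
        try:
            # Same two calls as in the README: construct with url=, then download.
            scraper_cls(url=url).download(output_dir=output_dir)
            results[url] = None
        except Exception as exc:  # keep going so one bad page does not stop the run
            results[url] = exc
    return results


# Usage (requires ghanaweb-scraper to be installed):
# from ghanaweb.scraper import GhanaWeb
# failures = {u: e for u, e in download_all(urls, GhanaWeb).items() if e is not None}
```

Returning the exceptions instead of printing them leaves it to the caller to retry, log, or ignore the failed URLs.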

Scrape data from MyJoyOnline

from myjoyonline.scraper import MyJoyOnline

# no trailing comma here: it would turn `url` into a one-element tuple
url = 'https://www.myjoyonline.com/news/'

print(f"Downloading data from: {url}")
joy = MyJoyOnline(url=url)
# download to current working directory
# if no location is specified
# joy.download(output_dir="/Users/tsiameh/Desktop/")
joy.download()

Scrape a list of category pages from MyJoyOnline

from myjoyonline.scraper import MyJoyOnline

urls = [
    'https://www.myjoyonline.com/news/',
    'https://www.myjoyonline.com/entertainment/',
    'https://www.myjoyonline.com/business/',
    'https://www.myjoyonline.com/sports/',
    'https://www.myjoyonline.com/opinion/',
]

for url in urls:
    print(f"Downloading data from: {url}")
    joy = MyJoyOnline(url=url)
    # download to current working directory
    # if no location is specified
    # joy.download(output_dir="/Users/tsiameh/Desktop/")
    joy.download()

Credits

  • Theophilus Siameh (Twitter: tsiameh)

Download files

Source Distribution

ghanaweb-scraper-1.0.2.tar.gz (4.1 kB view details)

Uploaded Source

Built Distribution

ghanaweb_scraper-1.0.2-py3-none-any.whl (5.8 kB view details)

Uploaded Python 3

File details

Details for the file ghanaweb-scraper-1.0.2.tar.gz.

File metadata

  • Download URL: ghanaweb-scraper-1.0.2.tar.gz
  • Size: 4.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.11.1

File hashes

Hashes for ghanaweb-scraper-1.0.2.tar.gz

Algorithm     Hash digest
SHA256        a27931e95deb115bdc255e9d77a778628debf06b818b70509d12d34382efa3dd
MD5           f6fcd2cbf91e78965363fcce052b12b4
BLAKE2b-256   9e583bf9643febbf1bfcec7e574c3fb14afd8bf7e07d03ea1f0863e6391d6b5b

File details

Details for the file ghanaweb_scraper-1.0.2-py3-none-any.whl.

File hashes

Hashes for ghanaweb_scraper-1.0.2-py3-none-any.whl

Algorithm     Hash digest
SHA256        08550c2602785fe30d5ce253b978173ced00040d1160da6e94ee6acdb93054ec
MD5           45effafa969a65198f84cfc091d6ff66
BLAKE2b-256   0db203998f1f2a85a47c0cf2793cf284e53dc939ed1d7e61ef612819568664a6
