Skip to main content

Provide an API to search for articles on Google News and returns a usable JSON response.

Project description

Contributors Forks Stargazers Issues MIT License Download LinkedIn


GNews

GNews 📰

A Happy and lightweight Python Package that Provides an API to search for articles on Google News and returns a usable JSON response! 🚀
If you like ❤️ GNews or find it useful 🌟, support the project by buying me a coffee ☕.
Buy Me A Coffee

🚀 View Demo · 🐞 Report Bug · 🚀 Request Feature

Table of Contents 📑
  1. About 🚩
  2. Getting Started 🚀
  3. Usage 🧩
  4. To Do 📋
  5. Roadmap 🛣️
  6. Contributing 🤝
  7. License ⚖️
  8. Contact 📬
  9. Acknowledgements 🙏

About GNews

🚩 GNews is A Happy and lightweight Python Package that searches Google News RSS Feed and returns a usable JSON response
🚩 As well as you can fetch full article (No need to write scrappers for articles fetching anymore)

Google News cover across 141+ countries with 41+ languages. On the bottom left side of the Google News page you may find a Language & region section where you can find all of the supported combinations.

Demo

GNews Demo

Getting Started

This is an example of how you may give instructions on setting up your project locally. To get a local copy up and running follow these simple example steps.

Installation

pip install gnews

Setup with Docker

Developing with docker

  1. Install docker and docker-compose.
  2. Set-up your .env environment placing the mongo db credentials.
  3. Run docker-compose up --build

Install using clone

  1. Clone this repository https://github.com/ranahaani/GNews.git
  2. Start your virtual environment virtualenv gnews
  3. Install the requirements with pip install -r requirements.txt

Example usage

from gnews import GNews

google_news = GNews()
pakistan_news = google_news.get_news('Pakistan')
print(pakistan_news[0])
[{
'publisher': 'Aljazeera.com',
 'description': 'Pakistan accuses India of stoking conflict in Indian Ocean  '
                'Aljazeera.com',
 'published date': 'Tue, 16 Feb 2021 11:50:43 GMT',
 'title': 'Pakistan accuses India of stoking conflict in Indian Ocean - '
          'Aljazeera.com',
 'url': 'https://www.aljazeera.com/news/2021/2/16/pakistan-accuses-india-of-nuclearizing-indian-ocean'
 },
 ...]

Get top news

  • GNews.get_top_news()

Get news by keyword

  • GNews.get_news(keyword)

Get news by major topic

  • GNews.get_news_by_topic(topic)
  • Available topics: WORLD, NATION, BUSINESS, TECHNOLOGY, ENTERTAINMENT, SPORTS, SCIENCE, HEALTH.

Get news by geo location

  • GNews.get_news_by_location(location)
  • location can be name of city/state/country

Get news by site

  • GNews.get_news_by_site(site)
  • site should be in the format of: "cnn.com"

Results specification

  • It's possible to pass proxy, country, language, period, start date, end date exclude websites and size during initialization
google_news = GNews(language='en', country='US', period='7d', start_date=None, end_date=None, max_results=10, exclude_websites=['yahoo.com', 'cnn.com'],
                    proxy=proxy)
  • Or change it to an existing object
google_news.period = '7d'  # News from last 7 days
google_news.max_results = 10  # number of responses across a keyword
google_news.country = 'United States'  # News from a specific country 
google_news.language = 'english'  # News in a specific language
google_news.exclude_websites = ['yahoo.com', 'cnn.com']  # Exclude news from specific website i.e Yahoo.com and CNN.com
google_news.start_date = (2020, 1, 1) # Search from 1st Jan 2020
google_news.end_date = (2020, 3, 1) # Search until 1st March 2020

The format of the timeframe is a string comprised of a number, followed by a letter representing the time operator. For example 1y would signify 1 year. Full list of operators below:

 - h = hours (eg: 12h)
 - d = days (eg: 7d)
 - m = months (eg: 6m)
 - y = years (eg: 1y)

Setting the start and end dates can be done by passing in either a datetime or a tuple in the form (YYYY, MM, DD).

Supported Countries

print(google_news.AVAILABLE_COUNTRIES)

{'Australia': 'AU', 'Botswana': 'BW', 'Canada ': 'CA', 'Ethiopia': 'ET', 'Ghana': 'GH', 'India ': 'IN',
 'Indonesia': 'ID', 'Ireland': 'IE', 'Israel ': 'IL', 'Kenya': 'KE', 'Latvia': 'LV', 'Malaysia': 'MY', 'Namibia': 'NA',
 'New Zealand': 'NZ', 'Nigeria': 'NG', 'Pakistan': 'PK', 'Philippines': 'PH', 'Singapore': 'SG', 'South Africa': 'ZA',
 'Tanzania': 'TZ', 'Uganda': 'UG', 'United Kingdom': 'GB', 'United States': 'US', 'Zimbabwe': 'ZW',
 'Czech Republic': 'CZ', 'Germany': 'DE', 'Austria': 'AT', 'Switzerland': 'CH', 'Argentina': 'AR', 'Chile': 'CL',
 'Colombia': 'CO', 'Cuba': 'CU', 'Mexico': 'MX', 'Peru': 'PE', 'Venezuela': 'VE', 'Belgium ': 'BE', 'France': 'FR',
 'Morocco': 'MA', 'Senegal': 'SN', 'Italy': 'IT', 'Lithuania': 'LT', 'Hungary': 'HU', 'Netherlands': 'NL',
 'Norway': 'NO', 'Poland': 'PL', 'Brazil': 'BR', 'Portugal': 'PT', 'Romania': 'RO', 'Slovakia': 'SK', 'Slovenia': 'SI',
 'Sweden': 'SE', 'Vietnam': 'VN', 'Turkey': 'TR', 'Greece': 'GR', 'Bulgaria': 'BG', 'Russia': 'RU', 'Ukraine ': 'UA',
 'Serbia': 'RS', 'United Arab Emirates': 'AE', 'Saudi Arabia': 'SA', 'Lebanon': 'LB', 'Egypt': 'EG',
 'Bangladesh': 'BD', 'Thailand': 'TH', 'China': 'CN', 'Taiwan': 'TW', 'Hong Kong': 'HK', 'Japan': 'JP',
 'Republic of Korea': 'KR'}

Supported Languages

print(google_news.AVAILABLE_LANGUAGES)

{'english': 'en', 'indonesian': 'id', 'czech': 'cs', 'german': 'de', 'spanish': 'es-419', 'french': 'fr',
 'italian': 'it', 'latvian': 'lv', 'lithuanian': 'lt', 'hungarian': 'hu', 'dutch': 'nl', 'norwegian': 'no',
 'polish': 'pl', 'portuguese brasil': 'pt-419', 'portuguese portugal': 'pt-150', 'romanian': 'ro', 'slovak': 'sk',
 'slovenian': 'sl', 'swedish': 'sv', 'vietnamese': 'vi', 'turkish': 'tr', 'greek': 'el', 'bulgarian': 'bg',
 'russian': 'ru', 'serbian': 'sr', 'ukrainian': 'uk', 'hebrew': 'he', 'arabic': 'ar', 'marathi': 'mr', 'hindi': 'hi',
 'bengali': 'bn', 'tamil': 'ta', 'telugu': 'te', 'malyalam': 'ml', 'thai': 'th', 'chinese simplified': 'zh-Hans',
 'chinese traditional': 'zh-Hant', 'japanese': 'ja', 'korean': 'ko'}

Article Properties

  • Get news returns the list with following keys: title, published_date, description, url, publisher.
Properties Description Example
title Title of the article IMF Staff and Pakistan Reach Staff-Level Agreement on the Pending Reviews Under the Extended Fund Facility
url Google news link to article Article Link
published date Published date Wed, 07 Jun 2017 07:01:30 GMT
description Short description of article IMF Staff and Pakistan Reach Staff-Level Agreement on the Pending Reviews Under the Extended Fund Facility ...
publisher Publisher of article The Guardian

Getting full article

  • To read a full article you can either:
    • Navigate to the url directly in your browser, or
    • Use newspaper3k library to scrape the article
  • The article url, needed for both methods, is accessed as article['url'].

Using newspaper3k

  1. Install the library - pip3 install newspaper3k.
  2. Use get_full_article method from GNews, that creates an newspaper.article.Article object from the url.
from gnews import GNews

google_news = GNews()
json_resp = google_news.get_news('Pakistan')
article = google_news.get_full_article(
    json_resp[0]['url'])  # newspaper3k instance, you can access newspaper3k all attributes in article

This new object contains title, text (full article) or images attributes. Examples:

article.title 

IMF Staff and Pakistan Reach Staff-Level Agreement on the Pending Reviews Under the Extended Fund Facility'

article.text 

End-of-Mission press releases include statements of IMF staff teams that convey preliminary findings after a mission. The views expressed are those of the IMF staff and do not necessarily represent the views of the IMF’s Executive Board.\n\nIMF staff and the Pakistani authorities have reached an agreement on a package of measures to complete second to fifth reviews of the authorities’ reform program supported by the IMF Extended Fund Facility (EFF) ..... (full article)

article.images

{'https://www.imf.org/~/media/Images/IMF/Live-Page/imf-live-rgb-h.ashx?la=en', 'https://www.imf.org/-/media/Images/IMF/Data/imf-logo-eng-sep2019-update.ashx', 'https://www.imf.org/-/media/Images/IMF/Data/imf-seal-shadow-sep2019-update.ashx', 'https://www.imf.org/-/media/Images/IMF/Social/TW-Thumb/twitter-seal.ashx', 'https://www.imf.org/assets/imf/images/footer/IMF_seal.png'}

article.authors

[]

Read full documentation for newspaper3k newspaper3k

Todo

  • Save to MongoDB
  • Save to SQLite
  • Save to JSON
  • Save to .CSV file
  • More than 100 articles

Roadmap

See the open issues for a list of proposed features (and known issues).

Contributing

Contributions are what make the open source community such an amazing place to be learn, inspire, and create. Any contributions you make are greatly appreciated.

  1. Fork the Project
  2. Create your Feature Branch (git checkout -b feature/AmazingFeature)
  3. Commit your Changes (git commit -m 'Add some AmazingFeature')
  4. Push to the Branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

License

Distributed under the MIT License. See LICENSE for more information.

Contact

Muhammad Abdullah - @ranahaani - ranahaani@gmail.com

Project Link: https://github.com/ranahaani/GNews

Project Link: https://github.com/ranahaani/GNews

"Buy Me A Coffee"

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

gnews-0.3.7.tar.gz (21.1 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

gnews-0.3.7-py3-none-any.whl (15.4 kB view details)

Uploaded Python 3

File details

Details for the file gnews-0.3.7.tar.gz.

File metadata

  • Download URL: gnews-0.3.7.tar.gz
  • Upload date:
  • Size: 21.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.0.0 CPython/3.12.3

File hashes

Hashes for gnews-0.3.7.tar.gz
Algorithm Hash digest
SHA256 08c1d1452e9b294d3f0a4ee09775ea8452869ed6d25c2ec0e9f112786d3fedd4
MD5 8d13a29ec1a0e10bdf3fe26ade191589
BLAKE2b-256 4aad62a0e16ca53ebf8eb7ee7f71344a2aec6fdf5a7ab33c9173b6cc629a091c

See more details on using hashes here.

File details

Details for the file gnews-0.3.7-py3-none-any.whl.

File metadata

  • Download URL: gnews-0.3.7-py3-none-any.whl
  • Upload date:
  • Size: 15.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.0.0 CPython/3.12.3

File hashes

Hashes for gnews-0.3.7-py3-none-any.whl
Algorithm Hash digest
SHA256 570dc0fc3537b140bc33f423f202356fc4d8250e300201f1c8c7f1b8bdc7e69b
MD5 c85e99d90b3a19ffa5f5941bb5376b7f
BLAKE2b-256 791ab53ef12ec52da4bba3d270d34ded74a476e2611b521396c096dbd075fdf1

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page