
An open source scraper to get current news.

Project description

OpenNews

An open source news scraper (soon to be an API)

Usage

import opennews

opennews.get_news()
# title="We spent 5 days testing the iPhone 13 to see if it's worth the upgrade" link='https://www.cnn.com/2021/09/21/cnn-underscored/apple-iphone-13-review/index.html' summary="If you're in the market for an iPhone and have an 11 or older, now is a really ideal time to upgrade." author=None published='Tue, 21 Sep 2021 13:00:53 GMT' published_parsed=[2021, 9, 21, 13, 0, 53, 1, 264, 0] tags=[] media_content=[Media(url='https://cdn.cnn.com/cnnnext/dam/assets/210920231929-3-iphone-13-underscored-review-super-169.jpg', medium='image', width='1100', height='619'), Media(url='https://cdn.cnn.com/cnnnext/dam/assets/210920231929-3-iphone-13-underscored-review-large-11.jpg', medium='image', width='300', height='300'), Media(url='https://cdn.cnn.com/cnnnext/dam/assets/210920231929-3-iphone-13-underscored-review-vertical-large-gallery.jpg', medium='image', width='414', height='552'), Media(url='https://cdn.cnn.com/cnnnext/dam/assets/210920231929-3-iphone-13-underscored-review-video-synd-2.jpg', medium='image', width='640', height='480'), Media(url='https://cdn.cnn.com/cnnnext/dam/assets/210920231929-3-iphone-13-underscored-review-live-video.jpg', medium='image', width='576', height='324'), Media(url='https://cdn.cnn.com/cnnnext/dam/assets/210920231929-3-iphone-13-underscored-review-t1-main.jpg', medium='image', width='250', height='250'), Media(url='https://cdn.cnn.com/cnnnext/dam/assets/210920231929-3-iphone-13-underscored-review-vertical-gallery.jpg', medium='image', width='270', height='360'), Media(url='https://cdn.cnn.com/cnnnext/dam/assets/210920231929-3-iphone-13-underscored-review-story-body.jpg', medium='image', width='300', height='169'), Media(url='https://cdn.cnn.com/cnnnext/dam/assets/210920231929-3-iphone-13-underscored-review-t1-main.jpg', medium='image', width='250', height='250'), Media(url='https://cdn.cnn.com/cnnnext/dam/assets/210920231929-3-iphone-13-underscored-review-assign.jpg', medium='image', width='248', height='186'), 
# Media(url='https://cdn.cnn.com/cnnnext/dam/assets/210920231929-3-iphone-13-underscored-review-hp-video.jpg', medium='image', width='256', height='144')]
# ... (a ton more)

An async variant is also available:

import opennews
import asyncio

asyncio.run(opennews.get_news_async())
# title="We spent 5 days testing the iPhone 13 to see if it's worth the upgrade" link='https://www.cnn.com/2021/09/21/cnn-underscored/apple-iphone-13-review/index.html' summary="If you're in the market for an iPhone and have an 11 or older, now is a really ideal time to upgrade." author=None published='Tue, 21 Sep 2021 13:00:53 GMT' published_parsed=[2021, 9, 21, 13, 0, 53, 1, 264, 0] tags=[] media_content=[Media(url='https://cdn.cnn.com/cnnnext/dam/assets/210920231929-3-iphone-13-underscored-review-super-169.jpg', medium='image', width='1100', height='619'), Media(url='https://cdn.cnn.com/cnnnext/dam/assets/210920231929-3-iphone-13-underscored-review-large-11.jpg', medium='image', width='300', height='300'), Media(url='https://cdn.cnn.com/cnnnext/dam/assets/210920231929-3-iphone-13-underscored-review-vertical-large-gallery.jpg', medium='image', width='414', height='552'), Media(url='https://cdn.cnn.com/cnnnext/dam/assets/210920231929-3-iphone-13-underscored-review-video-synd-2.jpg', medium='image', width='640', height='480'), Media(url='https://cdn.cnn.com/cnnnext/dam/assets/210920231929-3-iphone-13-underscored-review-live-video.jpg', medium='image', width='576', height='324'), Media(url='https://cdn.cnn.com/cnnnext/dam/assets/210920231929-3-iphone-13-underscored-review-t1-main.jpg', medium='image', width='250', height='250'), Media(url='https://cdn.cnn.com/cnnnext/dam/assets/210920231929-3-iphone-13-underscored-review-vertical-gallery.jpg', medium='image', width='270', height='360'), Media(url='https://cdn.cnn.com/cnnnext/dam/assets/210920231929-3-iphone-13-underscored-review-story-body.jpg', medium='image', width='300', height='169'), Media(url='https://cdn.cnn.com/cnnnext/dam/assets/210920231929-3-iphone-13-underscored-review-t1-main.jpg', medium='image', width='250', height='250'), Media(url='https://cdn.cnn.com/cnnnext/dam/assets/210920231929-3-iphone-13-underscored-review-assign.jpg', medium='image', width='248', height='186'), 
# Media(url='https://cdn.cnn.com/cnnnext/dam/assets/210920231929-3-iphone-13-underscored-review-hp-video.jpg', medium='image', width='256', height='144')]
# ... (a ton more)

The scraper currently supports only the following sources:

  • FOX
  • CNN
  • The Guardian
  • NBC
  • CBS
  • WSJ
  • The New York Times
  • Reuters
  • USA Today
  • Washington Post
  • Huffington Post
  • NPR
  • BBC
  • ED Gov
  • Science Daily
  • Nature
  • NASA Picture of the Day
  • WIRED
  • MacWorld
  • PC World
  • Animal of the Day
  • ABC Australia

To fetch news from just one of these sources, pass its name:

import opennews

opennews.get_news("cnn")

# This supports async too!

import asyncio

asyncio.run(opennews.get_news_async("cnn"))
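Because the async variant is a coroutine, several sources could in principle be fetched concurrently. The sketch below is one way to do that with `asyncio.gather`. The helper `fetch_many` is hypothetical (it is not part of opennews), and it takes the fetch coroutine as a parameter so it does not assume anything about what `get_news_async` returns. The source name "bbc" in the usage comment is an assumption based on the naming rule for sources.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, Iterable


async def fetch_many(
    fetch: Callable[[str], Awaitable[Any]],
    sources: Iterable[str],
) -> Dict[str, Any]:
    """Run fetch(source) for every source concurrently; map source -> result."""
    names = list(sources)
    # return_exceptions=True keeps one failing feed from discarding the rest:
    # a failed source maps to its exception object instead of a result.
    results = await asyncio.gather(
        *(fetch(name) for name in names), return_exceptions=True
    )
    return dict(zip(names, results))


# Intended use (assumes opennews is installed and the network is reachable):
#   import opennews
#   news = asyncio.run(fetch_many(opennews.get_news_async, ["cnn", "bbc"]))
```

Passing the coroutine function in, rather than importing opennews inside the helper, also makes it easy to try the helper with a stub before pointing it at real feeds.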

The source name is the name of the website, lowercased and with spaces removed. Under that rule, "The Guardian" would become "theguardian".

(From the code: rss_sources[source[0].lower().replace(" ", "")])
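As a quick illustration of that rule, the snippet below applies the same `lower()` and `replace(" ", "")` steps to a few display names from the list above. The function name `normalize_source_name` is hypothetical and is not part of opennews. Whether every resulting key is accepted depends on how the package names its sources internally, so treat the outputs as what the rule predicts rather than a confirmed list.

```python
def normalize_source_name(name: str) -> str:
    """Lowercase a display name and strip spaces, mirroring the quoted expression."""
    return name.lower().replace(" ", "")


for display_name in ["CNN", "The New York Times", "NASA Picture of the Day"]:
    print(f"{display_name!r} -> {normalize_source_name(display_name)!r}")
```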

License

This repository is licensed under the LGPL, as described in the LICENSE file.

Contributing

Please open a PR on GitHub if you want to contribute!

Todo

  • Add more sources
  • Make an API

Download files


Source Distribution

opennews-0.1.1.tar.gz (8.0 kB view details)

Uploaded Source

Built Distribution

opennews-0.1.1-py3-none-any.whl (8.3 kB view details)

Uploaded Python 3

File details

Details for the file opennews-0.1.1.tar.gz.

File metadata

  • Download URL: opennews-0.1.1.tar.gz
  • Upload date:
  • Size: 8.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.1.12 CPython/3.10.2 Linux/5.16.11-arch1-1

File hashes

Hashes for opennews-0.1.1.tar.gz
Algorithm     Hash digest
SHA256        2b317e56b20f5fe7f08840bd20c42506f12fae81cfa9303c5dee64d9f1f3879a
MD5           349c14bb3c6122e51e45dd1e00a5550a
BLAKE2b-256   ac981cb330e20bc86b1ccbd874b354dffcfe924fc937d44994bfa4dbf0bb59c5


File details

Details for the file opennews-0.1.1-py3-none-any.whl.

File metadata

  • Download URL: opennews-0.1.1-py3-none-any.whl
  • Upload date:
  • Size: 8.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.1.12 CPython/3.10.2 Linux/5.16.11-arch1-1

File hashes

Hashes for opennews-0.1.1-py3-none-any.whl
Algorithm     Hash digest
SHA256        11b4e7193f7e84aea8800ee96a1c82a9eb8a9db1774d8c91e0fddcb440b20e40
MD5           0fad50fbd7604b07a171c7968b6f61df
BLAKE2b-256   0464cbb0ec02294cb51eb5cfe541a86fb6a2600e49396465e8cfbd8ec9ff65b6

