spiegel-scraper
Scrape article metadata and comments from DER SPIEGEL
Setup
pip install spiegel-scraper
Usage
from datetime import date
import spiegel_scraper as spon
# list all articles from 2020-01-31
archive_entries = spon.archive.by_date(date(2020, 1, 31))
# or, for later replication, retrieve and scrape the html instead
archive_html = spon.archive.html_by_date(date(2020, 1, 31))
archive_entries_from_html = spon.archive.scrape_html(archive_html)
# fetch one article by url
article_url = archive_entries[0]['url']
article = spon.article.by_url(article_url)
# or, alternatively, scrape from the html
article_html = spon.article.html_by_url(article_url)
article_from_html = spon.article.scrape_html(article_html)
# retrieve all comments for an article
comments = spon.comments.by_article_id(article['id'])
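The html_by_date / scrape_html split above exists so a scrape can be replicated later from saved HTML without hitting the site again. A minimal sketch of that workflow; the save_html and load_html helpers and the file path are illustrative, not part of the library:

```python
from pathlib import Path

def save_html(html: str, path: str) -> None:
    """Persist raw HTML so the scrape can be replicated later."""
    Path(path).write_text(html, encoding="utf-8")

def load_html(path: str) -> str:
    """Read previously saved HTML back for offline re-scraping."""
    return Path(path).read_text(encoding="utf-8")

# Hypothetical usage with the API shown above:
#   html = spon.archive.html_by_date(date(2020, 1, 31))
#   save_html(html, "archive-2020-01-31.html")
#   entries = spon.archive.scrape_html(load_html("archive-2020-01-31.html"))
```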
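To collect articles over a period rather than a single day, the by_date call above can be wrapped in a date loop. The daterange helper below is a hypothetical addition (not part of the library), and the commented usage assumes only the spon.archive.by_date API shown above:

```python
from datetime import date, timedelta

def daterange(start: date, end: date):
    """Yield each date from start through end, inclusive."""
    d = start
    while d <= end:
        yield d
        d += timedelta(days=1)

# Hypothetical usage with the API shown above:
#   entries = [e for day in daterange(date(2020, 1, 1), date(2020, 1, 7))
#              for e in spon.archive.by_date(day)]
```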