Standalone Django-based feed aggregator.
What it does
Newspeak is a feed aggregator with advanced features for keyword filtering and link content extraction, implemented as a standalone Django application.
Architecture
Newspeak performs the following tasks (in order):
Fetch the specified RSS/Atom feeds as per the Feed model (in parallel).
Parse the feeds using feedparser.
(Optionally) apply per-feed inclusive/exclusive keyword filters to the title and/or summary, based on the KeywordFilter model.
(Optionally) extract summary data from the feed entry’s link URL with an XPath expression, using lxml.
(Optionally) extract enclosure information from the feed entry’s link URL with XPath expressions, using lxml.
Store the resulting feed information locally in a database.
Serve the aggregate of all the feed entries in a single RSS/Atom feed.
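The keyword filtering step can be pictured with a small sketch. The fields of the real KeywordFilter model are not described here, so the KeywordRule class, its field names and the matching semantics (case-insensitive substring match; an entry is dropped when any exclusive keyword matches, and when inclusive keywords exist at least one of them must match) are assumptions for illustration only.

```python
from dataclasses import dataclass


@dataclass
class KeywordRule:
    """Hypothetical stand-in for one KeywordFilter row (field names are assumed)."""
    keyword: str
    inclusive: bool = True      # True: keep matching entries; False: drop them
    match_title: bool = True    # look at the entry title
    match_summary: bool = True  # look at the entry summary

    def matches(self, entry: dict) -> bool:
        """Case-insensitive substring match on the selected fields."""
        needle = self.keyword.lower()
        fields = []
        if self.match_title:
            fields.append(entry.get("title", ""))
        if self.match_summary:
            fields.append(entry.get("summary", ""))
        return any(needle in text.lower() for text in fields)


def apply_rules(entries, rules):
    """Keep entries that hit no exclusive rule and, if any inclusive rules
    exist, hit at least one of them (assumed semantics)."""
    inclusive = [r for r in rules if r.inclusive]
    exclusive = [r for r in rules if not r.inclusive]
    kept = []
    for entry in entries:
        if any(r.matches(entry) for r in exclusive):
            continue
        if inclusive and not any(r.matches(entry) for r in inclusive):
            continue
        kept.append(entry)
    return kept


entries = [
    {"title": "Privacy law passes", "summary": "New data retention rules."},
    {"title": "Football results", "summary": "Weekend scores."},
    {"title": "Privacy rumours", "summary": "Sponsored content."},
]
rules = [
    KeywordRule("privacy", inclusive=True, match_summary=False),
    KeywordRule("sponsored", inclusive=False),
]
filtered_titles = [e["title"] for e in apply_rules(entries, rules)]
```

With these sample rules only the first entry survives: the second never mentions the inclusive keyword, and the third is rejected by the exclusive one.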
The flow of feed data through the application is roughly as follows (given some example feeds and keyword filters):
[Feed 1]-[Keyword filter 1]-[Keyword filter 2]-[XPath content extraction]-----------------------------`\
[Feed 2]--------------------[Keyword filter 3]-[XPath summary extraction]-[XPath content extraction ] -+--[Aggregate output feed]
[Feed 3]-[Keyword filter 3]-[Keyword filter 4]---------------------------------------------------------/
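The diagram can be read as one chain of stages per feed, with all chains merging into a single output. A minimal sketch of that shape follows; it is not Newspeak's actual code. The stage functions, the FEEDS table and the stubbed fetch function are hypothetical, and a standard-library thread pool stands in for the lightweight threads Newspeak uses.

```python
from concurrent.futures import ThreadPoolExecutor


def keyword(word):
    """Build a stage keeping only entries whose title contains `word`."""
    def stage(entries):
        return [e for e in entries if word in e["title"].lower()]
    return stage


def extract_content(entries):
    """Placeholder for XPath content extraction: only marks entries as extracted."""
    return [dict(e, content="extracted from " + e["link"]) for e in entries]


def fake_fetch(url):
    """Stub replacing the network fetch and feedparser step (assumption)."""
    sample = {
        "feed1": [{"title": "Privacy news", "link": "http://a/1", "date": 2},
                  {"title": "Sports", "link": "http://a/2", "date": 3}],
        "feed2": [{"title": "Privacy ruling", "link": "http://b/1", "date": 1}],
    }
    return sample[url]


# One stage chain per feed, mirroring the rows of the diagram.
FEEDS = {
    "feed1": [keyword("privacy"), extract_content],
    "feed2": [keyword("ruling")],
}


def process(url):
    """Fetch one feed and push its entries through that feed's stages."""
    entries = fake_fetch(url)
    for stage in FEEDS[url]:
        entries = stage(entries)
    return entries


def aggregate(threads=2):
    """Process all feeds in parallel and merge them, newest entry first."""
    with ThreadPoolExecutor(max_workers=threads) as pool:
        results = list(pool.map(process, FEEDS))
    merged = [entry for entries in results for entry in entries]
    return sorted(merged, key=lambda e: e["date"], reverse=True)


output = aggregate()
```

Keeping each stage as a plain function from a list of entries to a list of entries is what lets feeds mix and match filters and extraction steps, as the three rows of the diagram do.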
Installing
Getting started with Newspeak is really easy thanks to David Cramer’s awesome logan, a toolkit for making standalone Django apps. Simply perform the following steps:
Install in editable mode, so that you can easily code along:
pip install -e git+https://github.com/bitsoffreedom/newspeak.git#egg=newspeak
If you’re smart and like to keep your Python environment clean, do this in a virtualenv.
Initialize configuration in ~/.newspeak/newspeak.conf.py:
newspeak init
Perform (optional) configuration by editing the settings file. Because Newspeak is based on Django, all available Django settings can be used. Furthermore, there are some Newspeak-specific settings:
NEWSPEAK_THREADS: The number of (lightweight) threads used for crawling feed data.
NEWSPEAK_METADATA: Metadata used in the generated output feed.
For a more thorough description and an example of these settings, please have a look at the initial settings file generated in the previous step.
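As a rough illustration, the relevant part of the settings file might look like the fragment below. The values and the keys inside NEWSPEAK_METADATA are guesses made for illustration and are not taken from the generated file, which remains the authoritative example.

```python
# Hypothetical excerpt from ~/.newspeak/newspeak.conf.py.
# The generated settings file documents the real defaults; the values and
# the NEWSPEAK_METADATA keys below are assumptions.

# Number of (lightweight) threads used for crawling feed data.
NEWSPEAK_THREADS = 16

# Metadata used in the generated output feed (keys are assumed).
NEWSPEAK_METADATA = {
    "title": "My aggregated feed",
    "link": "http://example.com/",
    "description": "Entries collected from several sources.",
}

# Any regular Django setting can be set here too, for example:
TIME_ZONE = "Europe/Amsterdam"
```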
(Optionally) Run the tests:
newspeak test newspeak
This might take a while, so go fetch a cup of coffee. If something fails, please supply the output of the command newspeak test newspeak --traceback in an issue on GitHub.
Create the admin user and the SQLite database (switching to a proper database server is optional):
newspeak syncdb --migrate
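Because the settings file accepts regular Django settings, moving from SQLite to another database goes through Django's DATABASES setting, which would need to be in place before running the command above. The fragment below is a sketch for PostgreSQL: the database name, user, password and host are placeholders, and the engine path is the one used by Django versions that still ship the syncdb command (newer Django releases name it django.db.backends.postgresql).

```python
# Hypothetical addition to ~/.newspeak/newspeak.conf.py; all values are placeholders.
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql_psycopg2",
        "NAME": "newspeak",
        "USER": "newspeak",
        "PASSWORD": "change-me",
        "HOST": "localhost",
        "PORT": "",  # empty string selects the default port
    }
}
```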
Start the local webserver:
newspeak run_gunicorn
Open http://127.0.0.1:8000/admin/ in your browser and add a feed. Only the URL is required; the description and title are fetched automatically, along with the first set of entries.
(Optionally) Configure one or more keyword-based filters for your feed(s).
Make sure the following command gets executed to update the feeds:
newspeak update_feeds
(Optionally, add -v <1|2|3> to get more feedback on the process.)
Look at the pretty feeds: open http://127.0.0.1:8000/all/rss/ or http://127.0.0.1:8000/all/atom/ in your favorite feed reader. All input feeds will be aggregated there.
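To check the aggregate without a feed reader, a short script can list the entry titles. This is a sketch using only the standard library; it assumes the local web server from the earlier step is running and that the RSS output follows the usual channel/item layout. The function names are made up for this example.

```python
import urllib.request
import xml.etree.ElementTree as ET

FEED_URL = "http://127.0.0.1:8000/all/rss/"  # aggregate RSS feed mentioned above


def item_titles(rss_text):
    """Return the titles of all <item> elements in an RSS 2.0 document."""
    root = ET.fromstring(rss_text)
    return [item.findtext("title", default="") for item in root.iter("item")]


def fetch_titles(url=FEED_URL):
    """Download the feed and list its entry titles (needs the server running)."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return item_titles(response.read())


# Offline demonstration on a tiny hand-written feed.
SAMPLE = """<rss version="2.0"><channel><title>All</title>
<item><title>First entry</title></item>
<item><title>Second entry</title></item>
</channel></rss>"""
sample_titles = item_titles(SAMPLE)
```

Calling fetch_titles() against a running instance returns the titles of the aggregated entries; the offline sample only demonstrates the parsing.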
Alternatively, the original feeds, keywords and XPath expressions as used by Bits of Freedom are contained in a fixture called feeds_bof.json. This fixture can be loaded using:
newspeak loaddata feeds_bof
Set up a cron job to automatically update the feed data using the newspeak update_feeds command. For example, a cron job updating the feeds every hour could look as follows:
0 * * * * <full_path_to_>/newspeak update_feeds
Upgrading
Run the pip installation command again:
pip install -e git+https://github.com/bitsoffreedom/newspeak.git#egg=newspeak
(Optionally) Run the tests:
newspeak test newspeak
Apply any database migrations:
newspeak migrate