DIY Atom feeds in times of social media and paywalls

Read the documentation at https://pyfeeds.readthedocs.io/en/latest/

Once upon a time, every website offered an RSS feed that kept readers up to date on new articles and blog posts through their feed readers. Those times are long gone. The once-iconic orange RSS icon has been replaced by “social share” buttons.

Feeds aims to bring back the good old reading times. It creates Atom feeds for websites that don’t offer them (anymore). It allows you to read new articles of your favorite websites in your feed reader (e.g. TinyTinyRSS) even if this is not officially supported by the website.

It can also enhance existing feeds by inlining the full content into each feed entry, so articles can be read without leaving the feed reader.

Feeds is based on Scrapy, a framework for extracting data from websites, and it’s easy to add support for new websites. Just take a look at the existing spiders and feel free to open a pull request!

Documentation

Feeds comes with extensive documentation. It is available at https://pyfeeds.readthedocs.io.

Supported Websites

Feeds is currently able to create full text Atom feeds for various sites. The complete list of supported websites is available in the documentation.

Content behind paywalls

Some sites (Falter, Konsument, LWN, Oberösterreichische Nachrichten, Übermedien) offer articles only behind a paywall. If you have a paid subscription, you can configure your username and password in feeds.cfg and also read paywalled articles from within your feed reader. For the less fortunate who don’t have a subscription, paywalled articles are tagged with paywalled so they can be filtered, if desired.

All feeds contain the articles in full text so you never have to leave your feed reader while reading.

Installation

Feeds is meant to be installed on your server and run periodically by cron or a similar job scheduler. We recommend installing Feeds inside a virtual environment.

Feeds can be installed from PyPI using pip:

$ pip install PyFeeds

You may also install the current development version. The master branch is considered stable enough for daily use:

$ pip install https://github.com/pyfeeds/pyfeeds/archive/master.tar.gz

After installation, the feeds command is available in your virtual environment.

Feeds supports Python 3.8+.
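
The recommended virtual environment setup could look roughly like the sketch below. The directory name in VENV_DIR and the helper functions run and install_feeds are assumptions made for illustration and are not part of Feeds; only python3 -m venv and pip install PyFeeds come from the standard tooling and the instructions above.

```shell
#!/bin/sh
# Hypothetical location of the virtual environment; any directory works.
VENV_DIR="${VENV_DIR:-$HOME/feeds-venv}"

# Helper (not part of Feeds): with DRY_RUN=1 a command is only printed,
# otherwise it is executed.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

install_feeds() {
    # Create the isolated environment.
    run python3 -m venv "$VENV_DIR" || return 1
    # Install PyFeeds from PyPI using the environment's own pip.
    run "$VENV_DIR/bin/pip" install PyFeeds || return 1
    # Confirm that the feeds command was installed.
    run "$VENV_DIR/bin/feeds" --help
}
```

Calling install_feeds performs the installation, while DRY_RUN=1 install_feeds only prints the three commands so they can be reviewed first.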

Quickstart

  • List all available spiders:

    $ feeds list

  • Feeds can crawl one or more spiders without any configuration, e.g.:

    $ feeds crawl tvthek.orf.at

  • A configuration file is supported too. Copy the template configuration, enable the spiders you are interested in, and set the output path where Feeds stores the generated Atom feeds:

    $ cp feeds.cfg.dist feeds.cfg
    $ $EDITOR feeds.cfg
    $ feeds --config feeds.cfg crawl

  • Point your feed reader to the generated Atom feeds and start reading. Feeds works best when run periodically in a cron job.

  • Run feeds --help or feeds <subcommand> --help for help and usage details.
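
Since Feeds works best when run periodically, the crawl command from the steps above can be scheduled with cron. The following is a minimal sketch that prints a suitable crontab line. The paths in VENV_DIR and FEEDS_CFG, the hourly schedule, and the cron_line helper are assumptions chosen for illustration; adjust them to your own setup.

```shell
#!/bin/sh
# Hypothetical locations; adjust them to your own setup.
VENV_DIR="${VENV_DIR:-$HOME/feeds-venv}"
FEEDS_CFG="${FEEDS_CFG:-$HOME/feeds.cfg}"

# Print a crontab line that crawls all configured spiders at minute 0
# of every hour. The schedule is an arbitrary example.
cron_line() {
    printf '0 * * * * %s/bin/feeds --config %s crawl\n' "$VENV_DIR" "$FEEDS_CFG"
}

cron_line
```

The printed line can be pasted into the editor opened by crontab -e. Absolute paths are used because cron jobs usually run with a minimal environment in which the virtual environment is not activated.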

Caching

Feeds caches HTTP responses by default to save bandwidth. Entries are cached for 90 days by default (this can be overridden in the config file). Outdated entries are purged from the cache automatically after a crawl. It’s also possible to purge outdated entries from the cache explicitly:

$ feeds --config feeds.cfg cleanup

How to contribute

Issues

Pull requests

  • Fork the project to your own GitHub account.

  • Create a topic branch and make your desired changes.

  • Open a pull request. Make sure the GitHub CI checks are passing.

Authors

Feeds is written and maintained by Florian Preinstorfer and Lukas Anzinger.

License

AGPL3, see https://pyfeeds.readthedocs.io/en/latest/license.html for details.
