Save an RSS or Atom feed to a SQLite database
feed-to-sqlite
Download an RSS or Atom feed and save it to a SQLite database. This is meant to work well with datasette.
Installation
pip install feed-to-sqlite
CLI Usage
Let's grab the Atom feeds for items I've shared on NewsBlur and my Instapaper favorites, and save each to its own table.
feed-to-sqlite feeds.db http://chrisamico.newsblur.com/social/rss/35501/chrisamico https://www.instapaper.com/starred/rss/13475/qUh7yaOUGOSQeANThMyxXdYnho
This will use a SQLite database called feeds.db, creating it if necessary. By default, each feed gets its own table, named based on a slugified version of the feed's title.
To load all items from multiple feeds into a common (or pre-existing) table, pass a --table argument:
feed-to-sqlite feeds.db --table links <url> <url>
That will put all items in a table called links.
Each feed also creates an entry in a feeds table containing top-level metadata for each feed. Each item will have a foreign key to the originating feed. This is especially useful if combining feeds into a shared table.
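To see how items point back to the feeds table in a given database, you can inspect the schema directly. The following is a minimal sketch using only Python's standard sqlite3 module; the helper name foreign_keys is hypothetical and not part of feed-to-sqlite, and it makes no assumption about the column names the tool actually uses.

```python
import sqlite3


def foreign_keys(conn):
    """Map each table name to a list of (column, referenced_table, referenced_column).

    Hypothetical helper for inspection only; it is not part of feed-to-sqlite.
    """
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%'"
        )
    ]
    result = {}
    for table in tables:
        # Quote the table name so slugified names with unusual characters still work.
        quoted = '"' + table.replace('"', '""') + '"'
        # PRAGMA foreign_key_list columns: id, seq, table, from, to, on_update, on_delete, match
        rows = conn.execute(f"PRAGMA foreign_key_list({quoted})").fetchall()
        result[table] = [(r[3], r[2], r[4]) for r in rows]
    return result
```

Calling foreign_keys(sqlite3.connect("feeds.db")) after an ingest should list, for each item table, the column that references feeds.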
Python API
One function, ingest_feed, does most of the work here. The following will create a database called feeds.db and download my NewsBlur shared items into a new table called links.
from feed_to_sqlite import ingest_feed
url = "http://chrisamico.newsblur.com/social/rss/35501/chrisamico"
ingest_feed("feeds.db", url=url, table_name="links")
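To mirror the CLI's --table behavior from Python, one option is to call ingest_feed once per URL with the same table name. This is a sketch, not a documented part of the library: the wrapper ingest_many and its ingest parameter are hypothetical (the parameter exists only so the loop can be exercised without network access), and it assumes repeated calls append to an existing table, as the CLI description suggests.

```python
def ingest_many(db_path, urls, table_name, ingest=None):
    """Load every feed in `urls` into one shared table.

    Hypothetical wrapper; `ingest` defaults to feed_to_sqlite.ingest_feed.
    """
    if ingest is None:
        # Imported lazily so the wrapper can be tested with a stand-in function.
        from feed_to_sqlite import ingest_feed as ingest

    for url in urls:
        # Same call shape as the single-feed example: db path, url=, table_name=
        ingest(db_path, url=url, table_name=table_name)


# Example call (performs real HTTP requests when run):
# ingest_many(
#     "feeds.db",
#     ["http://chrisamico.newsblur.com/social/rss/35501/chrisamico"],
#     "links",
# )
```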
Transforming data on ingest
When working in Python directly, it's possible to pass in a function to transform rows before they're saved to the database.
The normalize argument to ingest_feed is a function that will be called on each feed item, useful for fixing links or doing additional work.
Its signature is normalize(table, entry, feed_details, client):
- table is a SQLite table (from sqlite-utils)
- entry is one feed item, as a dictionary
- feed_details is a dictionary of top-level feed information
- client is an instance of httpx.Client, which can be used for outgoing HTTP requests during normalization
That function should return a dictionary representing the row to be saved. Returning a falsey value for a given row will cause that row to be skipped.
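As an illustration of the contract described above, here is a minimal sketch of a normalize function that tidies links and skips unusable items. The function name clean_links is hypothetical, and the "link" key is an assumption about what an entry dictionary contains; check your own feed's entries before relying on it.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def clean_links(table, entry, feed_details, client):
    """Strip utm_* tracking parameters from an entry's link.

    Returns None (falsy) for entries without a link, so those rows are skipped.
    The "link" key is assumed, not guaranteed by feed-to-sqlite.
    """
    link = entry.get("link")
    if not link:
        return None  # falsy return value: this row is not saved

    parts = urlsplit(link)
    # Keep every query parameter except the utm_* tracking ones.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not key.startswith("utm_")
    ]

    row = dict(entry)  # copy so the original entry is left untouched
    row["link"] = urlunsplit(parts._replace(query=urlencode(query)))
    return row
```

It would then be passed along with the other arguments, for example ingest_feed("feeds.db", url=url, table_name="links", normalize=clean_links), assuming normalize is accepted as a keyword argument. This sketch ignores table, feed_details, and client; they are still part of the signature because the function is called with all four.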
Development
Tests use pytest. Run pytest tests/ to run the test suite.
File details
Details for the file feed-to-sqlite-0.5.1.tar.gz.
File metadata
- Download URL: feed-to-sqlite-0.5.1.tar.gz
- Upload date:
- Size: 4.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.4.1 importlib_metadata/3.7.3 pkginfo/1.7.0 requests/2.25.1 requests-toolbelt/0.9.1 tqdm/4.59.0 CPython/3.8.8
File hashes
Algorithm | Hash digest
---|---
SHA256 | f7ef6c4b3e95899b366d8ba3ba831f8fa5c0ee59b8cd81b2481800e6e5c4cefc
MD5 | 618e83ca3c99b71e23a296f287a04f2a
BLAKE2b-256 | dab8c41627e76dfa03bfd120874ad59003906a364b9e1f8f5727b49b8fa1168d
File details
Details for the file feed_to_sqlite-0.5.1-py3-none-any.whl.
File metadata
- Download URL: feed_to_sqlite-0.5.1-py3-none-any.whl
- Upload date:
- Size: 9.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.4.1 importlib_metadata/3.7.3 pkginfo/1.7.0 requests/2.25.1 requests-toolbelt/0.9.1 tqdm/4.59.0 CPython/3.8.8
File hashes
Algorithm | Hash digest
---|---
SHA256 | f32f50f88898866d41eb516ba90a4f072e21c6f1dbbbc40a6990620f07f76be7
MD5 | 1344f03208420a9058287a23a57fd165
BLAKE2b-256 | 747a4de9bbb4d4096e5f8986b2e2bc73ef96801e02a24f7db38a40bcbb5cf0c2