Save an RSS or Atom feed to a SQLite database
Project description
feed-to-sqlite
Download an RSS or Atom feed and save it to a SQLite database. This is meant to work well with datasette.
Installation
pip install feed-to-sqlite
CLI Usage
Let's grab the Atom feeds for items I've shared on NewsBlur and for my Instapaper favorites, and save each to its own table.
feed-to-sqlite feeds.db http://chrisamico.newsblur.com/social/rss/35501/chrisamico https://www.instapaper.com/starred/rss/13475/qUh7yaOUGOSQeANThMyxXdYnho
This will use a SQLite database called feeds.db, creating it if necessary. By default, each feed gets its own table, named based on a slugified version of the feed's title.
To load all items from multiple feeds into a common (or pre-existing) table, pass a --table argument:
feed-to-sqlite feeds.db --table links <url> <url>
That will put all items in a table called links.
Each feed also creates an entry in a feeds table containing top-level metadata for each feed. Each item will have a foreign key to the originating feed. This is especially useful if combining feeds into a shared table.
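To see how item tables point back to the feeds table, you can inspect the database's foreign keys with Python's standard sqlite3 module. This is a minimal sketch: the helper name list_foreign_keys is hypothetical, and it makes no assumption about column names, since it reads them from SQLite's own schema metadata.

```python
import sqlite3


def list_foreign_keys(conn):
    """Return (table, column, referenced_table) for every foreign key found."""
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]
    keys = []
    for table in tables:
        # PRAGMA foreign_key_list yields rows shaped like:
        # (id, seq, referenced_table, from_column, to_column, ...)
        for fk in conn.execute(f'PRAGMA foreign_key_list("{table}")'):
            keys.append((table, fk[3], fk[2]))
    return keys
```

Calling list_foreign_keys(sqlite3.connect("feeds.db")) after an ingest should show each item table referencing feeds, which is handy for writing joins when several feeds share one table.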
Python API
One function, ingest_feed, does most of the work here. The following will create a database called feeds.db and download my NewsBlur shared items into a new table called links.
from feed_to_sqlite import ingest_feed
url = "http://chrisamico.newsblur.com/social/rss/35501/chrisamico"
ingest_feed("feeds.db", url=url, table_name="links")
Transforming data on ingest
When working in Python directly, it's possible to pass in a function to transform rows before they're saved to the database.
The normalize argument to ingest_feed is a function that will be called on each feed item, useful for fixing links or doing additional work.
Its signature is normalize(table, entry, feed_details, client):
- table is a SQLite table (from sqlite-utils)
- entry is one feed item, as a dictionary
- feed_details is a dictionary of top-level feed information
- client is an instance of httpx.Client, which can be used for outgoing HTTP requests during normalization
That function should return a dictionary representing the row to be saved. Returning a falsey value for a given row will cause that row to be skipped.
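As an illustration, here is a sketch of a normalize function that skips items without a link and strips utm_ tracking parameters from the rest. It assumes each entry carries its URL under a "link" key, which is not confirmed here; the strip_tracking helper and the choice of prefixes are hypothetical.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: query parameters with these prefixes are tracking noise.
TRACKING_PREFIXES = ("utm_",)


def strip_tracking(url):
    """Drop tracking query parameters from a URL, keeping everything else."""
    parts = urlsplit(url)
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not key.startswith(TRACKING_PREFIXES)
    ]
    return urlunsplit(parts._replace(query=urlencode(query)))


def normalize(table, entry, feed_details, client):
    """Return the row to save, or None to skip the item."""
    link = entry.get("link")  # assumed key name
    if not link:
        return None  # falsy return value: the row is skipped
    row = dict(entry)
    row["link"] = strip_tracking(link)
    return row
```

It would be passed along with the other arguments, for example ingest_feed("feeds.db", url=url, table_name="links", normalize=normalize). The table and client arguments go unused in this sketch; client is there for cases such as resolving redirects before saving.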
Development
Tests use pytest. Run pytest tests/ to run the test suite.
Download files
File details
Details for the file feed_to_sqlite-0.6.2.tar.gz.
File metadata
- Download URL: feed_to_sqlite-0.6.2.tar.gz
- Upload date:
- Size: 3.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: uv/0.8.14
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | a3156aac7a397a1af4c48528be34d74e406c9ed855af05a9b30ac4e9260148ee |
| MD5 | d67f320a686f9888781f3bdd92008b8c |
| BLAKE2b-256 | 88189c4f3c590f488b5e8a520ccd74710b878765569cb0f3fcfd203f48bcf163 |
File details
Details for the file feed_to_sqlite-0.6.2-py3-none-any.whl.
File metadata
- Download URL: feed_to_sqlite-0.6.2-py3-none-any.whl
- Upload date:
- Size: 5.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: uv/0.8.14
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 83b2adaf6b9c33167c45f6e174ad4368327abc6cc1d4df19c4693f22c92c04f7 |
| MD5 | aa47f28869c1f242ffa457b0ab207c01 |
| BLAKE2b-256 | 6fc8c6cac0aedd59600e5f684734e8b9948fc0dfcce917400b8c89c0131ee7b4 |