A tool for scraping bible verses from the web
Biblescrapeway
A scraping tool for pulling Bible verses from the web.
Basic Usage
Install with pip
$ pip3 install biblescrapeway
CLI
biblescrapeway comes with a simple CLI (bsw) to pull specific Bible passages:
$ bsw John3.16
You can also specify a translation (default is ESV):
$ bsw --translation KJV John3.16
Or, get multiple verses with comma delimiting:
$ bsw John3.16,1Peter3:8
Or, get a range of verses using a hyphen:
$ bsw John3.16-17
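To make the reference syntax above concrete, here is a minimal sketch of how a string such as John3.16,1Peter3:8 or John3.16-17 could be broken into individual verses. It is not the parser biblescrapeway uses; the names REFERENCE and parse_references are hypothetical, and the pattern only assumes the forms shown in the examples (a "." or ":" between chapter and verse, a hyphen for a range within one chapter).

```python
import re

# Hypothetical pattern (an assumption, not the library's own parser):
# optional leading digit as in "1Peter", book letters, chapter number,
# a "." or ":" separator, a verse, and an optional "-end" verse.
REFERENCE = re.compile(
    r"^\s*(?P<book>[1-3]?\s*[A-Za-z]+)\s*(?P<chapter>\d+)"
    r"[.:](?P<start>\d+)(?:-(?P<end>\d+))?\s*$"
)


def parse_references(text):
    """Expand a comma-delimited reference string into (book, chapter, verse) tuples."""
    verses = []
    for part in text.split(","):
        match = REFERENCE.match(part)
        if match is None:
            raise ValueError(f"unrecognized reference: {part!r}")
        book = match["book"].replace(" ", "")
        chapter = int(match["chapter"])
        start = int(match["start"])
        # A missing end verse means the reference names a single verse.
        end = int(match["end"]) if match["end"] else start
        if end < start:
            raise ValueError(f"range runs backwards: {part!r}")
        verses.extend((book, chapter, verse) for verse in range(start, end + 1))
    return verses
```

For example, parse_references("John3.16-17") yields one tuple for verse 16 and one for verse 17. Ranges that cross chapters are deliberately left out of this sketch.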
You can specify a formatting type with the --format/-f option, which exposes raw json:
$ bsw -f json John3.16
You can also set the --cache/--no-cache flag to cache the results of queries locally, so that they can just be looked up on repeated evaluations. By default, bsw uses --cache.
$ bsw --no-cache John3.16 # scrapes the verse from the web
$ bsw --no-cache John3.16 # scrapes the verse from the web again
$ bsw --cache John3.16 # scrapes the verse, then saves it locally at '~/.bsw_cache.json'
$ bsw --cache John3.16 # looks up the verse locally, does not re-scrape it
$ bsw --no-cache John3.16 # scrapes the verse from the web again
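The look-up-before-scrape behaviour shown above can be sketched with a small JSON file cache. This is only an illustration of the idea, not the code bsw runs: cached_lookup is a hypothetical helper, fetch stands in for the web scrape, and the layout of the real ~/.bsw_cache.json file is not assumed here.

```python
import json
from pathlib import Path


def cached_lookup(reference, fetch, cache_path):
    """Return fetch(reference), reusing a result already stored in a JSON file.

    `fetch` is any callable standing in for the web scrape (an assumption
    for this sketch); its result must be JSON-serializable.
    """
    cache_path = Path(cache_path)
    # Load the existing cache, or start empty if the file is not there yet.
    cache = json.loads(cache_path.read_text()) if cache_path.exists() else {}
    if reference not in cache:
        # Cache miss: do the slow work once, then persist it.
        cache[reference] = fetch(reference)
        cache_path.write_text(json.dumps(cache))
    return cache[reference]
```

Calling cached_lookup twice with the same reference and path invokes fetch only on the first call; the second call reads the stored result from disk.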
Programmatic
It is also possible to get full verse objects via Python, using the query function:
from biblescrapeway import query
verse = query("John 3:16", version = "NIV")[0]
verse.to_dict()
The function returns a list of scraper.Verse objects, each of which can be converted into a dict using the .to_dict() method. The resulting object has the following format:
{
"book" : "str | name of the bible book",
"chapter" : "int | chapter number",
"verse" : "int | verse number",
"version" : "str | bible version abbreviation",
"text" : "str | text content of the verse",
"footnotes" : [
{
"str_index" : "int | index in text string of footnote location",
"html" : "str | html of footnote content"
}
],
"crossrefs" : [
{
"str_index" : "int | index in text string of footnote location",
"ref_list" : "list | list of strings of cross referenced verses"
}
]
}
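As an illustration of how the str_index fields might be used, the sketch below inserts numbered markers into the verse text at each footnote position. The helper name annotate and the sample data are hypothetical, and the sketch assumes str_index is a plain character offset into text at which the marker belongs.

```python
def annotate(verse):
    """Insert [1], [2], ... into verse["text"] at each footnote's str_index."""
    text = verse["text"]
    # Number footnotes in reading order.
    ordered = sorted(verse["footnotes"], key=lambda note: note["str_index"])
    numbered = list(enumerate(ordered, start=1))
    # Insert from right to left so earlier offsets stay valid.
    for number, note in reversed(numbered):
        index = note["str_index"]
        text = text[:index] + f"[{number}]" + text[index:]
    return text


# Made-up data in the shape documented above (not real scraper output).
sample = {
    "text": "alpha beta gamma",
    "footnotes": [
        {"str_index": 5, "html": "<p>note one</p>"},
        {"str_index": 10, "html": "<p>note two</p>"},
    ],
}
print(annotate(sample))  # alpha[1] beta[2] gamma
```

The same approach would apply to crossrefs, which carry a str_index of their own.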
The caching functionality is also accessible from the query function as:
verse_list = query("John3.16", cache=True) # scrapes from the web
verse_list = query("John3.16", cache=True) # just looks the result up
Development
# Create the venv
python3 -m venv venv
./venv/bin/pip install -r requirements.txt
# install for development
./venv/bin/pip install --editable .
# Test
./scripts/run_tests.sh
# Build
./scripts/build.sh
# Deploy
twine upload dist/*
Known Bugs
TODO
- Add more than just bgw as the scraping backend
- More carefully handle formatting (unicode, text transforms, woj, etc).
- Add WAY more documentation, like some docstrings for the modules
- Add more unit tests
- expand cli?
- finish string_cleaner to convert special unicode characters into simpler characters
- standardize some of the naming: reference is inconsistently used to sometimes mean Range, and scrape is pretty overloaded.
- Decide how to handle 'Genesis 1:3-4:5,6': does that last one mean verse 6 or chapter 6?