A tool for scraping bible verses from the web

Biblescrapeway

A scraping tool for pulling bible verses from the web.

Basic Usage

Install with pip

 $ pip3 install biblescrapeway

CLI

biblescrapeway comes with a simple CLI (bsw) for pulling specific bible passages:

 $ bsw John3.16

You can also specify a translation (default is ESV):

 $ bsw --translation KJV John3.16

Or, get multiple verses with comma delimiting:

 $ bsw John3.16,1Peter3:8

Or, get a range of verses using a hyphen:

 $ bsw John3.16-17

You can specify an output format with the --format/-f option; the json format exposes the raw JSON:

 $ bsw -f json John3.16

You can also set the --cache/--no-cache flag to cache query results locally, so that repeated queries are looked up instead of scraped again. By default, bsw uses --no-cache.

 $ bsw John3.16         # scrapes the verse from the web
 $ bsw John3.16         # scrapes the verse from the web again
 $ bsw --cache John3.16 # scrapes the verse, then saves it locally at '~/.bsw_cache.json'
 $ bsw --cache John3.16 # looks up the verse locally, does not re-scrape it
 $ bsw John3.16         # scrapes the verse from the web again
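
The caching behaviour described above can be pictured as a small JSON-backed lookup. The following is a minimal sketch, not biblescrapeway's actual implementation: cached_lookup, fake_fetch, and the cache layout (a flat JSON object keyed by the reference string) are assumptions made for illustration, and a temporary file stands in for '~/.bsw_cache.json'.

```python
import json
import os
import tempfile


def cached_lookup(reference, fetch, cache_path, use_cache=False):
    """Return fetch(reference), consulting a JSON file first when use_cache is set."""
    if not use_cache:
        # Mirrors --no-cache: always fetch, never read or write the file.
        return fetch(reference)

    cache = {}
    if os.path.exists(cache_path):
        with open(cache_path, encoding="utf-8") as fh:
            cache = json.load(fh)

    if reference not in cache:
        # Cache miss: fetch once, then persist the result for later runs.
        cache[reference] = fetch(reference)
        with open(cache_path, "w", encoding="utf-8") as fh:
            json.dump(cache, fh)
    return cache[reference]


calls = []  # records every time the (hypothetical) scraper is invoked


def fake_fetch(reference):
    """Stand-in for a real scraper; returns placeholder data only."""
    calls.append(reference)
    return {"reference": reference, "text": "<verse text>"}


cache_file = os.path.join(tempfile.mkdtemp(), "bsw_cache.json")
cached_lookup("John3.16", fake_fetch, cache_file)                  # fetches
cached_lookup("John3.16", fake_fetch, cache_file, use_cache=True)  # fetches, then saves
cached_lookup("John3.16", fake_fetch, cache_file, use_cache=True)  # read from the file
print(len(calls))
```

In this sketch the stand-in scraper is called only twice across the three lookups, matching the pattern in the shell session above: an uncached call always fetches, and a cached call fetches only on a miss.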

Programmatic

It is also possible to get full verse objects from Python, using the query function:

from biblescrapeway import query
verse = query("John 3:16", version = "NIV")[0]
verse.to_dict()

The function returns a list of scraper.Verse objects, each of which can be converted into a dict using the .to_dict() method. The resulting dict has the following format:

{
    "book"    : "str | name of the bible book",
    "chapter" : "int | chapter number",
    "verse"   : "int | verse number",
    "version" : "str | bible version abbreviation",
    "text"    : "str | text content of the verse",
    "footnotes" : [
        {
            "str_index" : "int | index in text string of footnote location",
            "html"      : "str | html of footnote content"
        }
    ],
    "crossrefs" : [
        {
            "str_index" : "int  | index in text string of cross reference location",
            "ref_list"  : "list | list of strings of cross referenced verses"
        }
    ]
}
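
As one way to consume this structure, the sketch below inserts numbered markers into the verse text at each footnote position. It makes two assumptions that the format description does not confirm: that str_index is a 0-based character offset into text, and that the marker belongs immediately before that offset. The annotate helper and the sample dict are made up for illustration and contain no real verse data.

```python
def annotate(verse_dict):
    """Insert [n] markers into the verse text at each footnote's str_index."""
    text = verse_dict["text"]
    footnotes = sorted(verse_dict["footnotes"], key=lambda f: f["str_index"])
    # Work from the last footnote backwards so earlier offsets stay valid.
    for number, note in reversed(list(enumerate(footnotes, start=1))):
        i = note["str_index"]
        text = text[:i] + f"[{number}]" + text[i:]
    return text


# Placeholder data in the documented shape (not scraped output).
sample = {
    "book": "Example",
    "chapter": 1,
    "verse": 1,
    "version": "XYZ",
    "text": "alpha beta gamma",
    "footnotes": [
        {"str_index": 5, "html": "<p>note on alpha</p>"},
        {"str_index": 10, "html": "<p>note on beta</p>"},
    ],
    "crossrefs": [],
}

annotated = annotate(sample)
print(annotated)  # alpha[1] beta[2] gamma
```

The same loop could be applied to the crossrefs list, since its entries carry a str_index as well.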

The caching functionality is also accessible from the query function as:

verse_list = query("John3.16", cache=True) # scrapes from the web
verse_list = query("John3.16", cache=True) # just looks the result up

Development

# Create the venv
python3 -m venv venv
./venv/bin/pip install -r requirements.txt

# install for development
./venv/bin/pip install --editable .

# Test
./scripts/run_tests.sh

# Build
./scripts/build.sh

# Deploy
twine upload dist/*

Known Bugs

TODO

  • Add WAY more documentation, such as docstrings for the modules
  • Add more unit tests
  • Expand the CLI?
  • Finish string_cleaner to convert special unicode characters into simpler characters
  • Standardize some of the naming: reference is used inconsistently and sometimes means Range; also, scrap is pretty overloaded
  • Decide how to handle 'Genesis 1:3-4:5,6': does the trailing 6 mean verse 6 or chapter 6?
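
To make the ambiguity in the last item concrete, here is a small sketch that spells out both readings of a bare number after the final comma. It is not part of biblescrapeway: trailing_readings is a hypothetical helper, and it assumes references written with a ':' between chapter and verse.

```python
import re


def trailing_readings(reference):
    """Return both readings of a bare number that follows the final comma."""
    head, _, tail = reference.rpartition(",")
    # Book name, optionally prefixed by a digit (as in "1Peter").
    book = re.match(r"\s*((?:\d\s*)?[A-Za-z]+)", head).group(1)
    # Chapter of the last chapter:verse pair that appears before the comma.
    last_chapter = re.findall(r"(\d+):\d+", head)[-1]
    number = tail.strip()
    return {
        "verse": f"{book} {last_chapter}:{number}",  # 6 continues chapter 4
        "chapter": f"{book} {number}",               # 6 starts a new chapter
    }


readings = trailing_readings("Genesis 1:3-4:5,6")
print(readings["verse"])    # Genesis 4:6
print(readings["chapter"])  # Genesis 6
```

Either reading is defensible, so whichever the parser adopts would need to be documented.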

Source Distribution

biblescrapeway-0.3.1.tar.gz (13.3 kB)
