Skip to main content

A native Python2/3 reader module for the SWORD Project Bible Modules

Project description

A native Python reader of the SWORD Project Bible Modules

This project is not an official CrossWire project. It merely provides an alternative way to read the bible modules created by CrossWires SWORD project.

Features

  • Read SWORD bibles (not commentaries etc.)

  • Detection of locally installed bible modules.

  • Supports all known SWORD module formats (ztext, ztext4, rawtext, rawtext4)

  • Read from zipped modules, like those available from http://www.crosswire.org/sword/modules/ModDisp.jsp?modType=Bibles

  • Clean text of OSIS, GBF or ThML tags.

  • Supports both python 2 and 3 (tested with 2.7 and 3.5)

License

Since parts of the code is derived and/or copied from the SWORD project (see canons.py) which is GPL2, this code is also under the GPL2 license.

Installation

PySwords source code can be downloaded from PySwords release list, but it is also available from PyPI for install using pip or easy_install. It also available for ArchLinux (AUR), and will soon be available as a package in Debian and Fedora.

Example code

Use modules from default datapath

from pysword.modules import SwordModules
# Find available modules/bibles in standard data path.
# For non-standard data path, pass it as an argument to the SwordModules constructor.
modules = SwordModules()
# In this case we'll assume the modules found is something like:
# {'KJV': {'description': 'KingJamesVersion(1769)withStrongsNumbersandMorphology', 'encoding': 'UTF-8', ...}}
found_modules = modules.parse_modules()
bible = modules.get_bible_from_module('KJV')
# Get John chapter 3 verse 16
output = bible.get(books=['john'], chapters=[3], verses=[16])

Load module from zip-file

from pysword.modules import SwordModules
# Load module in zip
# NB: the zip content is only available as long as the SwordModules object exists
modules = SwordModules('KJV.zip')
# In this case the module found is:
# {'KJV': {'description': 'KingJamesVersion(1769)withStrongsNumbersandMorphology', 'encoding': 'UTF-8', ...}}
found_modules = modules.parse_modules()
bible = modules.get_bible_from_module('KJV')
# Get John chapter 3 verse 16
output = bible.get(books=['john'], chapters=[3], verses=[16])

Manually create bible

from pysword.bible import SwordBible
# Create the bible. The arguments are:
# SwordBible(<module path>, <module type>, <versification>, <encoding>, <text formatting>)
# Only the first is required, the rest have default values which should work in most cases.
bible = SwordBible('/home/me/.sword/modules/texts/ztext/kjv/', 'ztext', 'default', 'utf8', 'OSIS')
# Get John chapter 3 verse 16
output = bible.get(books=['john'], chapters=[3], verses=[16])

Run tests

To run the testsuite, first run the script that download the files used for testing, and then use nosetests to run the testsuite:

$ python tests/resources/download_bibles.py
$ nosetests -v tests/

The tests should run and pass using both python 2 and 3.

Contributing

If you want to contribute, you are most welcome to do so! Feel free to report issues and create merge request at https://gitlab.com/tgc-dk/pysword If you create a merge request please include a test the proves that your code actually works.

Module formats

I’ll use Python’s struct module’s format strings to describe byte formatting. See https://docs.python.org/3/library/struct.html

There are current 4 formats for bible modules in SWORD.

ztext format documentation

Take the Old Testament (OT) for example. Three files:

  • ot.bzv: Maps verses to character ranges in compressed buffers. 10 bytes (‘<IIH’) for each verse in the Bible:

    • buffer_num (I): which compressed buffer the verse is located in

    • verse_start (I): the location in the uncompressed buffer where the verse begins

    • verse_len (H): length of the verse, in uncompressed characters

These 10-byte records are densely packed, indexed by VerseKey ‘Indicies’ (docs later). So the record for the verse with index x starts at byte 10*x.

  • ot.bzs: Tells where the compressed buffers start and end. 12 bytes (‘<III’) for each compressed buffer:

    • offset (I): where the compressed buffer starts in the file

    • size (I): the length of the compressed data, in bytes

    • uc_size (I): the length of the uncompressed data, in bytes (unused)

These 12-byte records are densely packed, indexed by buffer_num (see previous). So the record for compressed buffer buffer_num starts at byte 12*buffer_num.

  • ot.bzz: Contains the compressed text. Read ‘size’ bytes starting at ‘offset’.

ztext4 format documentation

ztext4 is the same as ztext, except that in the bzv-file the verse_len is now represented by 4-byte integer (I), making the record 12 bytes in all.

rawtext format documentation

Again OT example. Two files:

  • ot.vss: Maps verses to character ranges in text file. 6 bytes (‘<IH’) for each verse in the Bible:

    • verse_start (I): the location in the textfile where the verse begins

    • verse_len (H): length of the verse, in characters

  • ot: Contains the text. Read ‘verse_len’ characters starting at ‘verse_start’.

rawtext4 format documentation

rawtext4 is the same as rawtext, except that in the vss-file the verse_len is now represented by 4-byte integer (I), making the record 8 bytes in all.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

pysword-0.2.2.tar.gz (23.0 kB view hashes)

Uploaded source

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page