Python parser for USFM files, based on tree-sitter-usfm3
USFM-Grammar
A Python library that facilitates:
- Parsing and validation of USFM files using tree-sitter-usfm3
- Conversion of USFM files to other formats (USX, dict, list etc.)
- Extraction of specific contents from USFM files, such as scripture alone (clean verses) or notes (footnotes, cross-refs)
Built on Python 3.10
Installation
pip install usfm-grammar
This requires a C compiler. On Windows, Microsoft Visual C++ 14.0 or above is required.
It is recommended that you update pip, setuptools and wheel.
Usage
By importing the library in Python code
from usfm_grammar import USFMParser, Filter
# input_usfm_str = open("sample.usfm","r", encoding='utf8').read()
input_usfm_str = '''
\\id GEN
\\c 1
\\p
\\v 1 test verse
'''
my_parser = USFMParser(input_usfm_str)
errors = my_parser.errors
print(errors)
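Since `errors` is available right after parsing, it can be checked before any conversion is attempted. The following is a minimal sketch of that guard, not part of the library. It assumes `errors` is empty (falsy) when the input parsed cleanly, which this page does not state explicitly. `to_dict_if_valid` and `StubParser` are hypothetical names; the stub only stands in for `USFMParser` so the sketch runs without the library installed.

```python
def to_dict_if_valid(parser):
    """Return parser.to_dict() only when the parser reported no errors."""
    # Assumption: an empty `errors` value means the USFM parsed cleanly.
    if parser.errors:
        raise ValueError(f"USFM input has errors: {parser.errors}")
    return parser.to_dict()


class StubParser:
    """Hypothetical stand-in for USFMParser, used only to exercise the guard."""

    def __init__(self, errors):
        self.errors = errors

    def to_dict(self):
        return {"converted": True}


# A parser with no errors is converted.
clean_result = to_dict_if_valid(StubParser(errors=[]))

# A parser with errors is rejected before conversion.
try:
    to_dict_if_valid(StubParser(errors=["error at line 3"]))
    rejected = False
except ValueError:
    rejected = True
```

With the real library, `my_parser` from the snippet above would be passed in place of the stub.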
To convert to USX
from lxml import etree
usx_elem = my_parser.to_usx() # default filter=ALL
print(etree.tostring(usx_elem, encoding="unicode", pretty_print=True))
To convert to Dict
output = my_parser.to_dict() # default all markers
#output = my_parser.to_dict([Filter.SCRIPTURE_TEXT])
#output = my_parser.to_dict([Filter.NOTES])
#output = my_parser.to_dict([Filter.NOTES, Filter.ATTRIBUTES])
#output = my_parser.to_dict([Filter.SCRIPTURE_TEXT, Filter.TITLES, Filter.PARAGRAPHS])
print(output)
To save as JSON
import json
dict_output = my_parser.to_dict()
with open("file_path.json", "w", encoding='utf-8') as fp:
    json.dump(dict_output, fp)
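Scripture text is often in non-Latin scripts, and by default `json.dump` escapes such characters as `\uXXXX` sequences. A possible variation, sketched here with a placeholder dict rather than real `to_dict()` output (whose structure may differ), passes `ensure_ascii=False` and `indent=2` so the saved file stays human-readable. `save_json` and the sample data are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def save_json(data, path):
    """Write data as UTF-8 JSON, keeping non-ASCII characters unescaped."""
    with open(path, "w", encoding="utf-8") as fp:
        json.dump(data, fp, ensure_ascii=False, indent=2)


# Placeholder standing in for my_parser.to_dict(); the real structure may differ.
sample = {"book": "GEN", "text": "आदि में"}

out_path = Path(tempfile.gettempdir()) / "usfm_sample.json"
save_json(sample, out_path)

# Read the file back to confirm it round-trips.
saved_text = out_path.read_text(encoding="utf-8")
reloaded = json.loads(saved_text)
```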
To convert to a list or table-like format
list_output = my_parser.to_list()
#list_output = my_parser.to_list([Filter.SCRIPTURE_TEXT])
table_output = "\n".join(["\t".join(row) for row in list_output])
print(table_output)
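Joining with `"\t"` works when every cell is a string that contains no tabs or newlines. As an alternative sketch, the standard `csv` module converts non-string cells and quotes cells containing the delimiter. The rows below are placeholders, not actual `to_list()` output, and `rows_to_tsv` is a hypothetical helper.

```python
import csv
import io


def rows_to_tsv(rows):
    """Serialize a list of rows to tab-separated text using the csv module."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


# Placeholder rows; the columns produced by to_list() may differ.
sample_rows = [
    ["Book", "Chapter", "Verse", "Text"],
    ["GEN", 1, 1, "test verse"],
]

tsv_text = rows_to_tsv(sample_rows)
print(tsv_text)
```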
From CLI
usage: usfm-grammar [-h] [--format {json,table,syntax-tree,usx,markdown}]
                    [--filter {book_headers,paragraphs,titles,scripture_text,notes,attributes,milestones,study_bible}]
                    [--csv_col_sep CSV_COL_SEP] [--csv_row_sep CSV_ROW_SEP]
                    infile

Uses the tree-sitter-usfm grammar to parse and convert USFM to Syntax-tree, JSON, CSV, USX etc.

positional arguments:
  infile                input usfm file

options:
  -h, --help            show this help message and exit
  --format {json,table,syntax-tree,usx,markdown}
                        output format
  --filter {book_headers,paragraphs,titles,scripture_text,notes,attributes,milestones,study_bible}
                        the type of contents to be included
  --csv_col_sep CSV_COL_SEP
                        column separator or delimiter. Only useful with format=table.
  --csv_row_sep CSV_ROW_SEP
                        row separator or delimiter. Only useful with format=table.
Example
$ python3 -m usfm_grammar sample.usfm --format usx
$ usfm-grammar sample.usfm --format usx
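The same command can also be driven from a script. The sketch below builds the argument list from the options shown in the help text and runs it with `subprocess`. `build_command` and `run_cli` are hypothetical helpers, and the sketch assumes the CLI writes the converted output to standard output, which the help text does not confirm.

```python
import subprocess
import sys


def build_command(infile, output_format="usx", filter_name=None):
    """Build the argument list for `python -m usfm_grammar`."""
    cmd = [sys.executable, "-m", "usfm_grammar", infile, "--format", output_format]
    if filter_name is not None:
        cmd += ["--filter", filter_name]
    return cmd


def run_cli(infile, output_format="usx", filter_name=None):
    """Run the CLI and return whatever it prints (assumed: the converted output)."""
    result = subprocess.run(
        build_command(infile, output_format, filter_name),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout


# Building the command does not require the package to be installed.
command = build_command("sample.usfm", "json", "scripture_text")
```

Calling `run_cli("sample.usfm")` requires usfm-grammar to be installed in the interpreter that runs the script.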