A collection of Orange3 widgets to process TEI-XML files
orange3-teixml
This provides a collection of widgets for processing TEI-XML documents.
Installation
In Orange's Add-ons installer, click "Add more..." and type in orange3-teixml.
Widgets
- TEI Token Extractor: Parses the TEI-XML files in a directory, extracts the words together with their annotations (such as part of speech), and counts the occurrences. It then takes the N most frequent words across all the documents and filters the results down to those words.
⚠️ If you load TEI-XML documents from different sources, it is very likely that the annotation schemes are different and the tokens won't match. In my examples below, "An abridgement of the English military discipline," was loaded from a different source.
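The extraction and filtering steps described for the TEI Token Extractor can be sketched roughly as follows. This is only an illustration under stated assumptions, not the widget's actual code: it assumes tokens are marked up as `<w>` elements in the TEI namespace and carry their annotation in a `pos` attribute, and the function names `count_tokens`, `top_n_table`, and `count_directory` are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from pathlib import Path

# Assumption: documents use the standard TEI namespace.
TEI_NS = "{http://www.tei-c.org/ns/1.0}"


def count_tokens(root, attribute="pos"):
    """Count (word, annotation) pairs for every <w> element under root."""
    counts = Counter()
    for w in root.iter(TEI_NS + "w"):
        word = "".join(w.itertext()).strip().lower()
        if word:
            # A missing annotation is recorded as an empty string.
            counts[(word, w.get(attribute, ""))] += 1
    return counts


def top_n_table(per_document, n):
    """Keep only the n tokens that are most frequent across all documents."""
    total = Counter()
    for counts in per_document.values():
        total.update(counts)
    top = [token for token, _ in total.most_common(n)]
    return {
        name: {token: counts.get(token, 0) for token in top}
        for name, counts in per_document.items()
    }


def count_directory(directory, n=15, attribute="pos"):
    """Build the filtered count table for every .xml file in a directory."""
    per_document = {
        path.name: count_tokens(ET.parse(path).getroot(), attribute)
        for path in sorted(Path(directory).glob("*.xml"))
    }
    return top_n_table(per_document, n)
```

Because the token key includes the annotation, two sources that tag the same word differently produce different keys, which is why documents with different annotation schemes will not line up in the resulting table.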
Screenshots
A screenshot of a simple Orange workflow with the TEI Token Extractor feeding a data table.
The TEI Token Extractor widget set with the "inputs" directory selected, the number of top tokens set to 15, and the normalize checkbox cleared.
Data table with the counts of tokens for various Shakespeare works and An abridgement of the English military discipline.
Data table with the frequency of tokens for various Shakespeare works and An abridgement of the English military discipline.
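The difference between the count table and the frequency table can be illustrated with a small sketch. It assumes that a frequency is a token's count divided by the sum of the counts passed in for that document; whether the widget divides by all tokens in the document or only by the retained top tokens is not stated here, and the helper name `to_frequencies` is hypothetical.

```python
def to_frequencies(counts):
    """Convert raw token counts for one document into relative frequencies."""
    total = sum(counts.values())
    if total == 0:
        # An empty document yields zero frequencies rather than a division error.
        return {token: 0.0 for token in counts}
    return {token: count / total for token, count in counts.items()}
```

Relative frequencies make documents of very different lengths comparable, which raw counts do not.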
Download files
File details
Details for the file orange3_teixml-0.0.1.tar.gz.
File metadata
- Download URL: orange3_teixml-0.0.1.tar.gz
- Upload date:
- Size: 11.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 6d06dfdfc47b470796d666e05dbeccc15656b43260ac5df6cba5245274b7623b |
| MD5 | bd317ef9d2f1f2b716f6dc13525aa4de |
| BLAKE2b-256 | 65b9c06c7b4adc4576dfcd0fdd2df67856b3f1684d87f5ae4fedfa390af2c8e0 |
File details
Details for the file orange3_teixml-0.0.1-py3-none-any.whl.
File metadata
- Download URL: orange3_teixml-0.0.1-py3-none-any.whl
- Upload date:
- Size: 12.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | ee57c7198f8e30b082d605c8c795ad71c3162790511d7efe1a4f199ff96ce44d |
| MD5 | a231c37b6e0775feacdf2dd6e1e5faec |
| BLAKE2b-256 | 020584ef03ca9a021a1769ae8bcf851f6058060f33104d893802e11f3fbe3ce1 |