A collection of Orange3 widgets to process TEI-XML files

Project description

orange3-teixml

This add-on provides a collection of Orange3 widgets for processing TEI-XML documents.

Installation

Within the Add-ons installer, click "Add more..." and enter orange3-teixml.

Widgets

  • TEI Token Extractor: Parses the TEI-XML files in a directory, extracts the words together with their annotations (such as part of speech), and counts the occurrences. It then takes the N most frequent words across all the documents and filters the results down to those words.

⚠️ If you load TEI-XML documents from different sources, it is very likely that the annotation schemes are different and the tokens won't match. In my examples below, "An abridgement of the English military discipline," was loaded from a different source.
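
The extraction described above can be sketched in a few lines of Python. This is a minimal illustration, not the widget's actual code: it assumes words are marked up as TEI `<w>` elements in the standard TEI namespace and that the annotation lives in a `pos` attribute. The function names and the two tiny sample documents are made up for the example.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Assumption: documents use the standard TEI namespace.
TEI_NS = "{http://www.tei-c.org/ns/1.0}"


def extract_tokens(xml_text, annotation="pos"):
    """Count (word, annotation) pairs over the <w> elements of one document."""
    root = ET.fromstring(xml_text)
    counts = Counter()
    for w in root.iter(TEI_NS + "w"):
        word = "".join(w.itertext()).strip().lower()
        if word:
            # A missing annotation is recorded as an empty string.
            counts[(word, w.get(annotation, ""))] += 1
    return counts


def top_n_table(docs, n):
    """Keep only the n tokens that are most frequent across all documents."""
    total = Counter()
    for counts in docs.values():
        total.update(counts)
    top = [token for token, _ in total.most_common(n)]
    return {
        name: {token: counts.get(token, 0) for token in top}
        for name, counts in docs.items()
    }


# Two made-up documents, only to exercise the functions.
DOC_A = (
    '<TEI xmlns="http://www.tei-c.org/ns/1.0"><text><body><p>'
    '<w pos="d">the</w> <w pos="n1">king</w> '
    '<w pos="d">the</w> <w pos="n1">crown</w>'
    "</p></body></text></TEI>"
)
DOC_B = (
    '<TEI xmlns="http://www.tei-c.org/ns/1.0"><text><body><p>'
    '<w pos="d">the</w> <w pos="n1">king</w> <w pos="n1">king</w>'
    "</p></body></text></TEI>"
)

docs = {"doc_a.xml": extract_tokens(DOC_A), "doc_b.xml": extract_tokens(DOC_B)}
table = top_n_table(docs, 2)  # ("crown", "n1") falls outside the top 2
```

Because each token is keyed by both the word and its annotation, two sources that tag the same word differently (say `n1` versus `NN`) produce separate columns, which is the mismatch the warning above describes.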

Screenshots

A screenshot of a simple Orange workflow with the TEI Token Extractor feeding a data table.

The TEI Token Extractor widget set with the "inputs" directory selected, the number of top tokens set to 15, and the normalize checkbox cleared.

Data table with the counts of tokens for various Shakespeare works and An abridgement of the English military discipline.

Data table with the frequency of tokens for various Shakespeare works and An abridgement of the English military discipline.

Download files

Download the file for your platform.

Source Distribution

orange3_teixml-0.0.1.tar.gz (11.5 kB)

Uploaded Source

Built Distribution

orange3_teixml-0.0.1-py3-none-any.whl (12.6 kB)

Uploaded Python 3

File details

Details for the file orange3_teixml-0.0.1.tar.gz.

File metadata

  • Download URL: orange3_teixml-0.0.1.tar.gz
  • Upload date:
  • Size: 11.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.12.0

File hashes

Hashes for orange3_teixml-0.0.1.tar.gz
Algorithm Hash digest
SHA256 6d06dfdfc47b470796d666e05dbeccc15656b43260ac5df6cba5245274b7623b
MD5 bd317ef9d2f1f2b716f6dc13525aa4de
BLAKE2b-256 65b9c06c7b4adc4576dfcd0fdd2df67856b3f1684d87f5ae4fedfa390af2c8e0

File details

Details for the file orange3_teixml-0.0.1-py3-none-any.whl.

File metadata

  • Download URL: orange3_teixml-0.0.1-py3-none-any.whl
  • Upload date:
  • Size: 12.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.12.0

File hashes

Hashes for orange3_teixml-0.0.1-py3-none-any.whl
Algorithm Hash digest
SHA256 ee57c7198f8e30b082d605c8c795ad71c3162790511d7efe1a4f199ff96ce44d
MD5 a231c37b6e0775feacdf2dd6e1e5faec
BLAKE2b-256 020584ef03ca9a021a1769ae8bcf851f6058060f33104d893802e11f3fbe3ce1
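
The published digests can be used to check a downloaded file before installing it. The sketch below uses only Python's standard `hashlib`; the SHA256 values are copied from the tables above, while the function names and the idea of looking a file up by its base name are assumptions made for the example.

```python
import hashlib
import os

# SHA256 digests copied from the "File hashes" tables on this page.
PUBLISHED_SHA256 = {
    "orange3_teixml-0.0.1.tar.gz":
        "6d06dfdfc47b470796d666e05dbeccc15656b43260ac5df6cba5245274b7623b",
    "orange3_teixml-0.0.1-py3-none-any.whl":
        "ee57c7198f8e30b082d605c8c795ad71c3162790511d7efe1a4f199ff96ce44d",
}


def sha256_of_stream(stream, chunk_size=65536):
    """Hash a binary stream in chunks so large files need not fit in memory."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def verify_download(path):
    """Return True if the file at `path` matches its published SHA256 digest.

    `path` is wherever you saved the download; its base name must be one of
    the two file names listed above, otherwise a KeyError is raised.
    """
    expected = PUBLISHED_SHA256[os.path.basename(path)]
    with open(path, "rb") as fh:
        return sha256_of_stream(fh) == expected
```

For example, `verify_download("orange3_teixml-0.0.1.tar.gz")` should return `True` for an intact copy of the source distribution in the current directory.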
