This package was made to edit XML files (particularly TEI-XML files). Its main purpose is to facilitate the tagging and editing of repeated citations and the inclusion of VIAF records.

Project description

XMLCit package

Purpose

This package was developed to serve the needs of the digital humanities community, particularly those working with repeated citations in TEI-XML texts. The methods and functions in this package were therefore developed with ease of use in mind and to be specifically applicable to TEI-XML standards. TEI-XML specifications can be found on the TEI (Text Encoding Initiative) website.

The functions and methods included here modify XML documents on a large scale to spare the researcher tedious work. The main goal of this package is to provide a way to do the following:

  1. Tag a specific word within every instance of a specific node of the TEI-XML document. In our case these were specific citations within the note node.

  2. Add attributes to every instance of a specific node based on the presence of an ID number attribute or a specific collection of words inside said node. These attributes were:

    a. A VIAF number, web-scraped from the VIAF website

    b. An ID number

    c. A completed citation

Usage guidelines:

To import the library, use the following lines in your code. To use the methods of the Insert class, you must first create an instance of the class.

```
# Import the module that holds the Insert class and the standalone functions.
from XMLCit import functions

# The standalone functions are then available as attributes of the module:
functions.AddVIAF
functions.Addtag
functions.RepeatCitation

# To use the class methods, create an instance first.
instance = functions.Insert()
# Then call each method on the instance; the self parameter is supplied automatically.
instance.ID()
instance.CitbyID()
instance.CitbyText()
```

Structure

Because the package is small, a single module holds one class (with 3 methods) and the 3 additional functions.

Class description : Insert

This class focuses on inserting attributes into tags in XML documents.

Method descriptions

  1. ID
  • Inserts an 'ID' attribute inside the selected tag based on the text inside the XML tag.
  2. CitbyID
  • Inserts a text (in this case meant to be a full citation) in an attribute by checking for the specified ID number.
  3. CitbyText
  • Inserts a text (in this case meant to be a full citation) in an attribute by searching for matching text inside the selected tag (node).
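
To make the idea behind these methods concrete, here is a minimal sketch written with the standard library's `xml.etree.ElementTree`. It is not the package's implementation: the helper names `add_id_by_text` and `add_citation_by_id`, their parameters, and the sample notes are all hypothetical, and only the `ID` and `FullCitation` attribute names come from this description.

```python
import xml.etree.ElementTree as ET

# Invented sample data, for illustration only.
SAMPLE = """<text>
  <note><cit>Smith, History of Rome, p. 4</cit></note>
  <note><cit>Jones, Letters, p. 12</cit></note>
  <note><cit>Smith, History of Rome, p. 90</cit></note>
</text>"""


def add_id_by_text(root, tag, needle, id_value, attr="ID"):
    """Set `attr` on every `tag` element whose text contains `needle`."""
    count = 0
    for node in root.iter(tag):
        if needle in "".join(node.itertext()):
            node.set(attr, id_value)
            count += 1
    return count


def add_citation_by_id(root, tag, id_value, citation,
                       id_attr="ID", cit_attr="FullCitation"):
    """Set `cit_attr` on every `tag` element whose `id_attr` equals `id_value`."""
    count = 0
    for node in root.iter(tag):
        if node.get(id_attr) == id_value:
            node.set(cit_attr, citation)
            count += 1
    return count


root = ET.fromstring(SAMPLE)
# Step 1 (the ID idea): mark every citation of the same work with one ID.
tagged = add_id_by_text(root, "cit", "History of Rome", "1")
# Step 2 (the CitbyID idea): attach the full citation wherever that ID appears.
completed = add_citation_by_id(root, "cit", "1", "Smith, History of Rome (full citation)")
print(tagged, completed)
```

A text-based variant in the spirit of CitbyText would use the same substring test as `add_id_by_text` but set the citation attribute directly.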

Additional function descriptions

  1. AddVIAF

    This function adds an attribute with the VIAF (Virtual International Authority File) number of the cited text. To do this, it searches for the first 4 words of the citation in the VIAF database using the Selenium WebDriver.

    Additional requirements: this function needs a filepath argument that points to a chromedriver executable so it can run the Selenium WebDriver. This version of the function only works with the Chrome web browser. Future updates are planned to add support for the Firefox WebDriver.

  2. Addtag

    This function adds a tag around a specified word inside an existing tag (node) in the XML. The initial purpose of the function (and its default behavior) is to wrap a specific word, located inside a specific tag, in a cit tag so that other functions in this package can later add attributes to it.

  3. RepeatCitation

    This function identifies the nodes (in this case cit tags) that contain a repeated, abbreviated citation such as Ibid. or op. cit. It then copies the FullCitation attribute from the closest node that has both the same ID number and a completed FullCitation attribute. It can also be used to transfer VIAF attributes by changing the FullCitation default.
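
The three functions above can be sketched with the standard library alone. This is a rough illustration under stated assumptions, not the package's code: the names `wrap_word`, `viaf_query` and `fill_repeated`, their parameters, and the sample XML are hypothetical; the web lookup itself is left out, so only the "first 4 words" step of AddVIAF is shown; and "closest" is assumed to mean nearest in document order.

```python
import xml.etree.ElementTree as ET

# Assumed markers of a repeated, abbreviated citation (lower-cased).
REPEAT_MARKERS = ("ibid", "op. cit")


def wrap_word(root, parent_tag, word, new_tag="cit"):
    """Addtag idea: wrap the first `word` found in the leading text of
    each `parent_tag` element in a new `new_tag` child element."""
    for node in list(root.iter(parent_tag)):
        before, found, after = (node.text or "").partition(word)
        if found:
            child = ET.Element(new_tag)
            child.text, child.tail = word, after
            node.text = before
            node.insert(0, child)


def viaf_query(citation, n_words=4):
    """AddVIAF idea: the search string is the citation's first words."""
    return " ".join(citation.split()[:n_words])


def fill_repeated(root, tag="cit", id_attr="ID", attr="FullCitation"):
    """RepeatCitation idea: copy `attr` to each abbreviated citation from
    the nearest element (in document order) with the same ID and `attr` set."""
    nodes = list(root.iter(tag))
    filled = 0
    for i, node in enumerate(nodes):
        text = "".join(node.itertext()).lower()
        if node.get(attr) or node.get(id_attr) is None:
            continue
        if not any(marker in text for marker in REPEAT_MARKERS):
            continue
        donors = [j for j, other in enumerate(nodes)
                  if other.get(attr) and other.get(id_attr) == node.get(id_attr)]
        if donors:
            nearest = min(donors, key=lambda j: abs(j - i))
            node.set(attr, nodes[nearest].get(attr))
            filled += 1
    return filled


# Invented sample data, for illustration only.
notes = ET.fromstring("<body><note>See Ibid. for details.</note></body>")
wrap_word(notes, "note", "Ibid.")
tagged_xml = ET.tostring(notes, encoding="unicode")

query = viaf_query("Smith, History of Rome, p. 4")

doc = ET.fromstring(
    '<body>'
    '<cit ID="1" FullCitation="Smith, History of Rome">Smith, p. 4</cit>'
    '<cit ID="2" FullCitation="Jones, Letters">Jones, p. 12</cit>'
    '<cit ID="1">Ibid., p. 9</cit>'
    '<cit ID="2">op. cit., p. 3</cit>'
    '</body>'
)
filled = fill_repeated(doc)
```

Passing a different `attr` value (for example a VIAF attribute name) to `fill_repeated` mirrors the option of changing the FullCitation default described above.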
