This package edits XML files, particularly TEI-XML files. Its main purpose is to facilitate the tagging and editing of repeated citations and the inclusion of VIAF records.

Project description

XMLCit package

Purpose

This package was developed to serve the needs of the digital humanities community, particularly those working with repeated citations in TEI-XML texts. Its methods and functions were therefore developed with ease of use in mind and to be specifically applicable to TEI-XML standards. The TEI-XML specifications are published by the TEI (Text Encoding Initiative).

The functions and methods included here modify XML documents on a large scale to spare the researcher tedious manual work. The main goal of this package was to provide a way to do the following:

  1. Tag a specific word within every instance of a specific node of the TEI-XML document. In our case, these were specific citations within the note node.

  2. Add attributes to every instance of a specific node based on the presence of an ID number attribute or a specific collection of words inside said node. These attributes were:

    a. A VIAF number, web-scraped from the VIAF website.

    b. An ID number.

    c. A completed citation.
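The first goal above can be illustrated with a minimal sketch. It uses only Python's standard `xml.etree.ElementTree` and is not the package's own implementation: the helper name `tag_word`, its parameters, and the sample XML are assumptions made for illustration. It also only looks at the direct text of each node and ignores the TEI namespace that real TEI files declare.

```python
import xml.etree.ElementTree as ET


def tag_word(root, node="note", word="Ibid.", new_tag="cit"):
    """Wrap the first occurrence of `word` in the direct text of every
    `node` element in a new `new_tag` element. Returns the number of
    nodes that were changed. (Hypothetical helper, not part of XMLCit.)"""
    changed = 0
    # Materialise the iterator first so the tree is not modified while iterating.
    for elem in list(root.iter(node)):
        text = elem.text or ""
        before, found, after = text.partition(word)
        if not found:
            continue
        child = ET.Element(new_tag)
        child.text = word      # the tagged word itself
        child.tail = after     # text that followed the word
        elem.text = before     # text that preceded the word
        elem.insert(0, child)
        changed += 1
    return changed


# Made-up sample document (no namespace, for brevity).
sample = "<text><note>See Ibid. p. 4</note><note>No match here</note></text>"
root = ET.fromstring(sample)
tagged = tag_word(root)
result = ET.tostring(root, encoding="unicode")
print(result)
```

In a real TEI file the element names are namespaced, so the `node` argument would need to include the namespace, for example in the `{namespace-uri}note` form that ElementTree expects.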

Usage guidelines:

To import the library, use the following lines in your code. To use the methods of the Insert class, you must first create an instance of that class.

```
from XMLCit import functions

# After this import, the functions are available as:
functions.AddVIAF
functions.Addtag
functions.RepeatCitation

# To use the class methods, first create an instance:
instance = functions.Insert()

# Then call each method on the instance, without passing the self parameter:
instance.ID()
instance.CitbyID()
instance.CitbyText()
```

Structure

Because the package is small, a single module holds one class (with 3 methods) and the 3 additional functions.

Class description: Insert

This class focuses on the insertion of attributes into tags in XML documents.

Method descriptions

  1. ID
  • Inserts an 'ID' attribute into the selected tag based on the text inside that tag.
  2. CitbyID
  • Inserts a text (in this case meant to be a full citation) into an attribute by checking for the specified ID number.
  3. CitbyText
  • Inserts a text (in this case meant to be a full citation) into an attribute by searching for matching text inside the selected tag (node).
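The idea behind these methods can be sketched with the standard library alone. The following is an illustration under stated assumptions, not the code of the Insert class: the function names, their parameters, and the sample citations are hypothetical, and the attribute names `ID` and `FullCitation` are taken from the descriptions in this document.

```python
import xml.etree.ElementTree as ET


def add_id_by_text(root, tag, keyword, id_value, id_attr="ID"):
    """Set `id_attr` on every `tag` element whose text contains `keyword`.
    (Hypothetical helper illustrating the ID method.)"""
    changed = 0
    for elem in root.iter(tag):
        if keyword in "".join(elem.itertext()):
            elem.set(id_attr, id_value)
            changed += 1
    return changed


def add_citation_by_id(root, tag, id_value, citation,
                       id_attr="ID", cit_attr="FullCitation"):
    """Set `cit_attr` on every `tag` element whose `id_attr` equals `id_value`.
    (Hypothetical helper illustrating the CitbyID method.)"""
    changed = 0
    for elem in root.iter(tag):
        if elem.get(id_attr) == id_value:
            elem.set(cit_attr, citation)
            changed += 1
    return changed


# Made-up sample data.
sample = (
    "<text>"
    "<cit>Smith, History</cit>"
    "<cit>Jones, Letters</cit>"
    "<cit>Smith, History, p. 9</cit>"
    "</text>"
)
root = ET.fromstring(sample)
n_ids = add_id_by_text(root, "cit", "Smith", "1")
n_cits = add_citation_by_id(root, "cit", "1", "Smith, J. A History. 1900.")
print(ET.tostring(root, encoding="unicode"))
```

Matching by text, as CitbyText is described as doing, would follow the same pattern as `add_id_by_text`, setting the citation attribute instead of the ID.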

Additional function descriptions

  1. AddVIAF

    This function adds an attribute with the VIAF (Virtual International Authority File) number of the cited text. To do this, it searches for the first 4 words of the citation in the VIAF database using the selenium webdriver.

    Additional requirements: This function needs a filepath argument that points to a chromedriver executable, which is used to run the selenium webdriver. This version of the function only works with the Chrome web browser. Future updates are planned to add support for the Firefox webdriver.

  2. Addtag

    This function adds a tag around a specified word inside an existing tag (node) in the XML. The initial purpose of the function (and the one reflected in its defaults) is to wrap a specific word inside a specific tag in a cit tag, so that other functions in this package can later add attributes to it.

  3. RepeatCitation

    This function identifies those nodes (in this case cit tags) that contain a repeated citation that is not written out, such as Ibid. or op. cit. It then copies the closest FullCitation attribute from a node that has the same ID number AND a completed FullCitation attribute. It can also be used to transfer VIAF attributes by changing the FullCitation default.
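The repeated-citation logic described for RepeatCitation can be sketched as follows. This is a simplified illustration rather than the package's code: the function name `fill_repeated` and the sample data are hypothetical, "closest" is assumed to mean the most recent preceding node in document order, and the markers are matched as case-insensitive substrings.

```python
import xml.etree.ElementTree as ET

# Abbreviations treated as repeated citations (assumed list, lower-cased).
MARKERS = ("ibid", "op. cit")


def fill_repeated(root, tag="cit", id_attr="ID", cit_attr="FullCitation"):
    """Copy `cit_attr` onto Ibid./op. cit. nodes from the most recent
    preceding node with the same ID and a completed `cit_attr`.
    Returns the number of nodes filled. (Hypothetical helper.)"""
    last_full = {}  # ID -> most recent completed citation seen so far
    filled = 0
    for elem in root.iter(tag):
        ident = elem.get(id_attr)
        if ident is None:
            continue
        full = elem.get(cit_attr)
        if full:
            last_full[ident] = full
            continue
        text = "".join(elem.itertext()).lower()
        if any(marker in text for marker in MARKERS) and ident in last_full:
            elem.set(cit_attr, last_full[ident])
            filled += 1
    return filled


# Made-up sample data.
sample = """
<text>
  <cit ID="1" FullCitation="Smith, A History.">Smith, History</cit>
  <cit ID="2" FullCitation="Jones, Letters.">Jones</cit>
  <cit ID="1">Ibid., p. 12</cit>
  <cit ID="2">op. cit., p. 3</cit>
  <cit ID="3">Ibid.</cit>
</text>
"""
root = ET.fromstring(sample)
filled = fill_repeated(root)
cits = list(root.iter("cit"))
print(filled, cits[2].get("FullCitation"))
```

The last node in the sample stays unchanged because no earlier node with ID 3 carries a completed citation. Passing a different `cit_attr` would transfer another attribute in the same way.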

Download files

Source Distribution

XMLCit-0.0.9.tar.gz (7.8 kB)

Uploaded Source

Built Distribution

XMLCit-0.0.9-py3-none-any.whl (7.2 kB)

Uploaded Python 3
