
Encoding tools for DDHI

Project description

A collection of command-line utilities to assist in the creation of TEI-encoded oral history interviews. Part of the Dartmouth Digital History Initiative.

DDHI Encoder

The ddhi-encoder package is being developed to help encoders on the DDHI project encode oral history interview transcripts in TEI. At present, it contains two command-line utilities:

  1. ddhi_convert: convert a Dartmouth DVP transcript from docx to tei.xml.

  2. ddhi_tag: perform named-entity tagging on a DDHI TEI transcription.

Installation

You can use pip to install this package:

pip install ddhi-encoder

To perform named-entity tagging with ddhi_tag, you will need a spaCy model. Before running ddhi_tag, install spaCy's small English model:

python -m spacy download en_core_web_sm
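
Because spaCy models are installed as ordinary Python packages, one way to confirm that the download succeeded is to check whether the model package can be found in the current environment. The following is a minimal sketch using only the standard library; the helper name model_is_installed is hypothetical and is not part of ddhi-encoder or spaCy.

```python
import importlib.util


def model_is_installed(model_name: str = "en_core_web_sm") -> bool:
    """Return True if a package with this name can be found in the current environment.

    Hypothetical helper for illustration; it only checks that the package is
    discoverable, not that the model loads correctly.
    """
    return importlib.util.find_spec(model_name) is not None


if __name__ == "__main__":
    if model_is_installed():
        print("en_core_web_sm is available")
    else:
        print("Model missing; run: python -m spacy download en_core_web_sm")
```

Run this with the same Python interpreter (or virtual environment) that ddhi_tag will use, since the model must be installed there.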

See the spaCy documentation for more information.

Use

Use ddhi_convert to transform a DOCX-encoded transcription into a simply structured TEI document:

ddhi_convert ~/Desktop/transcripts/zien_jimmy_transcript_final.docx -o tmp.tei.xml

Use ddhi_tag to add named-entity tags to a TEI-encoded transcription:

ddhi_tag -o zien.tei.xml tmp.tei.xml
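
The two steps above can also be chained from a script. The sketch below assumes only the command-line forms shown above (ddhi_convert INPUT -o OUTPUT and ddhi_tag -o OUTPUT INPUT); the function names build_commands and encode are hypothetical, and no other options of either tool are assumed.

```python
import subprocess
from pathlib import Path


def build_commands(docx_path, out_path, tmp_path="tmp.tei.xml"):
    """Build the two command lines, mirroring the invocations shown above.

    Paths are expanded with expanduser() because "~" is not expanded
    when a command is run without a shell.
    """
    docx = str(Path(docx_path).expanduser())
    tmp = str(Path(tmp_path).expanduser())
    out = str(Path(out_path).expanduser())
    convert = ["ddhi_convert", docx, "-o", tmp]  # DOCX -> simple TEI
    tag = ["ddhi_tag", "-o", out, tmp]           # add named-entity tags
    return [convert, tag]


def encode(docx_path, out_path, tmp_path="tmp.tei.xml"):
    """Run conversion, then tagging; stop at the first failing step."""
    for cmd in build_commands(docx_path, out_path, tmp_path):
        subprocess.run(cmd, check=True)


# Example call (requires ddhi-encoder and the spaCy model to be installed):
# encode("~/Desktop/transcripts/zien_jimmy_transcript_final.docx", "zien.tei.xml")
```

Using check=True means a failed conversion raises an exception before tagging is attempted, so a stale intermediate file is not tagged by mistake.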

Download files


Source Distribution

ddhi-encoder-1.0.9.tar.gz (108.0 kB)

Uploaded Source

Built Distribution

ddhi_encoder-1.0.9-py2.py3-none-any.whl (16.4 kB)

Uploaded Python 2 Python 3

File details

Details for the file ddhi-encoder-1.0.9.tar.gz.

File metadata

  • Download URL: ddhi-encoder-1.0.9.tar.gz
  • Upload date:
  • Size: 108.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/41.2.0 requests-toolbelt/0.9.1 tqdm/4.47.0 CPython/3.8.1

File hashes

Hashes for ddhi-encoder-1.0.9.tar.gz
Algorithm Hash digest
SHA256 95078a26263c52f6c59d30d6485d4ac0ba3a1d687e8410cd6dc391d3efef2384
MD5 ea3c1fd831147575413f74f60b656ec3
BLAKE2b-256 8dca2e2fc606287b719fbd625785b62aeee2e74c21fac4f7b237fe3b3045d30f
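
A downloaded file can be checked against the SHA256 digest listed above. The following is a minimal sketch using Python's hashlib; the function names sha256_of and matches are hypothetical, and the file is assumed to be in the current working directory.

```python
import hashlib

# SHA256 digest listed above for ddhi-encoder-1.0.9.tar.gz
EXPECTED_SHA256 = "95078a26263c52f6c59d30d6485d4ac0ba3a1d687e8410cd6dc391d3efef2384"


def sha256_of(path, chunk_size=65536):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected=EXPECTED_SHA256):
    """Return True if the file's digest equals the expected digest."""
    return sha256_of(path) == expected.lower()


# Example call (assumes the archive has been downloaded):
# print(matches("ddhi-encoder-1.0.9.tar.gz"))
```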


File details

Details for the file ddhi_encoder-1.0.9-py2.py3-none-any.whl.

File metadata

  • Download URL: ddhi_encoder-1.0.9-py2.py3-none-any.whl
  • Upload date:
  • Size: 16.4 kB
  • Tags: Python 2, Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/41.2.0 requests-toolbelt/0.9.1 tqdm/4.47.0 CPython/3.8.1

File hashes

Hashes for ddhi_encoder-1.0.9-py2.py3-none-any.whl
Algorithm Hash digest
SHA256 affda719f473c0a9e8d3683b0e37b6538d2e5e5c80490213377e768214a10454
MD5 572cb75f588ce61da3a66974e0d07a0d
BLAKE2b-256 f4f94f5337c2354959c15633ed824ef4fc2978c503802daf0a9033b092ef3981

