Lightweight implementation of the Typecraft XML format in Python.
Typecraft Python
This repository contains an interlinear glossed text (IGT) model based on the Typecraft IGT format. It also contains a simple CLI for performing various NLP tasks, interfacing with NLTK and other tools such as TreeTagger.
Free software: MIT license
Full documentation: https://typecraft_python.readthedocs.io
Installation
pip install typecraft_python
Features
- Parsing of the Typecraft XML format.
- Manipulation of the Typecraft IGT model format.
- Integration with NLTK.
- Integration with TreeTagger.
- A CLI for loading, converting, and manipulating raw text and Typecraft XML files.
Usage
Usage: tpy [OPTIONS] COMMAND [ARGS]...
Options:
--help Show this message and exit.
Commands:
convert
ntexts This command lists the number of texts in a...
raw
xml
Examples
Load a raw text file, tokenize and tag it, and write the resulting XML to stdout:
$ tpy raw your_file.txt
To save the output to a file:
$ tpy raw your_file.txt -o output.xml
# or
$ tpy raw your_file.txt > output.xml
To tag using a specific tagger:
$ tpy raw your_file.txt --tagger=tree # Tags using TreeTagger
To load a Typecraft XML file and tag it:
$ tpy xml your_file.xml --tag --tagger=nltk -o tagged_output.xml
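The commands above can also be driven from a Python script. The following is a minimal sketch, not part of the package: `build_tpy_command` and `run_tpy` are hypothetical helper names, and the sketch assumes only the `raw` and `xml` subcommands and the `--tag`, `--tagger`, and `-o` options shown in the examples. It assembles the argument list and, optionally, runs it with `subprocess`, which requires `tpy` to be installed and on the PATH.

```python
import shlex
import subprocess


def build_tpy_command(subcommand, input_path, tagger=None, tag=False, output=None):
    """Assemble an argument list for the tpy CLI (hypothetical helper).

    Only the options that appear in the examples above are supported.
    """
    if subcommand not in ("raw", "xml"):
        raise ValueError("only the 'raw' and 'xml' subcommands are covered here")
    cmd = ["tpy", subcommand, input_path]
    if tag:
        cmd.append("--tag")
    if tagger is not None:
        cmd.append("--tagger=" + tagger)
    if output is not None:
        cmd.extend(["-o", output])
    return cmd


def run_tpy(cmd):
    """Run the assembled command and return whatever it wrote to stdout.

    Assumes tpy is installed; raises CalledProcessError on a non-zero exit.
    """
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return result.stdout


if __name__ == "__main__":
    cmd = build_tpy_command(
        "xml", "your_file.xml", tag=True, tagger="nltk", output="tagged_output.xml"
    )
    # Print the shell-equivalent command instead of running it,
    # so the sketch works even without tpy installed.
    print(shlex.join(cmd))
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces. To actually execute the command, call `run_tpy(cmd)` in place of the `print`.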
Credits
This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.
History
0.1.1 (2016-08-15)
Fixed some small bugs.
0.1.0 (2016-08-14)
First release. Added the main bulk of the initial code:
- The parser works in its most basic form and parses TC-XML documents into an object tree.