A lightweight implementation of the Typecraft XML format in Python.
Typecraft Python
This repository contains an interlinear glossed text (IGT) model based on the Typecraft IGT format. It also contains a simple CLI for performing various NLP tasks, interfacing with NLTK and other tools such as TreeTagger.
Free software: MIT license
Full Documentation: https://typecraft_python.readthedocs.io.
Installation
pip install typecraft_python
Features
- Parsing of the Typecraft XML format.
- Manipulation of the Typecraft IGT model format.
- Integration with NLTK.
- Integration with TreeTagger.
- A CLI for loading, converting, and manipulating raw text and Typecraft XML files.
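To make the idea of an IGT object tree more concrete, here is a minimal sketch of how phrases, words, and morphemes might nest. The class names (`Morpheme`, `Word`, `Phrase`) and the `gloss_line` helper are hypothetical and chosen for illustration only; they are not the classes exported by typecraft_python, so consult the full documentation for the real model.

```python
from dataclasses import dataclass, field
from typing import List


# Hypothetical classes for illustration only; these are NOT the
# classes exported by typecraft_python.
@dataclass
class Morpheme:
    form: str                                          # surface form, e.g. "dog"
    glosses: List[str] = field(default_factory=list)   # gloss labels, e.g. ["PL"]


@dataclass
class Word:
    form: str
    pos: str = ""                                      # part-of-speech tag from a tagger
    morphemes: List[Morpheme] = field(default_factory=list)


@dataclass
class Phrase:
    text: str
    translation: str = ""
    words: List[Word] = field(default_factory=list)


def gloss_line(phrase: Phrase) -> str:
    """Build one interlinear gloss line from the morphemes of each word."""
    parts = []
    for word in phrase.words:
        # Use the gloss labels when present, otherwise fall back to the form.
        labels = [".".join(m.glosses) or m.form for m in word.morphemes]
        parts.append("-".join(labels) if labels else word.form)
    return " ".join(parts)


phrase = Phrase(
    text="dogs bark",
    translation="dogs bark",
    words=[
        Word("dogs", pos="N", morphemes=[Morpheme("dog"), Morpheme("s", ["PL"])]),
        Word("bark", pos="V", morphemes=[Morpheme("bark")]),
    ],
)
print(gloss_line(phrase))  # dog-PL bark
```

A tagger integration would, under this sketch, simply fill in the `pos` field of each `Word`.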
Usage
Usage: tpy [OPTIONS] COMMAND [ARGS]...
Options:
--help Show this message and exit.
Commands:
convert
ntexts This command lists the number of texts in a...
raw
xml
Examples
Load a raw file, tokenize and tag it, and output XML (to stdout):
$ tpy raw your_file.txt
To save the output to a file:
$ tpy raw your_file.txt -o output.xml
# or
$ tpy raw your_file.txt > output.xml
To tag using a specific tagger:
$ tpy raw your_file.txt --tagger=tree # Tags using TreeTagger
To load a Typecraft XML file and tag it:
$ tpy xml your_file.xml --tag --tagger=nltk -o tagged_output.xml
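When many raw files need tagging, the commands above can be driven from a script. The following is a rough sketch that assumes `tpy` is installed and on `PATH`; the helper names `build_raw_command` and `run_raw` are hypothetical, and only the `raw` subcommand and the `--tagger` and `-o` options shown in the examples above are used.

```python
import shutil
import subprocess
from typing import List, Optional


def build_raw_command(path: str, output: Optional[str] = None,
                      tagger: Optional[str] = None) -> List[str]:
    """Build the argument list for `tpy raw` (hypothetical helper)."""
    cmd = ["tpy", "raw", path]
    if tagger is not None:
        cmd.append(f"--tagger={tagger}")   # e.g. "tree" or "nltk"
    if output is not None:
        cmd.extend(["-o", output])         # write XML to a file instead of stdout
    return cmd


def run_raw(path: str, output: Optional[str] = None,
            tagger: Optional[str] = None) -> subprocess.CompletedProcess:
    """Run `tpy raw`, raising if the CLI is missing or exits with an error."""
    if shutil.which("tpy") is None:
        raise RuntimeError("tpy not found on PATH; install typecraft_python first")
    return subprocess.run(build_raw_command(path, output, tagger), check=True)


# Only builds and prints the command; call run_raw(...) to execute it.
print(build_raw_command("your_file.txt", output="output.xml", tagger="tree"))
```

Passing the arguments as a list rather than a shell string avoids quoting problems with file names that contain spaces.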
Credits
This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.
History
0.1.1 (2016-08-15)
Fixed some small bugs.
0.1.0 (2016-08-14)
- First release. Added the bulk of the initial code: the parser works in its most basic form and parses TC-XML documents into an object tree.
Hashes for typecraft_python-0.10.0-py2.py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | ab21a65e47cc1d5090dc31873a4e550df20ad2f1c9513347b5f246e8eea17b07
MD5 | abba3c8187caa712b7df86918c3029a5
BLAKE2b-256 | f1320d5c0f27ef8fafecd0172a88bd90d7e03e9c53de208ac96d585dd2c133ec