A Python library for managing and annotating text corpora using the TextTagLib (TTL) format
Installation
texttaglib is available on PyPI.
pip install texttaglib
# or, more explicitly
python3 -m pip install texttaglib
Basic usage
>>> from texttaglib import ttl
>>> doc = ttl.Document('mydoc')
>>> sent = doc.new_sent("I am a sentence.")
>>> sent
#1: I am a sentence.
>>> sent.ID
1
>>> sent.text
'I am a sentence.'
>>> sent.import_tokens(["I", "am", "a", "sentence", "."])
>>> sent.tokens
[`I`<0:1>, `am`<2:4>, `a`<5:6>, `sentence`<7:15>, `.`<15:16>]
>>> doc.write_ttl()
The script above generates the following corpus files:
-rw-rw-r--. 1 tuananh tuananh  0 Mar 29 13:10 mydoc_concepts.txt
-rw-rw-r--. 1 tuananh tuananh  0 Mar 29 13:10 mydoc_links.txt
-rw-rw-r--. 1 tuananh tuananh 20 Mar 29 13:10 mydoc_sents.txt
-rw-rw-r--. 1 tuananh tuananh  0 Mar 29 13:10 mydoc_tags.txt
-rw-rw-r--. 1 tuananh tuananh 58 Mar 29 13:10 mydoc_tokens.txt
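The `<start:end>` pairs shown next to each token in the basic usage session are character offsets into the sentence text. To make that concrete, here is a minimal sketch in plain Python of how such offsets can be derived from a sentence and a token list. The helper `align_tokens` is hypothetical and is not part of texttaglib. The left-to-right search is an assumption about how alignment could work, not a description of what `import_tokens` does internally.

```python
def align_tokens(text, tokens):
    """Locate each token in text, scanning left to right.

    Hypothetical helper for illustration only; not texttaglib's API.
    Returns a list of (token, start, end) triples with end exclusive.
    """
    spans = []
    cursor = 0  # position in text where the search for the next token begins
    for tok in tokens:
        start = text.find(tok, cursor)
        if start == -1:
            raise ValueError(f"token {tok!r} not found at or after position {cursor}")
        end = start + len(tok)
        spans.append((tok, start, end))
        cursor = end  # never match a later token before the end of this one
    return spans


spans = align_tokens("I am a sentence.", ["I", "am", "a", "sentence", "."])
for tok, start, end in spans:
    # Mirror the `token`<start:end> notation used in the session above
    print(f"`{tok}`<{start}:{end}>")
```

Advancing the cursor past each match keeps repeated tokens (for example, two occurrences of "a") from being aligned to the same position.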
Download files
Source Distribution
texttaglib-0.1a1.tar.gz (3.5 kB)
File details
Details for the file texttaglib-0.1a1.tar.gz.
File metadata
- Download URL: texttaglib-0.1a1.tar.gz
- Upload date:
- Size: 3.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
File hashes
Algorithm | Hash digest
---|---
SHA256 | ad30b98e891aad1ff3a2d2569e7f8b74457c8f8258f012304a22a12ba2e50e05
MD5 | fa833474e6536d1795496e990b10e4ce
BLAKE2b-256 | 258c6f71012b4fdce2dbdfa49c0c29addb094d954d82b0c6b10b0be3be0864d9