Various functions to manipulate CONLL files
Project description
CONLL Transform - functions to manipulate CONLL data
This package contains several functions to manipulate conll data:
- read_files: Read one or several conll files and return a dictionary of documents.
- read_file: Read a conll file and return a dictionary of documents.
- write_file: Write a conll file.
- compute_mentions: Compute mentions from the raw last column of the conll file.
- compute_chains: Compute and return the chains from the conll data.
- sentpos2textpos: Transform mentions [SENT, START, STOP] to [TEXT_START, TEXT_STOP].
- textpos2sentpos: Transform mentions [TEXT_START, TEXT_STOP] to [SENT, START, STOP].
- write_chains: Convert a list of chains to a conll coreference column.
- replace_coref_col: Replace the last column of tar_docs by the last column of src_docs.
- remove_singletons: Remove the singletons of the conll file infpath, and write the version without singletons to the conll file outfpath.
- filter_pos: Filter out mentions that have a POS in unwanted_pos, and return a new mention list.
- check_no_duplicate_mentions: Return True if there are no duplicate mentions.
- merge_boundaries: Add the mentions of boundary_docs to coref_docs as singletons, if they don't already exist.
- remove_col: Remove columns from all tokens in docs.
- write_mentions: Opposite of compute_mentions(). Write the last column in sent.
- compare_coref_cols: Build a conll file that merges the coref columns of several other files.
- to_corefcol: Write the conll file outfpath with just the last column (coref) of the conll file infpath.
- get_conll_2012_key_pattern: Return a compiled pattern object to match the conll2012 key format.
- merge_amalgams: Add amalgams back into documents from which they have been removed.
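Two of the ideas in this list, reading mentions out of the coreference column and converting sentence-relative positions to text-relative ones, can be illustrated with a small standalone sketch. It does not import or reproduce conll_transform: the function names mentions_from_coref_column and sent_to_text_positions are hypothetical, and the choices of 0-based token indices and inclusive STOP positions are assumptions made for this example only.

```python
import re
from collections import defaultdict


def mentions_from_coref_column(column):
    """Return sorted (start, stop, chain_id) triples for one sentence.

    `column` holds one coreference cell per token, e.g. "(0", "0)", "(1)",
    "(1)|(0" or "-". Indices are 0-based and `stop` is inclusive
    (an assumption of this sketch, not necessarily the package's convention).
    """
    open_starts = defaultdict(list)  # chain id -> stack of open start indices
    mentions = []
    for index, cell in enumerate(column):
        if cell in ("-", "_", ""):
            continue  # token is not a mention boundary
        for part in cell.split("|"):
            match = re.fullmatch(r"(\()?(\d+)(\))?", part)
            if match is None or not (match.group(1) or match.group(3)):
                raise ValueError(f"unexpected coreference cell: {cell!r}")
            chain_id = int(match.group(2))
            if match.group(1):  # "(" opens a mention at this token
                open_starts[chain_id].append(index)
            if match.group(3):  # ")" closes the most recently opened mention
                if not open_starts[chain_id]:
                    raise ValueError(f"unbalanced bracket for chain {chain_id}")
                mentions.append((open_starts[chain_id].pop(), index, chain_id))
    return sorted(mentions)


def sent_to_text_positions(mentions, sentence_lengths):
    """Map (sent, start, stop) triples to (text_start, text_stop) pairs.

    `sentence_lengths[i]` is the number of tokens in sentence i.
    """
    offsets = [0]
    for length in sentence_lengths:
        offsets.append(offsets[-1] + length)
    return [(offsets[sent] + start, offsets[sent] + stop)
            for sent, start, stop in mentions]


if __name__ == "__main__":
    print(mentions_from_coref_column(["(0", "-", "0)", "(1)|(0", "0)"]))
    # [(0, 2, 0), (3, 3, 1), (3, 4, 0)]
    print(sent_to_text_positions([(0, 0, 2), (1, 1, 1)], [3, 4]))
    # [(0, 2), (4, 4)]
```

Grouping the resulting triples by chain_id gives the chains, and chains with a single mention are the singletons that the list above refers to.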
To use the package, import the function you need from conll_transform, for example:
from conll_transform import read_files
documents = read_files("myfile.conll", "myfile2.conll")
print(documents)
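The function list describes the returned value only as a dictionary of documents. As a rough illustration of what such a structure could look like, the sketch below parses conll text without using the package. The function name parse_conll_text is hypothetical, and the layout (document key mapped to a list of sentences, each a list of token column lists) and the "#begin document" / "#end document" delimiters are assumptions of this example, not a documented description of what read_files returns.

```python
def parse_conll_text(text):
    """Parse conll-style text into {document key: [sentence, ...]}.

    Each sentence is a list of tokens, and each token is the list of its
    whitespace-separated columns. Assumed layout, for illustration only.
    """
    documents = {}
    key = None       # key of the document currently being read
    sentence = []    # tokens of the sentence currently being read
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("#begin document"):
            key = line[len("#begin document"):].strip()
            documents[key] = []
            sentence = []
        elif line.startswith("#end document"):
            if sentence:
                documents[key].append(sentence)
            key, sentence = None, []
        elif not line:
            # A blank line ends the current sentence.
            if sentence and key is not None:
                documents[key].append(sentence)
            sentence = []
        elif key is not None and not line.startswith("#"):
            sentence.append(line.split())
    return documents


if __name__ == "__main__":
    sample = (
        "#begin document (doc1); part 000\n"
        "The DT (0\n"
        "cat NN 0)\n"
        "\n"
        "It PRP (0)\n"
        "#end document\n"
    )
    for doc_key, sentences in parse_conll_text(sample).items():
        print(doc_key, len(sentences), "sentences")
```

In this layout the coreference information is the last element of each token's column list, which matches the "last column" wording used in the function descriptions.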
The source can be found at GitHub.
Download files
Source Distribution
conll_transform-0.1.0.tar.gz (8.4 kB)
Built Distribution
Hashes for conll_transform-0.1.0-py3-none-any.whl
| Algorithm | Hash digest |
|---|---|
| SHA256 | c06c97b3cb25673d40b66ed25c91ecb91adbb4aae4c229fecda0bfe22ce5d324 |
| MD5 | 1c9fea99b60512d8a251e23b2a6d7724 |
| BLAKE2b-256 | 28670fcd538ffc5622029dc9c42060d9fba2656b2c74d6a10b376e54bee3bae0 |