Various functions to manipulate CONLL files
CONLL Transform - functions to manipulate CONLL data
This package contains several functions to manipulate conll data:
- `read_files`: Read one or several conll files and return a dictionary of documents.
- `read_file`: Read a conll file and return a dictionary of documents.
- `write_file`: Write a conll file.
- `compute_mentions`: Compute mentions from the raw last column of the conll file.
- `compute_chains`: Compute and return the chains from the conll data.
- `sentpos2textpos`: Transform mentions `[SENT, START, STOP]` to `[TEXT_START, TEXT_STOP]`.
- `textpos2sentpos`: Transform mentions `[TEXT_START, TEXT_STOP]` to `[SENT, START, STOP]`.
- `write_chains`: Convert a list of chains to a conll coreference column.
- `replace_coref_col`: Replace the last column of `tar_docs` by the last column of `src_docs`.
- `remove_singletons`: Remove the singletons of the conll file `infpath`, and write the version without singletons to the conll file `outfpath`.
- `filter_pos`: Filter mentions that have a POS in `unwanted_pos`; return a new mention list.
- `check_no_duplicate_mentions`: Return True if there are no duplicate mentions.
- `merge_boundaries`: Add the mentions of `boundary_docs` to `coref_docs` as singletons, if they don't already exist.
- `remove_col`: Remove columns from all tokens in `docs`.
- `write_mentions`: Opposite of `compute_mentions()`. Write the last column in `sent`.
- `compare_coref_cols`: Build a conll file that merges the coref columns of several other files.
- `to_corefcol`: Write the conll file `outfpath` with just the last column (coref) of the conll file `infpath`.
- `get_conll_2012_key_pattern`: Return a compiled pattern object to match the conll2012 key format.
- `merge_amalgams`: Add amalgams back into documents from which they have been removed.
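To make the mention-related entries above more concrete, here is a minimal, self-contained sketch of two of the ideas: reading mentions out of a coreference column, and converting sentence-relative positions to text-relative ones. It does not call the package and is not its implementation. The names `parse_coref_column` and `sent_to_text_positions`, the tuple layout, and the exclusive `stop` index are all assumptions made for illustration.

```python
def parse_coref_column(sentences):
    """Return sorted (sent, start, stop, chain_id) tuples from coref cells.

    `sentences` is a list of sentences, each a list of coref cells such as
    "-", "(1", "1)", "(1)" or "(3|(2". In this sketch `stop` is exclusive
    (an assumption, not necessarily the package's convention).
    """
    mentions = []
    for sent_index, cells in enumerate(sentences):
        open_starts = {}  # chain id -> stack of start positions still open
        for tok_index, cell in enumerate(cells):
            if cell in ("-", "_", ""):
                continue  # token is not at a mention boundary
            for part in cell.split("|"):
                chain_id = part.strip("()")
                if part.startswith("("):
                    open_starts.setdefault(chain_id, []).append(tok_index)
                if part.endswith(")"):
                    start = open_starts[chain_id].pop()
                    mentions.append((sent_index, start, tok_index + 1, chain_id))
    return sorted(mentions)


def sent_to_text_positions(mentions, sentence_lengths):
    """Convert (sent, start, stop, chain) to (text_start, text_stop, chain)."""
    offsets = []
    total = 0
    for length in sentence_lengths:
        offsets.append(total)  # index of the sentence's first token in the text
        total += length
    return [
        (offsets[sent] + start, offsets[sent] + stop, chain)
        for sent, start, stop, chain in mentions
    ]


# Hypothetical coref columns for a two-sentence document.
coref_cells = [
    ["(1", "1)", "-", "(2)"],
    ["(1)", "-", "(3|(2", "2)", "3)"],
]
mentions = parse_coref_column(coref_cells)
text_mentions = sent_to_text_positions(mentions, [len(s) for s in coref_cells])
```

Grouping the resulting tuples by `chain_id` gives the chains; a chain with a single mention is a singleton.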
To use it, just import the function from conll_transform, for example:

    from conll_transform import read_files
    documents = read_files("myfile.conll", "myfile2.conll")
    print(documents)

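For readers who want a feel for what a "dictionary of documents" could look like, the following sketch parses a small CoNLL-2012 style string by hand. It is a hypothetical stand-in, not the package's reader: the function name `read_conll_text`, the choice of key (the text after `#begin document`), and the nesting (document, then sentences, then tokens, then columns) are assumptions.

```python
import re

# Matches CoNLL-2012 style headers, e.g. "#begin document (name); part 000".
BEGIN = re.compile(r"^#begin document (.*)$")


def read_conll_text(text):
    """Return {document key: [sentence, ...]}; a sentence is a list of tokens,
    and a token is a list of column strings."""
    docs = {}
    sentences = None  # sentence list of the document currently being read
    sentence = []     # tokens of the sentence currently being read
    for line in text.splitlines():
        line = line.strip()
        match = BEGIN.match(line)
        if match:
            sentences = docs.setdefault(match.group(1), [])
            sentence = []
        elif line.startswith("#end document"):
            if sentence:
                sentences.append(sentence)
            sentences, sentence = None, []
        elif not line:
            # A blank line closes the current sentence.
            if sentence and sentences is not None:
                sentences.append(sentence)
            sentence = []
        elif sentences is not None and not line.startswith("#"):
            sentence.append(line.split())
    return docs


# Hypothetical sample data with five columns; the last one is the coref column.
sample = """#begin document (demo); part 000
demo 0 0 Alice (1)
demo 0 1 sleeps -

demo 0 0 She (1)
#end document
"""
docs = read_conll_text(sample)
key = "(demo); part 000"
last_column_of_first_token = docs[key][0][0][-1]
```

Keeping each token as a plain list of columns makes operations such as dropping a column or replacing the last (coref) column simple list manipulations.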
The source can be found at GitHub.