graph-based processing of multi-level annotated corpora
Project description
DiscourseGraphs
This library enables you to process linguistic corpora with multiple levels of annotations by:
converting the different annotation formats into separate graphs and
merging these graphs into a single multidigraph (based on the common tokenization of the annotation layers)
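The two steps above can be illustrated with a minimal, standard-library-only sketch. It does not use the DiscourseGraphs API; the `MultiDiGraph` class, the `merge` function, the node ids, and the example sentence are all hypothetical and exist only to show how layers that share token nodes can be unified into one directed multigraph.

```python
class MultiDiGraph:
    """Tiny stand-in for a directed multigraph (parallel edges allowed)."""

    def __init__(self):
        self.nodes = {}   # node id -> attribute dict
        self.edges = []   # list of (source, target, attribute dict)

    def add_node(self, node_id, **attrs):
        self.nodes.setdefault(node_id, {}).update(attrs)

    def add_edge(self, source, target, **attrs):
        self.add_node(source)
        self.add_node(target)
        self.edges.append((source, target, attrs))


def merge(*layers):
    """Union several layer graphs; nodes with the same id are unified."""
    merged = MultiDiGraph()
    for layer in layers:
        for node_id, attrs in layer.nodes.items():
            merged.add_node(node_id, **attrs)
        for source, target, attrs in layer.edges:
            merged.add_edge(source, target, **attrs)
    return merged


# The common tokenization: both layers refer to tokens by position.
tokens = ["Es", "regnet", "heute"]

# Hypothetical syntax layer: one sentence node dominating every token.
syntax = MultiDiGraph()
for i, tok in enumerate(tokens):
    syntax.add_node(("token", i), text=tok)
    syntax.add_edge("syntax:S", ("token", i), layer="syntax")

# Hypothetical expletive layer: marks token 0 as an expletive.
expletive = MultiDiGraph()
expletive.add_node(("token", 0), text="Es")
expletive.add_edge("expletive:1", ("token", 0), layer="expletive")

merged = merge(syntax, expletive)
```

Because both layers use the same id for each token, the merged graph contains each token once, with incoming edges from every layer that annotates it.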
So far, the following formats can be imported and merged:
TigerXML (a format for representing tree-like syntax graphs with secondary edges)
RS3 (a format used by RSTTool to annotate documents with Rhetorical Structure Theory)
an ad-hoc plain text format for annotating expletives (which you’re probably not interested in)
Installation
git clone https://github.com/arne-cl/discoursegraphs.git
cd discoursegraphs
python setup.py install # prepend 'sudo' if needed
Requirements
If you’d like to visualize your graphs, you will also need:
License
3-Clause BSD.
People who downloaded this also like
SaltNPepper (a converter framework for various linguistic data formats)
News
0.1
Release date: 24-Apr-2014
first public release
imports: RS3, TigerXML and an ad-hoc format for expletive annotation
merge these formats/files into a single multidigraph
generates simple dot/graphviz-based visualization
Download files
Source Distribution
File details
Details for the file discoursegraphs-0.1.0.tar.gz.
File metadata
- Download URL: discoursegraphs-0.1.0.tar.gz
- Upload date:
- Size: 17.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | ed5c17bf180f94dced310f4571b5b463e4130e54ca87575eee0d84949cee812f |
| MD5 | ba4e582bb3f667eadbe95c8782754ee1 |
| BLAKE2b-256 | 98bbc3d3c81606dd0521176d19933541d23f7647d34b98eb7c715ced2faefb60 |
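If you want to check a downloaded archive against the SHA256 digest listed above, a sketch along these lines should work. It uses only Python's standard `hashlib` module; the helper names `sha256_of` and `matches_published_hash` are made up for this example, and the file path is an assumption about where you saved the download.

```python
import hashlib

# SHA256 digest copied from the file hashes table.
EXPECTED_SHA256 = "ed5c17bf180f94dced310f4571b5b463e4130e54ca87575eee0d84949cee812f"


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected=EXPECTED_SHA256):
    """True if the file's SHA256 digest equals the expected hex string."""
    return sha256_of(path) == expected
```

Calling `matches_published_hash("discoursegraphs-0.1.0.tar.gz")` from the directory containing the archive (assumed location) returns `True` only if the file matches the published digest.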