graph-based processing of multi-level annotated corpora

DiscourseGraphs

This library lets you process linguistic corpora that carry multiple levels of annotation by:

  1. converting the different annotation formats into separate graphs and

  2. merging these graphs into a single multidigraph (based on the common tokenization of the annotation layers)
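
The two steps above can be sketched in plain Python. This is only an illustration of the idea, not the library's actual API: the `MultiDiGraph` class and the `build_layer` and `merge_layers` helpers are hypothetical names made up for this example, and the annotation layers are toy data. Each layer starts out with its own token nodes; merging checks that all layers share the same tokenization and then maps their token nodes onto one shared set.

```python
class MultiDiGraph:
    """Minimal directed multigraph (hypothetical helper, not the library's class)."""

    def __init__(self):
        self.nodes = {}    # node id -> attribute dict
        self.edges = []    # (source, target, attribute dict); parallel edges are kept
        self.tokens = []   # token node ids in text order

    def add_node(self, node_id, **attrs):
        self.nodes.setdefault(node_id, {}).update(attrs)

    def add_edge(self, source, target, **attrs):
        self.add_node(source)
        self.add_node(target)
        self.edges.append((source, target, attrs))


def build_layer(name, words, spans):
    """Step 1: turn one annotation layer into its own graph.

    `spans` maps an annotation label to the token indices it covers.
    """
    graph = MultiDiGraph()
    for i, word in enumerate(words):
        node_id = f"{name}:tok{i}"
        graph.add_node(node_id, word=word)
        graph.tokens.append(node_id)
    for label, indices in spans.items():
        for i in indices:
            graph.add_edge(f"{name}:{label}", graph.tokens[i], layer=name)
    return graph


def merge_layers(graphs):
    """Step 2: merge layer graphs by aligning them on their common tokenization."""
    words = [graphs[0].nodes[t]["word"] for t in graphs[0].tokens]
    merged = MultiDiGraph()
    for i, word in enumerate(words):
        merged.add_node(f"tok{i}", word=word)
        merged.tokens.append(f"tok{i}")
    for graph in graphs:
        if [graph.nodes[t]["word"] for t in graph.tokens] != words:
            raise ValueError("layers do not share a common tokenization")
        # Map this layer's private token ids onto the shared token ids.
        rename = {old: f"tok{i}" for i, old in enumerate(graph.tokens)}
        for node_id, attrs in graph.nodes.items():
            merged.add_node(rename.get(node_id, node_id), **attrs)
        for source, target, attrs in graph.edges:
            merged.add_edge(rename.get(source, source),
                            rename.get(target, target), **attrs)
    return merged


words = ["It", "rains", "."]
syntax = build_layer("syntax", words, {"S": [0, 1, 2], "NP": [0]})
rst = build_layer("rst", words, {"segment1": [0, 1, 2]})
merged = merge_layers([syntax, rst])

# Every layer that annotates the first token now points at the same node.
print(sorted(source for source, target, _ in merged.edges if target == "tok0"))
```

Storing edges as a list rather than a dict keyed by node pair is what allows several edges between the same two nodes, which is the reason for using a multidigraph when layers overlap.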

So far, the following formats can be imported and merged:

  • TigerXML (a format for representing tree-like syntax graphs with secondary edges)

  • RS3 (a format used by RSTTool to annotate documents with Rhetorical Structure Theory)

  • an ad-hoc plain text format for annotating expletives (which you’re probably not interested in)
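
To give a feel for what importing TigerXML involves, here is a minimal sketch that reads one sentence with the standard library and lists its edges. It does not use or reproduce this library's importer: the `sentence_edges` function is a hypothetical name, and the XML fragment is hand-written and simplified (the `secedge` in it is artificial and only there to show the element).

```python
import xml.etree.ElementTree as ET

# Hand-written, simplified TigerXML sentence (not taken from a real corpus).
TIGER_SENTENCE = """
<s id="s1">
  <graph root="s1_500">
    <terminals>
      <t id="s1_1" word="Es" pos="PPER"/>
      <t id="s1_2" word="regnet" pos="VVFIN"/>
    </terminals>
    <nonterminals>
      <nt id="s1_500" cat="S">
        <edge label="SB" idref="s1_1"/>
        <edge label="HD" idref="s1_2"/>
        <secedge label="SB" idref="s1_1"/>
      </nt>
    </nonterminals>
  </graph>
</s>
"""


def sentence_edges(xml_text):
    """Return (words, edges) for a single <s> element.

    Each edge is a tuple (id of the enclosing node, idref, label, element tag),
    so primary edges and secondary edges stay distinguishable.
    """
    sentence = ET.fromstring(xml_text.strip())
    graph = sentence.find("graph")
    words = {t.get("id"): t.get("word") for t in graph.iter("t")}
    edges = []
    for node in list(graph.iter("t")) + list(graph.iter("nt")):
        for child in node:
            if child.tag in ("edge", "secedge"):
                edges.append((node.get("id"), child.get("idref"),
                              child.get("label"), child.tag))
    return words, edges


words, edges = sentence_edges(TIGER_SENTENCE)
for edge in edges:
    print(edge)
```

In this toy sentence the primary and the secondary edge connect the same pair of nodes, which a plain directed graph could not represent without losing one of them.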

Installation

git clone https://github.com/arne-cl/discoursegraphs.git
cd discoursegraphs
python setup.py install # prepend 'sudo' if needed

Requirements

If you’d like to visualize your graphs, you will also need:

License

3-Clause BSD.

Author

Arne Neumann

People who downloaded this also like

  • SaltNPepper (a converter framework for various linguistic data formats)

News

0.1

Release date: 24-Apr-2014

  • first public release

  • imports: RS3, TigerXML and an ad-hoc format for expletive annotation

  • merges these formats/files into a single multidigraph

  • generates a simple dot/Graphviz-based visualization
