A set of utilities for interacting with Penn-Treebank .mrg-formatted parses and identifying syntactic heads

Project description

mrg_utils.py
Created by Robert Elwell
University of Texas at Austin, Department of Linguistics
http://comp.ling.utexas.edu/relwell

Licensed under GPL

This is a set of Python classes for processing Penn Treebank-style combined parses, also known as the .mrg format in Penn Treebank (PTB) Release 2. The files should be fairly self-explanatory.
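For readers who have not seen the .mrg format, here is a minimal sketch of how a bracketed parse can be read into a nested structure. It does not use this package's classes: the function name parse_mrg and the (label, children) tuple layout are assumptions made for illustration only.

```python
import re


def parse_mrg(text):
    """Parse one bracketed parse into nested (label, children) tuples.

    Hypothetical helper, not part of mrg_utils. Leaves are plain strings.
    """
    # Tokens are "(", ")", or any run of characters without spaces or parens.
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    pos = 0

    def read():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok != "(":
            return tok  # a word (leaf)
        # PTB wraps each sentence in an unlabeled outer bracket: "( (S ...) )".
        label = ""
        if tokens[pos] not in ("(", ")"):
            label = tokens[pos]
            pos += 1
        children = []
        while tokens[pos] != ")":
            children.append(read())
        pos += 1  # consume the closing ")"
        return (label, children)

    return read()


tree = parse_mrg("( (S (NP (DT The) (NN cat)) (VP (VBD sat))) )")
sentence_node = tree[1][0]  # the S node inside the unlabeled wrapper
```

The sketch assumes well-formed input with one sentence per string. Real .mrg files hold many sentences and include traces and function tags, which the classes in this package are meant to handle.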

The canonical entry point is mrg_utils.py, but mrg_document.py and node.py may be more informative for someone starting out.

This could save you up to a month of writing and debugging, and was designed to be scalable.

You can use this to extract features, easily run statistics, and navigate syntactic trees.
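As a rough idea of what navigating a tree and running simple statistics can look like, here is a small standalone sketch. It works on a hand-built tree in a hypothetical (label, children) tuple form. The names walk and leaves are illustrative assumptions, not this package's API.

```python
from collections import Counter

# Hand-built tree in a hypothetical (label, children) form; leaves are strings.
tree = ("S", [
    ("NP", [("DT", ["The"]), ("NN", ["cat"])]),
    ("VP", [("VBD", ["sat"])]),
])


def walk(node):
    """Yield every labeled node, parent before children."""
    yield node
    for child in node[1]:
        if isinstance(child, tuple):
            yield from walk(child)


def leaves(node):
    """Return the words under a node, left to right."""
    words = []
    for child in node[1]:
        if isinstance(child, tuple):
            words.extend(leaves(child))
        else:
            words.append(child)
    return words


# A simple statistic: how often each constituent or POS label occurs.
label_counts = Counter(label for label, _ in walk(tree))
sentence = leaves(tree)
```

The same traversal pattern can feed feature extraction, for example by collecting the label of each node together with the words it spans.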

This code is built from an API originally designed to interface with Stanford Parser-style dependency parse outputs (de Marneffe et al., 2006), Penn Discourse Treebank data, and more. Code or guidance will be furnished upon request; email me at robert.elwell@gmail.com.

Good luck, and enjoy.
