A set of utilities for interacting with Penn-Treebank .mrg-formatted parses and identifying syntactic heads

Created by Robert Elwell, University of Texas at Austin, Department of Linguistics

Licensed under GPL

This is a set of Python classes for processing Penn Treebank-style combined parses, also known as the .mrg format in PTB release 2. The files should be fairly self-explanatory.

Canonical node is, but and may be more informative for someone starting out.

This could save you up to a month of writing and debugging, and was designed to be scalable.

You can use this to extract features, easily run statistics, and navigate syntactic trees.
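To give a sense of what navigating a .mrg parse involves, here is a minimal, self-contained sketch in plain Python. It does not use the classes in this package; the names `tokenize`, `parse`, `leaves`, and `label_counts` are hypothetical helpers written only for this illustration, and the sample sentence is made up.

```python
from collections import Counter


def tokenize(text):
    """Split a bracketed parse into '(', ')', and symbol tokens."""
    return text.replace("(", " ( ").replace(")", " ) ").split()


def parse(tokens):
    """Parse one bracketed constituent into a (label, children) tuple.

    Leaves are plain strings. `tokens` is consumed from the front.
    """
    assert tokens.pop(0) == "("
    # .mrg files wrap each sentence in an extra, unlabeled bracket pair,
    # so a constituent may start with another "(" instead of a label.
    label = "" if tokens[0] in ("(", ")") else tokens.pop(0)
    children = []
    while tokens[0] != ")":
        if tokens[0] == "(":
            children.append(parse(tokens))
        else:
            children.append(tokens.pop(0))
    tokens.pop(0)  # consume the closing ")"
    return (label, children)


def leaves(tree):
    """Yield the words of the tree from left to right."""
    _, children = tree
    for child in children:
        if isinstance(child, str):
            yield child
        else:
            yield from leaves(child)


def label_counts(tree, counts=None):
    """Count how often each non-empty node label occurs in the tree."""
    counts = Counter() if counts is None else counts
    label, children = tree
    if label:
        counts[label] += 1
    for child in children:
        if not isinstance(child, str):
            label_counts(child, counts)
    return counts


# Hypothetical sample in .mrg style, not taken from the treebank.
SAMPLE = "( (S (NP (DT The) (NN cat)) (VP (VBD sat)) (. .)) )"

tree = parse(tokenize(SAMPLE))
words = list(leaves(tree))
counts = label_counts(tree)
print(words)
print(counts)
```

Representing each node as a `(label, children)` tuple keeps the sketch short. A class-based node, as this package provides, is the more natural choice once you need parent links or head information.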

This code is built from an API originally designed to interface with Stanford Parser-style dependency parse outputs (de Marneffe et al., 2006), Penn Discourse Treebank data, and more. Code or guidance will be furnished upon request by emailing me at

Good luck, and enjoy.

Download files

Source Distribution

mrg_utils-0.0.1.tar.gz (8.9 kB)
