Adapts amrlib in the Zensols framework.

AMR Annotation and Feature Generation

PyPI Python 3.7 Python 3.8 Python 3.9 Build Status

Provides support for AMR annotations and feature generation.

Features:

  • Annotation in AMR metadata. For example, sentence types found in the Proxy report AMR corpus.
  • AMR token alignment as spaCy components.
  • A scoring API that includes Smatch and WLK, which extends a more general NLP scoring module.
  • AMR parsing (amrlib) and AMR co-reference (amr_coref).
  • Command line and API utilities for AMR Penman graphs, debugging, and files.
  • Tools for training and evaluating AMR parse models.

Documentation

Obtaining

The easiest way to install the command line program is via the pip installer:

pip3 install zensols.amr

Binaries are also available on PyPI.
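To confirm that the install succeeded, the standard library can report the installed version. This is a minimal sketch, assuming Python 3.8 or later (for importlib.metadata); installed_version is a hypothetical helper name and is not part of zensols.amr.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent.

    Hypothetical helper for illustration; not part of zensols.amr.
    """
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == '__main__':
    version = installed_version('zensols.amr')
    if version is None:
        print('zensols.amr is not installed; try: pip3 install zensols.amr')
    else:
        print(f'zensols.amr {version} is installed')
```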

Usage

from penman.graph import Graph
from zensols.nlp import FeatureDocument, FeatureDocumentParser
from zensols.amr import AmrDocument, AmrSentence, ApplicationFactory

sent: str = """

He was George Washington and first president of the United States.
He was born on February 22, 1732.

""".replace('\n', ' ').strip()
# get the AMR document parser
doc_parser: FeatureDocumentParser = ApplicationFactory.get_doc_parser()
# the parser creates an NLP-centric feature document as provided in the
# zensols.nlp package
doc: FeatureDocument = doc_parser(sent)
# the AMR object graph data structure is provided in the feature document
amr_doc: AmrDocument = doc.amr
# dump a human readable output of the AMR document
amr_doc.write()
# get the first AMR sentence instance
amr_sent: AmrSentence = amr_doc.sents[0]
print('sentence:')
print(' ', amr_sent.text)
print('tuples:')
# show the Penman graph representation
pgraph: Graph = amr_sent.graph
print(f'variables: {", ".join(pgraph.variables())}')
for t in pgraph.triples:
    print(' ', t)
print('edges:')
for e in pgraph.edges():
    print(' ', e)
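
The loop above prints triples and edges from the Penman graph. To make the difference between the two concrete without running a parser, here is a sketch that uses plain Python tuples in the same (source, role, target) shape. The graph is hand-built for "The boy doesn't want to go" and is not parser output; the helper functions only approximate what the penman Graph methods return.

```python
from typing import List, Set, Tuple

Triple = Tuple[str, str, str]

# hand-built (source, role, target) triples; illustrative data, not parser output
TRIPLES: List[Triple] = [
    ('w', ':instance', 'want-01'),
    ('w', ':polarity', '-'),
    ('w', ':ARG0', 'b'),
    ('b', ':instance', 'boy'),
    ('w', ':ARG1', 'g'),
    ('g', ':instance', 'go-02'),
    ('g', ':ARG0', 'b'),
]


def variables(triples: List[Triple]) -> Set[str]:
    """Every triple source is a node variable."""
    return {src for src, _, _ in triples}


def edges(triples: List[Triple]) -> List[Triple]:
    """Relations whose target is another node variable."""
    vs = variables(triples)
    return [t for t in triples if t[1] != ':instance' and t[2] in vs]


def attributes(triples: List[Triple]) -> List[Triple]:
    """Relations whose target is a constant rather than a node."""
    vs = variables(triples)
    return [t for t in triples if t[1] != ':instance' and t[2] not in vs]


if __name__ == '__main__':
    print('variables:', ', '.join(sorted(variables(TRIPLES))))
    for e in edges(TRIPLES):
        print('edge:', e)
    for a in attributes(TRIPLES):
        print('attribute:', a)
```

Edges are therefore a subset of the triples: the instance triples name each node's concept, and attribute triples point at constants such as the polarity marker.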

Per the example, the t5.conf and gsii.conf configuration files show how to include the configuration needed for each AMR model. These files can also be used directly with the amr command using the --config option.

However, the other resources in the example must be imported unless you redefine them yourself.

Library

When adding the amr spaCy pipeline component, the doc._.amr attribute is set on the Doc instance. You can either configure spaCy yourself or use the configuration files in test-resources as an example of using the zensols.util configuration framework. The command line application provides an example of how to do this, along with the test case.

Command Line

This library is written mostly to be used by other programs, but the command line utility amr is also available to demonstrate its usage and to generate AMR graphs on the command line.

To parse:

$ amr parse -c test-resources/t5.conf 'This is a test of the AMR command line utility.'
# ::snt This is a test of the AMR command line utility.
(t / test-01
   :ARG1 (u / utility
            :mod (c / command-line)
            :name (n / name
                     :op1 "AMR"
                     :toki1 "6")
            :toki1 "9")
   :domain (t2 / this
               :toki1 "0")
   :toki1 "3")
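
The :toki1 attributes in this output appear to carry the index of the token aligned to each node, although this page does not spell that out. The sketch below reads them back with a small regular-expression scanner over the Penman text. The token list is an assumption (whitespace tokens with the final period split off), token_alignments is a hypothetical helper, and the scanner does not handle parentheses inside quoted strings.

```python
import re
from typing import Dict, List

# the graph printed by the parse command above
PENMAN = '''
(t / test-01
   :ARG1 (u / utility
            :mod (c / command-line)
            :name (n / name
                     :op1 "AMR"
                     :toki1 "6")
            :toki1 "9")
   :domain (t2 / this
               :toki1 "0")
   :toki1 "3")
'''

# assumed tokenization of the input sentence (not shown in the output)
TOKENS: List[str] = ['This', 'is', 'a', 'test', 'of', 'the', 'AMR',
                     'command', 'line', 'utility', '.']

# matches a node opening "(var /", a closing paren, or a :toki1 attribute
TOKEN_RE = re.compile(
    r'\(\s*(?P<var>[^\s/()]+)\s*/|(?P<close>\))|:toki1\s+"(?P<idx>\d+)"')


def token_alignments(penman_str: str) -> Dict[str, int]:
    """Map each node variable to the token index in its :toki1 attribute."""
    stack: List[str] = []       # variables of the currently open nodes
    aligned: Dict[str, int] = {}
    for m in TOKEN_RE.finditer(penman_str):
        if m.group('var') is not None:
            stack.append(m.group('var'))
        elif m.group('close') is not None:
            stack.pop()
        else:
            # the attribute belongs to the innermost open node
            aligned[stack[-1]] = int(m.group('idx'))
    return aligned


if __name__ == '__main__':
    for var, idx in sorted(token_alignments(PENMAN).items()):
        print(f'{var} -> token {idx}: {TOKENS[idx]}')
```

Under that tokenization the indices line up with the words one would expect: the test-01 node with "test", the name node with "AMR", and so on. Node c has no :toki1 attribute in the output, so it is absent from the mapping.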

To generate graphs in PDF format:

$ amr plot -c test-resources/t5.conf 'This is a test of the AMR command line utility.'
wrote: amr-graph/this-is-a-test-of-the-amr-comm.pdf

Performance (Smatch)

This repo is configured to download and train on the AMR bio-medical corpus. The scores using amrlib's default Smatch scoring are:

Corpus  Model            Precision           Recall              F-score
bio     amrlib t5        0.5613647022821542  0.4799029769470724  0.5174473330001563
bio     amrlib t5 + bio  0.6792187759112143  0.6164372669678633  0.6463069704295633
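
The F-score column is the harmonic mean of precision and recall, F = 2PR / (P + R). The short sketch below recomputes it from the two rows above. f_score is a hypothetical helper name, and the comparison uses a tolerance because Smatch derives its scores from match counts, so the last digits may differ.

```python
import math


def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (model, precision, recall, reported F-score) copied from the table above
ROWS = [
    ('amrlib t5', 0.5613647022821542, 0.4799029769470724, 0.5174473330001563),
    ('amrlib t5 + bio', 0.6792187759112143, 0.6164372669678633, 0.6463069704295633),
]

if __name__ == '__main__':
    for model, p, r, reported in ROWS:
        computed = f_score(p, r)
        agrees = math.isclose(computed, reported, rel_tol=1e-6)
        print(f'{model}: computed={computed:.6f} agrees={agrees}')
```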

Attribution

This project, or reference model code, uses:

  • Python 3
  • spaCy for natural language parsing.
  • zensols.nlparse for natural language features.
  • amrlib for AMR parsing.
  • amr_coref for AMR co-reference.
  • Smatch (Cai and Knight, 2013) and WLK (Opitz et al., 2021) for scoring.

Citation

If you use this project in your research please use the following BibTeX entry:

@article{Landes_DiEugenio_Caragea_2021,
  title={DeepZensols: Deep Natural Language Processing Framework},
  url={http://arxiv.org/abs/2109.03383},
  note={arXiv: 2109.03383},
  journal={arXiv:2109.03383 [cs]},
  author={Landes, Paul and Di Eugenio, Barbara and Caragea, Cornelia},
  year={2021},
  month={Sep}
}

Changelog

An extensive changelog is available here.

License

MIT License

Copyright (c) 2021 - 2023 Paul Landes
