Adapts amrlib in the Zensols framework.
AMR Annotation and Feature Generation
Provides support for AMR annotations and feature generation.
Features:
- Annotation in AMR metadata. For example, sentence types found in the Proxy report AMR corpus.
- AMR token alignment as spaCy components.
- A scoring API that includes Smatch and WLK, which extends a more general NLP scoring module.
- AMR parsing (amrlib) and AMR co-reference (amr_coref).
- Command line and API utilities for AMR Penman graphs, debugging and files.
- Tools for training and evaluating AMR parse models.
Documentation
Obtaining
The easiest way to install the command line program is via the pip
installer:
pip3 install zensols.amr
Binaries are also available on PyPI.
Usage
from penman.graph import Graph
from zensols.nlp import FeatureDocument, FeatureDocumentParser
from zensols.amr import AmrDocument, AmrSentence, ApplicationFactory
sent: str = """
He was George Washington and first president of the United States.
He was born on February 22, 1732.
""".replace('\n', ' ').strip()
# get the AMR document parser
doc_parser: FeatureDocumentParser = ApplicationFactory.get_doc_parser()
# the parser creates a NLP centric feature document as provided in the
# zensols.nlp package
doc: FeatureDocument = doc_parser(sent)
# the AMR object graph data structure is provided in the feature document
amr_doc: AmrDocument = doc.amr
# dump a human readable output of the AMR document
amr_doc.write()
# get the first AMR sentence instance
amr_sent: AmrSentence = amr_doc.sents[0]
print('sentence:')
print(' ', amr_sent.text)
print('tuples:')
# show the Penman graph representation
pgraph: Graph = amr_sent.graph
print(f'variables: {", ".join(pgraph.variables())}')
for t in pgraph.triples:
    print(' ', t)
print('edges:')
for e in pgraph.edges():
    print(' ', e)
Per the example, the t5.conf and gsii.conf configuration files show how to
include the configuration needed for each AMR model. These files can also be
used directly with the amr command using the --config option. However, the
other resources in the example must be imported unless you redefine them
yourself.
Library
When adding the amr spaCy pipeline component, the doc._.amr attribute is set
on the Doc instance. You can either configure spaCy yourself, or you can use
the configuration files in test-resources as an example using the zensols.util
configuration framework. The command line application provides an example of
how to do this, along with the test case.
Command Line
This library is written mostly to be used by other programs, but the command
line utility amr is also available to demonstrate its usage and to generate
AMR graphs on the command line.
To parse:
$ amr parse -c test-resources/t5.conf 'This is a test of the AMR command line utility.'
# ::snt This is a test of the AMR command line utility.
(t / test-01
   :ARG1 (u / utility
            :mod (c / command-line)
            :name (n / name
                     :op1 "AMR"
                     :toki1 "6")
            :toki1 "9")
   :domain (t2 / this
              :toki1 "0")
   :toki1 "3")
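In the output above, the :toki1 attributes appear to hold zero-based token indexes into the input sentence. This is an assumption drawn from this one example. The sketch below collects those indexes per AMR variable using only the standard library. The token_alignments helper and the regular-expression tokenizer are hypothetical stand-ins for illustration. They are not part of zensols.amr, and the tokenizer only approximates spaCy's.

```python
import re
from typing import Dict, List

# the graph printed by the parse command above
AMR_TEXT = '''
(t / test-01
   :ARG1 (u / utility
            :mod (c / command-line)
            :name (n / name
                     :op1 "AMR"
                     :toki1 "6")
            :toki1 "9")
   :domain (t2 / this
              :toki1 "0")
   :toki1 "3")
'''
SENT = 'This is a test of the AMR command line utility.'

# parentheses, quoted strings, or any other run of non-space characters
TOKEN_RE = re.compile(r'\(|\)|"[^"]*"|[^\s()"]+')


def token_alignments(amr: str, role: str = ':toki1') -> Dict[str, int]:
    """Map each AMR variable to the token index given by ``role``."""
    toks: List[str] = TOKEN_RE.findall(amr)
    stack: List[str] = []       # variables of the currently open nodes
    aligns: Dict[str, int] = {}
    i = 0
    while i < len(toks):
        tok = toks[i]
        if tok == '(':
            stack.append(toks[i + 1])   # the variable follows '('
            i += 1
        elif tok == ')':
            stack.pop()
        elif tok == role:
            # the role's value belongs to the innermost open node
            aligns[stack[-1]] = int(toks[i + 1].strip('"'))
            i += 1
        i += 1
    return aligns


# naive tokenizer (assumption): words and single punctuation marks
tokens = re.findall(r'\w+|[^\w\s]', SENT)
for var, idx in token_alignments(AMR_TEXT).items():
    print(f'{var} -> token {idx}: {tokens[idx]}')
```

For real use, a Penman-aware library such as penman is a better choice than a hand-rolled scanner; the stack-based walk here only keeps the example dependency free.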
To generate graphs in PDF format:
$ amr plot -c test-resources/t5.conf 'This is a test of the AMR command line utility.'
wrote: amr-graph/this-is-a-test-of-the-amr-comm.pdf
Performance (Smatch)
This repo is configured to download and train on the AMR bio-medical corpus. The Smatch scores, computed with amrlib's default scorer, are:
Corpus | Model | Precision | Recall | F-score |
---|---|---|---|---|
bio | amrlib t5 | 0.5613647022821542 | 0.4799029769470724 | 0.5174473330001563 |
bio | amrlib t5 + bio | 0.6792187759112143 | 0.6164372669678633 | 0.6463069704295633 |
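The F-score column is consistent with the usual harmonic mean of precision and recall, F = 2PR / (P + R). The sketch below recomputes it from the table rows as a sanity check. The f_score helper is hypothetical and written for illustration; it is not part of this library or of Smatch.

```python
from typing import Dict, Tuple


def f_score(precision: float, recall: float) -> float:
    """Return the harmonic mean (F1) of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F-score) copied from the table above
ROWS: Dict[str, Tuple[float, float, float]] = {
    'amrlib t5': (0.5613647022821542, 0.4799029769470724,
                  0.5174473330001563),
    'amrlib t5 + bio': (0.6792187759112143, 0.6164372669678633,
                        0.6463069704295633),
}

for model, (p, r, reported) in ROWS.items():
    print(f'{model}: computed={f_score(p, r):.4f}, reported={reported:.4f}')
```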
Attribution
This project, or reference model code, uses:
- Python 3
- spaCy for natural language parsing.
- zensols.nlparse for natural language features.
- amrlib for AMR parsing.
- amr_coref for AMR co-reference.
- Smatch (Cai and Knight, 2013) and WLK (Opitz et al., 2021) for scoring.
Citation
If you use this project in your research please use the following BibTeX entry:
@article{Landes_DiEugenio_Caragea_2021,
title={DeepZensols: Deep Natural Language Processing Framework},
url={http://arxiv.org/abs/2109.03383},
note={arXiv: 2109.03383},
journal={arXiv:2109.03383 [cs]},
author={Landes, Paul and Di Eugenio, Barbara and Caragea, Cornelia},
year={2021},
month={Sep}
}
Changelog
An extensive changelog is available here.
License
Copyright (c) 2021 - 2023 Paul Landes