Unofficial Package of AMR Parsing as Sequence-to-Graph Transduction
AMR Parsing as Sequence-to-Graph Transduction
This is an unofficial package for the following research paper, for research experiments only.
Code for the AMR Parser in our ACL 2019 paper "AMR Parsing as Sequence-to-Graph Transduction".
If you find our code useful, please cite:
@inproceedings{zhang-etal-2018-stog,
    title = "{AMR Parsing as Sequence-to-Graph Transduction}",
    author = "Zhang, Sheng and
      Ma, Xutai and
      Duh, Kevin and
      Van Durme, Benjamin",
    booktitle = "Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)",
    month = jul,
    year = "2019",
    address = "Florence, Italy",
    publisher = "Association for Computational Linguistics"
}
1. Environment Setup
The code has been tested on Python 3.6 and PyTorch 0.4.1. All other dependencies are listed in requirements.txt.
Via conda:
conda create -n stog python=3.6
source activate stog
pip install -r requirements.txt
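Once the environment is active, a quick sanity check can confirm that the interpreter and PyTorch match the tested versions. The following is only a minimal sketch: the helper name `matches_tested_versions` is hypothetical and not part of this package, and it assumes that `torch.__version__` either equals `0.4.1` or carries a dotted build suffix after it.

```python
import sys

# Versions the README says the code was tested with.
TESTED_PYTHON = (3, 6)
TESTED_TORCH = "0.4.1"


def matches_tested_versions(python_version, torch_version):
    """Return True if both versions match the tested configuration.

    python_version: a (major, minor, ...) sequence such as sys.version_info
    torch_version:  a version string such as torch.__version__
    """
    python_ok = tuple(python_version[:2]) == TESTED_PYTHON
    # Assumption: a build suffix, if present, follows a dot ("0.4.1.post2").
    torch_ok = (torch_version == TESTED_TORCH
                or torch_version.startswith(TESTED_TORCH + "."))
    return python_ok and torch_ok


if __name__ == "__main__":
    try:
        import torch  # only importable after installing requirements.txt
        installed_torch = torch.__version__
    except ImportError:
        installed_torch = ""
    print("matches tested versions:",
          matches_tested_versions(sys.version_info, installed_torch))
```

A mismatch does not necessarily mean the code will fail; it only means the setup differs from the one the authors tested.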
2. Data Preparation
Download Artifacts:
./scripts/download_artifacts.sh
Assuming that you're working on AMR 2.0 (LDC2017T10), unzip the corpus to data/AMR/LDC2017T10, and make sure it has the following structure:
(stog)$ tree data/AMR/LDC2017T10 -L 2
data/AMR/LDC2017T10
├── data
│ ├── alignments
│ ├── amrs
│ └── frames
├── docs
│ ├── AMR-alignment-format.txt
│ ├── amr-guidelines-v1.2.pdf
│ ├── file.tbl
│ ├── frameset.dtd
│ ├── PropBank-unification-notes.txt
│ └── README.txt
└── index.html
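The layout above can also be checked programmatically before running the preparation script. This is a small sketch rather than part of the package: `missing_entries` is a hypothetical helper, and it only checks the top-level entries from the listing, not the individual files under docs.

```python
from pathlib import Path

# Entries taken from the directory listing above, relative to the corpus root.
EXPECTED_ENTRIES = [
    "data/alignments",
    "data/amrs",
    "data/frames",
    "docs",
    "index.html",
]


def missing_entries(corpus_root):
    """Return the expected entries that do not exist under corpus_root."""
    root = Path(corpus_root)
    return [entry for entry in EXPECTED_ENTRIES if not (root / entry).exists()]


if __name__ == "__main__":
    missing = missing_entries("data/AMR/LDC2017T10")
    if missing:
        print("Missing from corpus directory:", ", ".join(missing))
    else:
        print("Corpus layout looks as expected.")
```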
Prepare training/dev/test data:
./scripts/prepare_data.sh -v 2 -p data/AMR/LDC2017T10
3. Feature Annotation
We use Stanford CoreNLP (version 3.9.2) for lemmatizing, POS tagging, etc.
First, start a CoreNLP server following the API documentation.
Then, annotate AMR sentences:
./scripts/annotate_features.sh data/AMR/amr_2.0
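Because the annotation step depends on a running CoreNLP server, it can help to confirm that something is listening before starting a long job. The sketch below assumes the server runs on localhost at port 9000, which is the CoreNLP server's documented default. Whether annotate_features.sh expects that address is an assumption, so check the script if the connection fails. `server_is_listening` is a hypothetical helper name.

```python
import socket


def server_is_listening(host="localhost", port=9000, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds.

    The default host and port are assumptions (CoreNLP's default port);
    this only checks that the port accepts connections, not that the
    process behind it is actually CoreNLP.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hosts.
        return False


if __name__ == "__main__":
    print("server reachable:", server_is_listening())
```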
4. Data Preprocessing
./scripts/preprocess_2.0.sh
5. Training
Make sure that you have at least two GeForce GTX TITAN X GPUs to train the full model.
python -u -m stog.commands.train params/stog_amr_2.0.yaml
6. Prediction
python -u -m stog.commands.predict \
    --archive-file ckpt-amr-2.0 \
    --weights-file ckpt-amr-2.0/best.th \
    --input-file data/AMR/amr_2.0/test.txt.features.preproc \
    --batch-size 32 \
    --use-dataset-reader \
    --cuda-device 0 \
    --output-file test.pred.txt \
    --silent \
    --beam-size 5 \
    --predictor STOG
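When running prediction for several checkpoints or input files, it may be convenient to assemble the command above from Python. The following sketch only rebuilds the same argument list with the flags shown in this README. `build_predict_command` is a hypothetical helper, and it assumes that the weights file is named best.th inside the archive directory, as in the command above.

```python
import sys


def build_predict_command(archive_dir, input_file, output_file,
                          batch_size=32, beam_size=5, cuda_device=0):
    """Return the argument list for the prediction command in this README."""
    return [
        sys.executable, "-u", "-m", "stog.commands.predict",
        "--archive-file", archive_dir,
        # Assumption: the best weights are stored as best.th in archive_dir.
        "--weights-file", archive_dir + "/best.th",
        "--input-file", input_file,
        "--batch-size", str(batch_size),
        "--use-dataset-reader",
        "--cuda-device", str(cuda_device),
        "--output-file", output_file,
        "--silent",
        "--beam-size", str(beam_size),
        "--predictor", "STOG",
    ]


if __name__ == "__main__":
    command = build_predict_command(
        "ckpt-amr-2.0",
        "data/AMR/amr_2.0/test.txt.features.preproc",
        "test.pred.txt",
    )
    # Print the command instead of running it, so it can be inspected first.
    print(" ".join(command))
```

The resulting list can be passed to `subprocess.run(command, check=True)` from the repository root once the checkpoint and preprocessed data are in place.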
7. Data Postprocessing
./scripts/postprocess_2.0.sh test.pred.txt
8. Evaluation
Note that the evaluation tool works on python2, so please make sure python2 is visible in your $PATH.
./scripts/compute_smatch.sh test.pred.txt data/AMR/amr_2.0/test.txt
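To check the interpreter requirement before launching the evaluation script, the standard library can look up an executable on `$PATH`. This is a minimal sketch. `find_on_path` is a hypothetical helper, and the executable name `python2` is taken from the note above.

```python
import shutil


def find_on_path(executable="python2"):
    """Return the full path of executable if it is on $PATH, else None."""
    return shutil.which(executable)


if __name__ == "__main__":
    location = find_on_path()
    if location is None:
        print("python2 was not found on $PATH; the evaluation script may fail.")
    else:
        print("python2 found at", location)
```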
Pre-trained Models
Here are pre-trained models: ckpt-amr-2.0.tar.gz and ckpt-amr-1.0.tar.gz. To use them for prediction, simply download and unzip them, and then run Steps 6-8.
If you only need the pre-trained model's predictions (i.e., test.pred.txt), you can find them in the download.
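Unpacking a downloaded checkpoint can be done with `tar` or, as sketched below, with Python's `tarfile` module. `extract_checkpoint` is a hypothetical helper. It is assumed, not confirmed, that the archive unpacks into a directory such as ckpt-amr-2.0 that matches the `--archive-file` argument in Step 6, so check the returned names after extraction.

```python
import tarfile
from pathlib import Path


def extract_checkpoint(archive_path, dest_dir="."):
    """Extract a .tar.gz archive into dest_dir and return its top-level names."""
    dest = Path(dest_dir).resolve()
    with tarfile.open(str(archive_path), "r:gz") as archive:
        members = archive.getmembers()
        for member in members:
            # Refuse entries whose path would resolve outside dest_dir.
            target = (dest / member.name).resolve()
            if target != dest and dest not in target.parents:
                raise ValueError("unsafe path in archive: " + member.name)
        archive.extractall(str(dest))
    top_level = set()
    for member in members:
        parts = Path(member.name).parts
        if parts:
            top_level.add(parts[0])
    return sorted(top_level)


if __name__ == "__main__":
    archive_file = Path("ckpt-amr-2.0.tar.gz")
    if archive_file.is_file():
        print("extracted:", extract_checkpoint(archive_file))
    else:
        print("archive not found; download it first.")
```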
Acknowledgements
We adopted some modules or code snippets from AllenNLP, OpenNMT-py and NeuroNLP2. Thanks to these open-source projects!
License