Package for the paper "Syntax Aware Natural Language Inference" @<link>
SynNLI
Description
- this repo uses AllenNLP as its base
AllenNLP
- a quick guide of mine can be found in the same folder
- for more insight, please see the AllenNLP documentation and GitHub repository
Custom Classes and Operations
GraphPair2VecEncoder
- 'gen', 'gmn'
Graph2GraphEncoder
- known as graph convolution layer in pytorch_geometric
GraphPair2GraphPairEncoder
- for graph matching in sparse batch
- tf.dynamic_partition + normal attention
NodeUpdater
- a wrapper over RNNs
Graph2VecEncoder
- known as global pooling layer in pytorch_geometric
- 'global_attention'
SynNLIModel (base=Model)
- use Embedder to embed input
- use GraphPair2VecEncoder to get compare vector for classifier to make final decision
tensor_op.py
- batch conversion between normal model and graph model
  - sparse2dense
  - dense2sparse
SparseAdjacencyField
- cooperate with pytorch_geometric to get sparse graph batch
- see batch_tensors() and as_tensor() for the key of implementation
NLIGraphReader
- read graph input (parsed by Stanza)
preprocess.py
- see the Preprocess section for detail
configs
- can be found in src/training
- for allennlp train
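The batch conversion listed under tensor_op.py can be illustrated with a minimal sketch. It assumes the usual convention for sparse graph batches: all nodes of all graphs are concatenated into one flat list, and a parallel batch index says which graph each node belongs to. The function names mirror the ones listed above, but the signatures and the plain-list implementation are assumptions made for illustration; the repo itself presumably works on tensors.

```python
from typing import List, Tuple

Vector = List[float]


def sparse2dense(
    nodes: List[Vector], batch_index: List[int], pad_value: float = 0.0
) -> Tuple[List[List[Vector]], List[List[bool]]]:
    """Group a flat node list into a padded [batch, max_nodes, dim] list plus a mask."""
    num_graphs = max(batch_index) + 1 if batch_index else 0
    groups: List[List[Vector]] = [[] for _ in range(num_graphs)]
    for vec, graph_id in zip(nodes, batch_index):
        groups[graph_id].append(vec)

    max_nodes = max((len(g) for g in groups), default=0)
    dim = len(nodes[0]) if nodes else 0

    dense, mask = [], []
    for group in groups:
        pad = max_nodes - len(group)
        # Pad every graph to the size of the largest graph in the batch.
        dense.append(group + [[pad_value] * dim for _ in range(pad)])
        mask.append([True] * len(group) + [False] * pad)
    return dense, mask


def dense2sparse(
    dense: List[List[Vector]], mask: List[List[bool]]
) -> Tuple[List[Vector], List[int]]:
    """Drop padded positions and return the flat node list with its batch index."""
    nodes, batch_index = [], []
    for graph_id, (rows, row_mask) in enumerate(zip(dense, mask)):
        for vec, keep in zip(rows, row_mask):
            if keep:
                nodes.append(vec)
                batch_index.append(graph_id)
    return nodes, batch_index


# Two graphs: the first has two nodes, the second has one.
nodes = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
batch_index = [0, 0, 1]
dense, mask = sparse2dense(nodes, batch_index)
restored_nodes, restored_index = dense2sparse(dense, mask)
```

The dense form is what sequence-style layers expect, while the sparse form is what graph layers expect, so a round trip through both functions should return the original node list and batch index.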
Usage (Cur)
- ./install_dependencies.sh
- download an NLI-style dataset to data
  - and specify path in jsonnet
- parse data (see the Parse Data with Stanza section)
  - and specify path in jsonnet
- train model (see the Training section)
  - with jsonnet
Parse Data with Stanza
- Stanza will be loaded in preprocess.py
- the parser version is the one available as of 2020/8/22
- use preprocess.py
python preprocess.py -i <raw_data_path> \
    -o <target_path> \
    --files <file_names> \
    --force \
    -m 10
# --force: force execution even when <target_path> exists
# -m 10: maximum number of instances to process (mainly for testing)
# example
python preprocess.py -i ../data/anli_v1.0/R2/ \
-o ../data/anli_v1.0_preprocessed/R2/ \
--files dev.jsonl test.jsonl train.jsonl \
--force \
-m 10
- if you want to use allennlp for parsing instead (less recommended)
  - download the allennlp dependency parser and SRL labeler from path
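To make the idea of "graph input" concrete, here is a minimal sketch of turning one dependency-parsed sentence into node and edge lists. It assumes each parsed word carries a 1-based id, a lemma, a head index (0 for the root) and a dependency relation, which is how Stanza words are described. The `Word` tuple, the `parse_to_graph` helper and the `[ROOT]` token are hypothetical names for illustration; the output format actually written by preprocess.py is not shown in this README.

```python
from typing import List, NamedTuple, Tuple


class Word(NamedTuple):
    """Hypothetical stand-in for one parsed word (1-based id, head 0 means root)."""
    id: int
    lemma: str
    head: int
    deprel: str


def parse_to_graph(words: List[Word]) -> Tuple[List[str], List[List[int]], List[str]]:
    """Return (node_attr, edge_index, edge_attr) for one sentence.

    Node 0 is an artificial root node, so word ids can be used directly
    as node indices. Each edge points from the head to the dependent.
    """
    node_attr = ["[ROOT]"] + [w.lemma for w in words]
    sources = [w.head for w in words]
    targets = [w.id for w in words]
    edge_attr = [w.deprel for w in words]
    return node_attr, [sources, targets], edge_attr


# "A dog barks" with a hand-written parse (not produced by a parser here).
sentence = [
    Word(1, "a", 2, "det"),
    Word(2, "dog", 3, "nsubj"),
    Word(3, "bark", 0, "root"),
]
node_attr, edge_index, edge_attr = parse_to_graph(sentence)
```

The two-row `edge_index` layout (one row of sources, one row of targets) follows the sparse adjacency convention used by pytorch_geometric, which is why it is chosen here.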
Training
- refer to the jsonnet config passed to the command below
allennlp train "./src_gmn/training_config.jsonnet" -s "./param/testv1" --include-package "package_v1" --force
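When launching several runs from a script, the command above can be assembled programmatically. The sketch below only builds the argument list from the flags already shown (`-s`, `--include-package`, `--force`); the helper name `build_train_command` is hypothetical and not part of this package.

```python
import shlex
from typing import List


def build_train_command(
    config_path: str, serialization_dir: str, package: str, force: bool = False
) -> List[str]:
    """Assemble the argument list for an `allennlp train` run."""
    cmd = [
        "allennlp", "train", config_path,
        "-s", serialization_dir,
        "--include-package", package,
    ]
    if force:
        # Mirrors the --force flag used in the command above.
        cmd.append("--force")
    return cmd


cmd = build_train_command(
    "./src_gmn/training_config.jsonnet", "./param/testv1", "package_v1", force=True
)
# Print the shell-quoted command for inspection before running it.
print(shlex.join(cmd))
```

The list can then be handed to `subprocess.run(cmd, check=True)` in an environment where allennlp is installed.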
Future Supported Usage
- pip install -r requirements
- add a configs folder for the various configs
- note: lemmatized tokens should be used as node attributes when using word-level embeddings (or add char embeddings to ease this)
- map root to a special token
- use MLP projection