A Python package for graph processing

These details have not been verified by PyPI

Project links

Homepage

GitHub Statistics

View statistics for this project via Libraries.io, or by using our public dataset on Google BigQuery

Project description

SMATCH++

The package focuses on handy processing of AMR graphs with a special focus on standardized evaluation of AMR graph parsers with Smatch (structural matching of graphs). A short overview of some features:

Simple AMR reading, AMR writing, different syntactic and semantic AMR standardization options
Alignment solvers including optimal ILP alignment, and optional graph compression
Evaluation scoring with bootstrap confidence intervals, micro and macro averages
AMR-targeted subgraph extraction and extended scoring for spatial, temporal, causal, and more meaning aspects

Jump directly to parser evaluation best practices or (new) pip install to use smatch++ and its options simply from within your python program. The following text also gives an overview over some options of Smatch++.

Requirements

For the most basic version, there shouldn't be a need to install additional modules. However, when using ilp optimal solving and bootstrapping, we require

mip (tested: 1.13.0)
scipy (tested: 1.7.3)
numpy (tested: 1.20.1)

The packages can all be installed with pip ...

Example configurations

Recommended for average case: ILP alignment, dereification, corpus metrics and confidence intervals

Efficiency: +
Optimality: +++
Graph standardization: ++

Simply call:

./score.sh <amrs1> <amrs2>

where <amrs1> and <amrs2> are the paths to the files with graphs. Format is assumed to be in "penman":

# first graph
(x / y
   :rel (w / z))

# second graph
(...

Or can set to tsv with -input_format tsv, where the file looks like:

# first graph
x y nodelabel
w z nodelabel
x w rel

# second graph
...

Hill-climber alignment, dereification, corpus metrics and confidence intervals

Efficiency: ++
Optimality: +
Graph standardization: ++

python -m smatchpp      -a <amrs1> \
			-b <amrs2> \
			-solver hillclimber \
			-edges dereify \
			-score_dimension main \
			-score_type micromacro \
			-log_level 20 \
			--bootstrap \
			--remove_duplicates

Fast ILP with graph compression, corpus metrics and confidence intervals

Efficiency: ++
Optimality: +++
Graph standardization: +

python -m smatchpp      -a <amrs1> \
			-b <amrs2> \
			-solver ilp \
			-edges dereify \
			-score_dimension main \
			-score_type micromacro \
			-log_level 20 \
			--bootstrap \
			--remove_duplicates \
			--lossless_graph_compression

ILP with reification, corpus metrics and confidence intervals

Efficiency: -
Optimality: +++
Graph standardization: +++

python -m smatchpp      -a <amrs1> \
			-b <amrs2> \
			-solver ilp \
			-edges reify \
			-score_dimension main \
			-score_type micromacro \
			-log_level 20 \
			--bootstrap \
			--remove_duplicates \

ILP alignment, corpus sub-aspect metrics and confidence intervals

Efficiency: +
Optimality: +++
Graph standardization: ++

python -m smatchpp      -a <amrs1> \
			-b <amrs2> \
			-solver ilp \
			-edges dereify \
			-score_dimension all-multialign \
			-score_type micromacro \
			-log_level 20 \
			--bootstrap \
			--remove_duplicates \

Other configurations

See

python -m smatchpp --help

Additional functionality

Custom triple matching

Can be implemented in score.py

Changing subgraph metrics

See subgraph_extraction.py

Pip install

Pip installation

Simply run

pip install smatchpp

The main interface is a smatchpp.Smatchpp object. With this, most kinds of operations can be performed on graphs and pairs of graphs. Some examples are in the following,

Example I: Smatch++ matching with some basic default

import smatchpp
measure = smatchpp.Smatchpp()
match, optimization_status, alignment = measure.process_pair("(t / test)", "(t / test)")
print(match) # [2, 2, 2, 2], 2 left->right, 2 in right->left, 2 length of left, 2 length of right

Note: Here it's two triples matching since there is an implicit root.

Example II: Standardize and extract subgraphs

import smatchpp
measure = smatchpp.Smatchpp()
string_graph = "(c / control-01 :arg1 (c2 / computer) :arg2 (m / mouse))"
g = measure.graph_reader.string2graph(string_graph)
g = measure.graph_standardizer.standardize(g)
name_subgraph_dict = measure.subgraph_extractor.all_subgraphs_by_name(g)

# get subgraph for "instrument"
print(name_subgraph_dict["INSTRUMENT"]) # [(c, instance, control-01), (m, instance, mouse), (c, instrument, m)]

Note that the result is the same as when we mention the instrument edge explicitly, i.e., string_graph = "(c / control-01 :arg1 (c2 / computer) :instrument (m / mouse))". Such a semantic standarization can also be performed on a full graph by loading an explicit standardizer (here without subgraph extraction), which explicates core-roles, if possible:

from smatchpp import data_helpers, preprocess
graph_reader = data_helpers.PenmanReader()
graph_writer = data_helpers.PenmanWriter()
graph_standardizer = preprocess.AMRGraphStandardizer(semantic_standardization=True)
string_graph = "(c / control-01 :arg1 (c2 / computer) :arg2 (m / mouse))"
g = graph_reader.string2graph(string_graph)
g = graph_standardizer.standardize(g)
print(g) # [('c', ':instrument', 'm'), ('c', ':instance', 'control-01'), ('c1', ':instance', 'computer'), ('m', ':instance', 'mouse'), ('c', ':arg1', 'c1'), ('c', ':root', 'control-01')]

Example III: Smatch++ matching same as default but with ILP

In this example, we use ILP for optimal alignment.

import smatchpp, smatchpp.solvers
ilp = smatchpp.solvers.ILP()
measure = smatchpp.Smatchpp(alignmentsolver=ilp)
match, optimization_status, alignment = measure.process_pair("(t / test)", "(t / test)")
print(match) # in this case same result as Example I

Example IV: get an alignment

In this example, we retrieve an alignment between graph nodes.

import smatchpp
measure = smatchpp.Smatchpp()
measure.graph_standardizer.relabel_vars = False
s1 = "(x / test)"
s2 = "(y / test)"
g1 = measure.graph_reader.string2graph(s1)
g1 = measure.graph_standardizer.standardize(g1)
g2 = measure.graph_reader.string2graph(s2)
g2 = measure.graph_standardizer.standardize(g2)
g1, g2, v1, v2 = measure.graph_pair_preparer.prepare_get_vars(g1, g2)
alignment, var_index, _ = measure.graph_aligner.align(g1, g2, v1, v2)
var_map = measure.graph_aligner._get_var_map(alignment, var_index)
interpretable_mapping = measure.graph_aligner._interpretable_mapping(var_map, g1, g2)
print(interpretable_mapping) # prints [[('aa_x_test', 'bb_y_test')]], where aa/bb indicates 1st/2nd graph

Note that the alignment is a by-product of the matching and can be also retrieved in simpler ways (here we show the process from scratch).

Example V: Read, standardize and write graph

In this example, we read a basic graph from a string, apply reification standardization, and write the reified graph to a string.

from smatchpp import data_helpers, preprocess
graph_reader = data_helpers.PenmanReader()
graph_writer = data_helpers.PenmanWriter()
graph_standardizer = preprocess.AMRGraphStandardizer(edges="reify")
s = "(t / test :mod (s / small :mod (v / very)) :quant 2 :op v)"
g = graph_reader.string2graph(s)
g = graph_standardizer.standardize(g)
string = graph_writer.graph2string(g)
print(string) # (t / test :op (v / very :arg2-of (ric5 / have-mod-91 :arg1 (s / small :arg2-of (ric3 / have-mod-91 :arg1 t)))) :arg1-of (ric6 / have-quant-91 :arg2 2))

Citation

If you like the project, consider citing

@inproceedings{opitz-2023-smatch,
    title = "{SMATCH}++: Standardized and Extended Evaluation of Semantic Graphs",
    author = "Opitz, Juri",
    booktitle = "Findings of the Association for Computational Linguistics: EACL 2023",
    month = may,
    year = "2023",
    address = "Dubrovnik, Croatia",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2023.findings-eacl.118",
    pages = "1595--1607"
}

Project details

These details have not been verified by PyPI

Project links

Homepage

GitHub Statistics

View statistics for this project via Libraries.io, or by using our public dataset on Google BigQuery

Release history Release notifications | RSS feed

1.5.2

Dec 14, 2023

1.5.1

Dec 5, 2023

1.5.0

Dec 5, 2023

1.4.2

Dec 1, 2023

1.4.1

Dec 1, 2023

1.4.0

Nov 30, 2023

1.3.2

Oct 12, 2023

1.3.1

Oct 9, 2023

1.3.0

Sep 28, 2023

1.2.0

Sep 25, 2023

1.1.3

Aug 21, 2023

1.1.2

Aug 21, 2023

1.1.1

Aug 21, 2023

1.1.0

Aug 16, 2023

1.0.3

Jul 21, 2023

1.0.2

Jul 21, 2023

This version

1.0.1

Jul 21, 2023

1.0.0

Jul 20, 2023

0.1.0

Jun 15, 2023

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

smatchpp-1.0.1.tar.gz (212.2 kB view hashes)

Uploaded Jul 21, 2023 Source

Hashes for smatchpp-1.0.1.tar.gz

Hashes for smatchpp-1.0.1.tar.gz
Algorithm	Hash digest
SHA256	`c31baec8d234e32820f88f1f3c3024a87d7b2c166eba916c9aff15a518fbbd04`
MD5	`fc5aa73e46b7595b0fb81ec2d12c124e`
BLAKE2b-256	`8b68b042cf128e94581b21f9eab4e1905aa94bc255d7b417aa53460203dd3c04`