A Python package for computing Krippendorff's alpha for graphs (modified from https://github.com/grrrr/krippendorff-alpha/blob/master/krippendorff_alpha.py)

This project has been archived.

The maintainers of this project have marked this project as archived. No new releases are expected.

Project description

Krippendorrf-alpha-for-graph

Compute Krippendorff's alpha for graphs, modified from https://github.com/grrrr/krippendorff-alpha/

Changes

  1. Uses NetworkX to instantiate graphs
  2. Adds custom node/edge and graph metrics (see below)
  3. Forces pre-computation of the distance matrix, which is stored as a .npy file, to speed up the computation of:
    • within-units disagreement (Do)
    • within- and between-units expected total disagreement (De)
  4. Not properly tested, but it should work as long as you have a pandas DataFrame with the following shape:
    • a feature column storing the annotated graphs (each a list of tuples, such as [("subject_1", "predicate_1", "object_1"), ("subject_2", "predicate_2", "object_2")])
    • the feature column can also store nodes or edges (tuples of strings)
    • a column indicating the annotator id
    • annotations ordered the same way for every annotator
  5. Note that the distance metric interacts with the networkx graph type used when calling instantiate_networkx_graph(). The following graph types are available:
    • nx.Graph
    • nx.DiGraph
    • nx.MultiGraph
    • nx.MultiDiGraph
  6. Two categories of distance metric are implemented.
    • Lenient metric: node/edge or graph overlap
    • Strict metric: nominal metric, graph edit distance
  7. Depending on how many graphs you have, computing the graph distance matrix can take a long time.
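To make the role of the pre-computed distance matrix concrete, here is a minimal sketch of how alpha = 1 - Do/De can be derived from pairwise distances, following the usual definition of Krippendorff's alpha. The function name alpha_from_distances and its input layout (units as lists of item indices, dist as a nested list) are assumptions made for illustration; this is not the package's own compute_alpha.

```python
def alpha_from_distances(units, dist):
    """Sketch of Krippendorff's alpha from a precomputed distance matrix.

    units: list of units, each a list of item indices (missing values dropped).
    dist:  square nested list, dist[i][j] = distance between items i and j.
    Both the name and the input layout are hypothetical.
    """
    # Units with fewer than two annotations cannot form pairs, so skip them.
    units = [u for u in units if len(u) > 1]
    n = sum(len(u) for u in units)
    if n == 0:
        raise ValueError("no pairable annotations")

    # Observed disagreement Do: pairwise distances within each unit.
    d_o = 0.0
    for u in units:
        within = sum(dist[i][j] for i in u for j in u)
        d_o += within / (len(u) - 1)
    d_o /= n

    # Expected disagreement De: pairwise distances across all pairable items.
    items = [i for u in units for i in u]
    d_e = sum(dist[i][j] for i in items for j in items) / (n * (n - 1))

    return 1.0 - d_o / d_e if (d_o and d_e) else 1.0


# Toy data: four annotations with values A, B, A, B; two units of two each.
values = ["A", "B", "A", "B"]
dist = [[0.0 if a == b else 1.0 for b in values] for a in values]
alpha_disagree = alpha_from_distances([[0, 1], [2, 3]], dist)  # annotators differ in every unit
alpha_agree = alpha_from_distances([[0, 2], [1, 3]], dist)     # annotators agree in every unit
```

Because every distance is looked up rather than recomputed, an expensive metric such as graph edit distance only has to run once per pair of graphs.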

Node/edge Metrics

Lenient metric

  1. Node overlap metric: if two sets of nodes or edges overlap, the distance between these two sets is 0; else 1.
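As a rough illustration of the rule above, the overlap check can be written with plain Python sets. The name node_overlap_distance is hypothetical and this is only a sketch of the described behavior, not the package's node_overlap_metric.

```python
def node_overlap_distance(a, b):
    """Return 0.0 if the two collections share at least one element, else 1.0."""
    # Two empty collections share nothing, so this sketch treats them as distance 1.0.
    return 0.0 if set(a) & set(b) else 1.0


shared = node_overlap_distance(("cat", "mouse"), ("cat", "dog"))    # "cat" is in both
disjoint = node_overlap_distance(("cat", "mouse"), ("dog", "bird")) # nothing in common
```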

Strict metric

  1. Nominal metric: exact match of two sets of nodes or edges; the distance is 0 if the two sets are identical, else 1.
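For contrast with the lenient metric, a sketch of the strict rule is shown here. The name nominal_distance is hypothetical, and comparing the annotations as sets (so that element order does not matter) is an assumption.

```python
def nominal_distance(a, b):
    """Return 0.0 only if both collections contain exactly the same elements."""
    # Assumption: order is irrelevant, so the comparison is done on sets.
    return 0.0 if set(a) == set(b) else 1.0


partial = nominal_distance(("cat", "mouse"), ("cat", "dog"))  # overlaps, but is not identical
exact = nominal_distance(("cat", "mouse"), ("mouse", "cat"))  # same elements
```

A pair that merely overlaps counts as full disagreement here, which is why the strict metric typically yields a lower alpha than the lenient one on the same data.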

Graph Metrics

Lenient metric

  1. Graph overlap metric: if two graphs overlap, the distance between these two graphs is 0; else 1.
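The sketch below shows one possible reading of graph overlap, and why the chosen networkx graph type matters for the result. It assumes that "overlap" means sharing at least one edge and that predicates are ignored; build_graph and graph_overlap_distance are hypothetical helpers, not the package's functions.

```python
import networkx as nx


def build_graph(triples, graph_type=nx.Graph):
    """Build a graph from (subject, predicate, object) triples."""
    g = graph_type()
    for subj, pred, obj in triples:
        g.add_edge(subj, obj, label=pred)
    return g


def graph_overlap_distance(g1, g2):
    """Return 0.0 if the graphs share at least one edge, else 1.0."""
    # has_edge respects direction for directed graph types.
    shared = any(g2.has_edge(u, v) for u, v in g1.edges())
    return 0.0 if shared else 1.0


t1 = [("a", "likes", "b")]
t2 = [("b", "likes", "a")]

# Undirected: a-b and b-a are the same edge, so the graphs overlap.
undirected = graph_overlap_distance(build_graph(t1, nx.Graph), build_graph(t2, nx.Graph))
# Directed: a->b and b->a are different edges, so they do not overlap.
directed = graph_overlap_distance(build_graph(t1, nx.DiGraph), build_graph(t2, nx.DiGraph))
```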

Strict metric

  1. Normalized graph edit distance
    • normalized by computing distance between g1 and g0 and between g2 and g0
    • g0 is an empty graph
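One plausible way to read this normalization is to divide the edit distance between g1 and g2 by the sum of their distances to the empty graph g0. That formula is an assumption, as are the helper names below; the package may combine the terms differently. The sketch uses networkx's graph_edit_distance with its default unit costs.

```python
import networkx as nx


def distance_to_empty(g):
    """Edit distance from g to an empty graph g0 under unit costs."""
    # Reaching g0 means deleting every node and every edge, each at cost 1.
    return g.number_of_nodes() + g.number_of_edges()


def normalized_ged(g1, g2):
    """Assumed normalization: GED(g1, g2) / (GED(g1, g0) + GED(g2, g0))."""
    denom = distance_to_empty(g1) + distance_to_empty(g2)
    if denom == 0:
        return 0.0  # both graphs are empty, hence identical
    # By default, node and edge attributes are not compared; only structure counts.
    return nx.graph_edit_distance(g1, g2) / denom


g1 = nx.path_graph(3)  # 3 nodes, 2 edges
g2 = nx.path_graph(2)  # 2 nodes, 1 edge
score = normalized_ged(g1, g2)  # one node and one edge deleted: 2 / (5 + 3)
same = normalized_ged(g1, g1)
```

Exact graph edit distance is expensive on larger graphs, which is one reason pre-computing and caching the distance matrix pays off.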

Example Usage

Compute distance matrix of graphs
from pathlib import Path

import networkx as nx
import numpy as np

# node_overlap_metric is used below, so it is imported alongside the other metrics
from krippendorrf_graph import (
    compute_alpha, compute_distance_matrix, graph_edit_distance,
    graph_overlap_metric, node_overlap_metric, nominal_metric,
)

# df is a pandas DataFrame with an "annotator" column and a "graph_feature" column.
# Build one list of annotations per annotator, all in the same annotation order.
data = [
    df[df["annotator"]==1].graph_feature.to_list(),
    df[df["annotator"]==2].graph_feature.to_list(),
    df[df["annotator"]==3].graph_feature.to_list(),
    df[df["annotator"]==4].graph_feature.to_list()
]

empty_graph_indicator = "*"  # indicator for missing values
feature_column = "graph_feature"
save_path = "./lenient_distance_matrix.npy"
graph_distance_metric = node_overlap_metric
forced = True  # recompute even if a saved matrix already exists

if not Path(save_path).exists() or forced:
    # The same DataFrame that produced `data` is passed here
    distance_matrix = compute_distance_matrix(df, feature_column=feature_column, graph_distance_metric=graph_distance_metric,
                                              empty_graph_indicator=empty_graph_indicator, save_path=save_path, graph_type=nx.Graph)
else:
    distance_matrix = np.load(save_path)

print("Lenient node metric: %.3f" % compute_alpha(data, distance_matrix=distance_matrix, missing_items=empty_graph_indicator))
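The example above expects a DataFrame that already exists. As a rough sketch of an input with the described shape, the toy data below uses the column names from the example; the triples and the two annotators are made up for illustration.

```python
import pandas as pd

# Each row is one annotation; rows are in the same annotation order for every annotator.
rows = [
    {"annotator": 1, "graph_feature": [("cat", "chases", "mouse")]},
    {"annotator": 1, "graph_feature": [("dog", "barks_at", "cat")]},
    {"annotator": 2, "graph_feature": [("cat", "chases", "mouse")]},
    {"annotator": 2, "graph_feature": "*"},  # "*" marks a missing annotation
]
df = pd.DataFrame(rows)

# One list of annotations per annotator, as in the usage example.
annotators = sorted(df["annotator"].unique())
data = [df[df["annotator"] == a]["graph_feature"].to_list() for a in annotators]
```

Every inner list has the same length, and position i in each list refers to the same annotated item.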

(Please help by contributing a PR; it will be faster than reporting an issue, since the maintainer might be slower than you.)

Download files

Source Distribution

krippendorrf_graph-0.2.0.tar.gz (8.4 kB view details)

Uploaded Source

Built Distribution

krippendorrf_graph-0.2.0-py3-none-any.whl (8.9 kB view details)

Uploaded Python 3

File details

Details for the file krippendorrf_graph-0.2.0.tar.gz.

File metadata

  • Download URL: krippendorrf_graph-0.2.0.tar.gz
  • Upload date:
  • Size: 8.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.9.21

File hashes

Hashes for krippendorrf_graph-0.2.0.tar.gz
Algorithm Hash digest
SHA256 e0d639dd165651c7ba45199704b0e887481e03f8995702ddacd9f904b96c699b
MD5 40aa09745aa3b6afc1266d85d672c505
BLAKE2b-256 7858ac5f7aaddef152ab38bdbd267baa734bd0b4cd027a39d382245389eaa12c

File details

Details for the file krippendorrf_graph-0.2.0-py3-none-any.whl.

File hashes

Hashes for krippendorrf_graph-0.2.0-py3-none-any.whl
Algorithm Hash digest
SHA256 bf5a44819369b0501fbfc2b374d5493ebb642d2606df97f7673e9ef1fcd969e3
MD5 fb0cc81de1ebca3584a10aaa87345ba6
BLAKE2b-256 13e8ea130533b8b3ff23eb5660114f18826f5140fd2f18807a56905cf9c00179
