Skip to main content

Maps nodes and edges of a multi-relational graph to integer

Project description

edgelist-mapper

Build status
Code style Linter Test runner Build tool
Project license

📊 Maps nodes and edges of a multi-relational graph to integer

Synopsis

edgelist-mapper is a simple tool that reads an edge-list file representing a graph and maps each node and relation to integer. The mapping assigned is such that entities and relations that appear more frequently in the graph are mapped to smaller numerical values.

This tool is particularly useful to pre-process some of the publicly available knowledge graph datasets that are often used for the machine learning task of relation prediction.

Input format

The tool takes as input a file (edgelist.tsv) that represents a graph as tab-separated triples of the form (head, relation, tail) and generates three new files, namely mapped_edgelist.tsv, entities_map.tsv, and relations_map.tsv.

san_marino	locatedin	europe
belgium	locatedin	europe
russia	locatedin	europe
monaco	locatedin	europe
croatia	locatedin	europe
poland	locatedin	europe

Example content of the edgelist.tsv file.

0	europe
1	san_marino
2	russia
3	poland
4	monaco
5	croatia
6	belgium

Content of the entities_map.tsv generated from the edgelist.tsv file.

0	locatedin

Content of the relations_map.tsv generated from the edgelist.tsv file.

1	0	0
6	0	0
2	0	0
4	0	0
5	0	0
3	0	0

Content of the mapped_edgelist.tsv generated from the edgelist.tsv file.

CLI Usage

The CLI takes the following positional arguments:

  edgelist    Path of the edgelist file
  output      Path of the output directory

Example usage:

pip install git+https://github.com/simonepri/edgelist-mapper
python -m edgelist_mapper.bin.run \
    edgelist.tsv \
    .

NB: You need Python 3 to run the CLI.

Showcase

This tool has been used to create this collection of datasets.

Authors

See also the list of contributors who participated in this project.

License

This project is licensed under the MIT License - see the license file for details.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

edgelist_mapper-0.1.1.tar.gz (7.6 kB view hashes)

Uploaded Source

Built Distribution

edgelist_mapper-0.1.1-py3-none-any.whl (8.1 kB view hashes)

Uploaded Python 3

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page