edgelist-mapper

📊 Maps nodes and edges of a multi-relational graph to integers

Synopsis

edgelist-mapper is a simple tool that reads an edge-list file representing a graph and maps each node (entity) and each relation to an integer. IDs are assigned by frequency: entities and relations that appear more often in the graph receive smaller integers.
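The frequency-based assignment described above can be sketched in a few lines of Python. This is a minimal illustration, not the tool's actual implementation: the function name `build_maps` is hypothetical, and breaking ties alphabetically is an assumption of this sketch, so the real tool may order equally frequent items differently.

```python
from collections import Counter


def build_maps(triples):
    """Assign integer IDs so that more frequent items get smaller IDs.

    Hypothetical helper for illustration; not part of edgelist-mapper.
    """
    entity_counts = Counter()
    relation_counts = Counter()
    for head, relation, tail in triples:
        # An entity is counted once per appearance, as head or as tail.
        entity_counts[head] += 1
        entity_counts[tail] += 1
        relation_counts[relation] += 1

    def rank(counts):
        # Descending frequency; ties broken alphabetically (assumption).
        ordered = sorted(counts, key=lambda name: (-counts[name], name))
        return {name: idx for idx, name in enumerate(ordered)}

    return rank(entity_counts), rank(relation_counts)


triples = [
    ("belgium", "locatedin", "europe"),
    ("monaco", "locatedin", "europe"),
    ("belgium", "borders", "france"),
]
entities, relations = build_maps(triples)

# Rewrite every triple with the integer IDs.
mapped = [(entities[h], relations[r], entities[t]) for h, r, t in triples]
print(mapped)
```

Here `belgium` and `europe` each appear twice, so they take IDs 0 and 1, while the entities that appear once take the larger IDs.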

This tool is particularly useful for pre-processing publicly available knowledge graph datasets that are commonly used for the machine learning task of relation prediction.

Input format

The tool takes as input a file (edgelist.tsv) that represents a graph as tab-separated triples of the form (head, relation, tail), one triple per line. It generates three new files: mapped_edgelist.tsv, entities_map.tsv, and relations_map.tsv.

san_marino	locatedin	europe
belgium	locatedin	europe
russia	locatedin	europe
monaco	locatedin	europe
croatia	locatedin	europe
poland	locatedin	europe

Example content of the edgelist.tsv file.

0	europe
1	san_marino
2	russia
3	poland
4	monaco
5	croatia
6	belgium

Content of the entities_map.tsv generated from the edgelist.tsv file.

0	locatedin

Content of the relations_map.tsv generated from the edgelist.tsv file.

1	0	0
6	0	0
2	0	0
4	0	0
5	0	0
3	0	0

Content of the mapped_edgelist.tsv generated from the edgelist.tsv file.
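To check that the three output files fit together, the mapped triples can be decoded back into names. The sketch below assumes the layout suggested by the examples above: tab-separated `id`, `name` lines in the map files and (head, relation, tail) IDs in the mapped edge list. The helper `load_map` is hypothetical, and the file contents are inlined as strings (a subset of the example) so the snippet runs on its own.

```python
import csv
import io


def load_map(text):
    """Parse 'id<TAB>name' lines into a {id: name} dictionary."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    return {int(idx): name for idx, name in reader}


# Inlined excerpts of the example files; in practice, read them from disk.
entities_tsv = "0\teurope\n1\tsan_marino\n2\trussia\n"
relations_tsv = "0\tlocatedin\n"
mapped_tsv = "1\t0\t0\n2\t0\t0\n"

entities = load_map(entities_tsv)
relations = load_map(relations_tsv)

# Translate each (head, relation, tail) ID triple back to names.
decoded = [
    (entities[int(h)], relations[int(r)], entities[int(t)])
    for h, r, t in csv.reader(io.StringIO(mapped_tsv), delimiter="\t")
]
print(decoded)
```

The decoded triples match the first and third lines of the example edgelist.tsv, which confirms that each mapped row keeps the original (head, relation, tail) column order.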

CLI Usage

The CLI takes the following positional arguments:

  edgelist    Path of the edgelist file
  output      Path of the output directory

Example usage:

pip install edgelist-mapper
python -m edgelist_mapper.bin.run \
    edgelist.tsv \
    .

NB: You need Python 3 to run the CLI.

Showcase

This tool has been used to create this collection of datasets.

Authors

See also the list of contributors who participated in this project.

License

This project is licensed under the MIT License - see the license file for details.
