Implementation of the HeNHoE-2vec algorithm by Valentini et al. (2021).
HeNHoE-2vec
A Python implementation of the HeNHoE-2vec algorithm by Valentini et al. for the embedding of networks with heterogeneous nodes and homogeneous edges (HeNHoE).
Note: HeNHoE networks are analogous to multilayer networks: in a HeNHoE network, each node has a distinct node type, and in a multilayer network, each node belongs to a distinct layer. The terms type and layer may therefore be regarded as synonymous. Throughout the code and the remainder of this documentation, we use the terms multilayer network and layer rather than HeNHoE network and type.
Installation
Install the package from PyPI by running the following command:
$ pip install henhoe2vec
Alternatively, clone this repository by running
$ git clone git@github.com:Bertr0/HeNHoE-2vec.git
and then install the package by running pip install . from the root of the repository.
Usage
HeNHoE-2vec can be run as a Python script or imported as a package by other Python projects. Either way, it can be applied directly to a multilayer network given as an edge list.
As a Package
After installing the package using pip, its modules may be imported using
import henhoe2vec
The individual steps of HeNHoE-2vec are bundled into a single run() function in the henhoe2vec.henhoe2vec module. HeNHoE-2vec can be run from start to finish as follows:
import henhoe2vec as hh2v
hh2v.henhoe2vec.run(input_csv, output_dir)
input_csv is the path to the multilayer edge list of the network to be embedded (a CSV file with no index column). output_dir is the path to the output directory where the embedding files will be saved. run() also accepts a number of optional parameters for configuring HeNHoE-2vec; a comprehensive overview can be found in the code documentation.
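As a minimal sketch of preparing input for run(), the snippet below writes a tiny two-layer toy network to a tab-separated file with no header and no index column. The column order (source node, source layer, target node, target layer, weight) is an assumption made for illustration and is not confirmed by this documentation, so check the code documentation for the expected layout. Tab is the documented default delimiter of the script; whether run() uses the same default is likewise an assumption. The helper names write_edge_list and embed are hypothetical.

```python
import csv
import tempfile
from pathlib import Path

# Toy multilayer edges. ASSUMED column order (not confirmed by the docs):
# source node, source layer, target node, target layer, edge weight.
TOY_EDGES = [
    ("a", "layer1", "b", "layer1", 1.0),
    ("b", "layer1", "c", "layer2", 0.5),
    ("c", "layer2", "d", "layer2", 2.0),
]


def write_edge_list(edges, path, sep="\t"):
    """Write edges as a delimited file with no header and no index column."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter=sep)
        writer.writerows(edges)
    return Path(path)


def embed(input_csv, output_dir):
    """Run HeNHoE-2vec on an edge list (requires henhoe2vec to be installed)."""
    import henhoe2vec as hh2v  # imported lazily so the sketch loads without it

    hh2v.henhoe2vec.run(str(input_csv), str(output_dir))


work_dir = Path(tempfile.mkdtemp())
edge_list_path = write_edge_list(TOY_EDGES, work_dir / "toy_edges.csv")
print(edge_list_path.read_text())
```

With the package installed, calling embed(edge_list_path, work_dir / "out") would then hand the file to run().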
As a Python Script
To run HeNHoE-2vec as a script, clone this repository using
$ git clone git@github.com:Bertr0/HeNHoE-2vec.git
Then install the requirements listed in requirements.txt and run the following command from the root of the repository:
$ python3 -m src.henhoe2vec --input <input_path> --output_dir <output_dir_path>
This generates embeddings for the nodes of the network specified by the multilayer edge list at <input_path> and saves the embedding files in <output_dir_path>.
Run python3 -m src.henhoe2vec --help from the root of the repository to show an overview of all arguments taken by the script. The following table also shows an overview of all arguments:
Script Arguments
| Argument | Type | Description | Default Value |
|---|---|---|---|
| --input | str | Path to the multilayer edge list of the network to be embedded (csv file with no index). | - |
| --sep | str | Delimiter of the input csv edge list. | "\t" |
| --header | store_true | Pass this argument if the input csv edge list has a header. | - |
| --output_name | str | Name of the output .csv file (without suffix). | "embeddings" |
| --is_directed | store_true | Pass this argument if the network is directed. | - |
| --edges_are_distances | store_true | Pass this argument if edge weights indicate distance between nodes (as opposed to weight/similarity). | - |
| --output_dir | str | Path of the output directory where the embedding files will be saved. | - |
| --dimensions | int | The dimensionality of the embeddings. | 128 |
| --walk_length | int | Length of each random walk. | 20 |
| --num_walks | int | Number of random walks to simulate for each node. | 10 |
| --p | float | Return parameter p from the node2vec algorithm. | 1.0 |
| --q | float | In-out parameter q from the node2vec algorithm. | 0.5 |
| --s | float | Default switching parameter for layer pairs which are not specified in the --s_dict argument. | 1.0 |
| --s_dict | list | Switching parameters for specific layer pairs in a dict-like manner. Pass the names of layer pairs followed by their switching parameters, separated by white spaces. E.g., if the switching parameter from layer1 to layer2 is 0.5 and the switching parameter from layer2 to layer1 is 0.7, you would pass layer1 layer2 0.5 layer2 layer1 0.7. Note that layer pairs are directed. For all layer pairs which are not specified here, the default parameter --s is adopted. | empty list |
| --window_size | int | Context size for the word2vec optimization. | 10 |
| --epochs | int | Number of epochs in SGD. | 1 |
| --workers | int | Number of parallel workers (threads). | 8 |
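
The flat layout of --s_dict can be illustrated with a short sketch that turns the whitespace-separated values into a lookup table and falls back to the default --s for layer pairs that are not listed. The function names parse_s_dict and switching_param are hypothetical and not taken from the package; the package's own parsing may differ.

```python
def parse_s_dict(values):
    """Turn ['layer1', 'layer2', '0.5', ...] into {('layer1', 'layer2'): 0.5, ...}."""
    if len(values) % 3 != 0:
        raise ValueError("expected triples of: source layer, target layer, s")
    pairs = {}
    for i in range(0, len(values), 3):
        source_layer, target_layer, s_value = values[i : i + 3]
        # Layer pairs are directed, so (source, target) is the key.
        pairs[(source_layer, target_layer)] = float(s_value)
    return pairs


def switching_param(pairs, source_layer, target_layer, default_s=1.0):
    """Look up s for a directed layer pair, falling back to the default --s."""
    return pairs.get((source_layer, target_layer), default_s)


# The example values from the --s_dict description above.
args = "layer1 layer2 0.5 layer2 layer1 0.7".split()
pairs = parse_s_dict(args)
print(pairs)
print(switching_param(pairs, "layer1", "layer3"))  # unspecified pair, uses default
```

Because the keys are ordered tuples, layer1 to layer2 and layer2 to layer1 are stored as separate entries, which mirrors the note that layer pairs are directed.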