A package for adaptive weighted similarity network fusion
Project description
Adaptive Weighted Network Fusion (AWNF): A flexible framework for integrating multi-modal data.
AWNF is a Python package designed to implement Similarity Network Fusion (SNF) with an adaptive weighting mechanism. This package is intended for use in computational biology, machine learning, and data science tasks that involve multi-view data, such as genomics, imaging, and clinical data.
The package helps integrate heterogeneous data sources into a single, unified similarity network, which can be used for predictive modeling and analysis.
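To illustrate the idea (this is a simplified stand-in, not the package's actual algorithm), network fusion can be sketched as combining per-modality affinity matrices; the toy version below uses a weighted average of row-normalized similarities, where `toy_fuse` is a hypothetical helper, not part of awnf:

```python
import numpy as np

def toy_fuse(affinities, weights):
    """Toy fusion: weighted average of row-normalized affinity matrices.

    A simplified stand-in for SNF's iterative cross-diffusion, shown only
    to illustrate the shapes involved: each view contributes one
    samples-by-samples affinity matrix, and the output has the same shape.
    """
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()  # normalize modality weights to sum to 1
    fused = np.zeros_like(affinities[0], dtype=float)
    for A, w in zip(affinities, weights):
        P = A / A.sum(axis=1, keepdims=True)  # row-normalize each view
        fused += w * P
    return fused

# Two toy symmetric affinity matrices for 5 samples
rng = np.random.default_rng(0)
A1 = rng.random((5, 5)); A1 = (A1 + A1.T) / 2
A2 = rng.random((5, 5)); A2 = (A2 + A2.T) / 2
fused = toy_fuse([A1, A2], weights=[0.7, 0.3])
print(fused.shape)  # (5, 5)
```

Because each row-normalized view sums to 1 per row and the modality weights sum to 1, the fused matrix is itself row-stochastic.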
Installation
To install the awnf package, you can use pip.
Install via PyPI
pip install awnf
Project links
The source code is available on GitHub.
Usage
Once installed, you can use the awnf package by importing its functions into your Python scripts. Below is an example usage:
Example
# Import the necessary functions from the 'awnf' package and other libraries
from awnf import feature_selection, make_affinity_with_weight, SNF_modality_weights, process_feature_weights_and_mad
import pandas as pd
import numpy as np
from sklearn.datasets import make_classification
# Example input data for affinity matrix calculation
# First, generate a synthetic dataset to demonstrate feature selection and affinity matrix calculation
# Generate a classification dataset with 100 samples, 50 features, and 2 informative features
# The dataset will have 2 classes (target variable)
X, y = make_classification(n_samples=100, n_features=50, n_informative=2, n_classes=2, random_state=42)
# Convert the data into a pandas DataFrame for easier manipulation and inspection
X_df1 = pd.DataFrame(X, columns=[f'Feat_mod1_{i+1}' for i in range(X.shape[1])])
y_df1 = pd.DataFrame(y, columns=['Target'])
# Generate another classification dataset with a different set of features
X, y = make_classification(n_samples=100, n_features=60, n_informative=2, n_classes=2, random_state=42)
# Convert the second dataset to a pandas DataFrame for ease of manipulation
X_df2 = pd.DataFrame(X, columns=[f'Feat_mod2_{i+1}' for i in range(X.shape[1])])
y_df2 = pd.DataFrame(y, columns=['Target'])
# Display the first few rows of the first dummy dataset (X_df1 and y_df1)
print(X_df1.head())
print(y_df1.head())
# Perform feature selection using the Boruta algorithm for the first dataset (X_df1)
num_features = 10 # Specify the number of features to select
selected_genes1, feature_ranks1 = feature_selection(X_df1, np.ravel(y_df1), num_features=num_features, n_estimators=100)
# Display the feature ranks for the first dataset
print(feature_ranks1)
# Perform feature selection on the second dataset (X_df2)
num_features = 18 # Specify the number of features to select
selected_genes2, feature_ranks2 = feature_selection(X_df2, np.ravel(y_df2), num_features=num_features, n_estimators=50)
# Display the feature ranks for the second dataset
print(feature_ranks2)
# Update the feature sets for both datasets by selecting only the top-ranked features
X_df1 = X_df1[selected_genes1]
X_df2 = X_df2[selected_genes2]
# Compute feature weights from the feature ranks using 'process_feature_weights_and_mad'
# (MAD: median absolute deviation)
sorted_weights = process_feature_weights_and_mad(
X_v2_list=[X_df1, X_df2], # List of feature datasets
feature_ranks_list=[feature_ranks1, feature_ranks2], # Corresponding feature ranks
betta=0.5, # Parameter controlling the weight scaling
)
# Display the first set of weights (for the first dataset)
print(sorted_weights[0])
# Generate the similarity (affinity) matrices for each dataset using the feature weights
similarity_view1_w = make_affinity_with_weight(X_df1, weight=sorted_weights[0]['feature_weight'].to_list())
similarity_view2_w = make_affinity_with_weight(X_df2, weight=sorted_weights[1]['feature_weight'].to_list())
# Combine the similarity matrices from both datasets using weighted SNF (Similarity Network Fusion)
# We are assigning different weights to the modalities (views) based on their importance
fused_network = SNF_modality_weights([similarity_view1_w, similarity_view2_w], weight_modality=[0.8, 0.2])
# Print the resulting fused network, which combines information from both datasets
print('fused_network', fused_network)
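The fused network is a samples-by-samples affinity matrix, so it can feed directly into affinity-based downstream methods. The sketch below clusters such a matrix with scikit-learn's spectral clustering; a random symmetric matrix stands in for the `fused_network` produced above so the snippet is self-contained:

```python
import numpy as np
from sklearn.cluster import SpectralClustering

# Stand-in for the fused affinity matrix from the example above
rng = np.random.default_rng(42)
M = rng.random((100, 100))
fused_network = (M + M.T) / 2  # symmetric non-negative affinity matrix

# Cluster samples directly on the precomputed affinity matrix
clusterer = SpectralClustering(n_clusters=2, affinity='precomputed', random_state=0)
labels = clusterer.fit_predict(fused_network)
print(labels.shape)  # (100,)
```

Passing `affinity='precomputed'` tells scikit-learn to treat the input as a similarity matrix rather than raw features, which is exactly what a fused network is.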
AWNF builds on the following code repositories:
Authors
Sevinj Yolchuyeva, Venkata Manem
Email: sevinj.yolchuyeva@crchudequebec.ulaval.ca, venkata.manem@crchudequebec.ulaval.ca
Citation
Citation information will be published soon.
Project details
Download files
File details
Details for the file awnf-0.1.3.tar.gz.
File metadata
- Download URL: awnf-0.1.3.tar.gz
- Upload date:
- Size: 15.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.9.13
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 8ce7f315d48596a5169e94797773da0e9064d63602b4383053a6273032bc91d5 |
| MD5 | 0d67a2d313751b1ac5dfbef3b34d87f0 |
| BLAKE2b-256 | 5ead2d3c70a838ba3b27cb549e3c9fb646cc5c399fb8c437fbf3f98711b271ad |
File details
Details for the file awnf-0.1.3-py3-none-any.whl.
File metadata
- Download URL: awnf-0.1.3-py3-none-any.whl
- Upload date:
- Size: 13.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.9.13
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | b642fda05a5594d2565a0e3578fada5c355283066c04538b6302e0aa8071ccce |
| MD5 | 83c49d4c32042541481d2d229d5a63e6 |
| BLAKE2b-256 | 6d24f3a2541dc2ac827551463cee925e5c04c65d1a15337dbf45d237bbd9567c |