A package for adaptive weighted similarity network fusion

Project description

Adaptive Weighted Network Fusion (AWNF): a flexible framework for integrating multi-modality data.

AWNF is a Python package designed to implement Similarity Network Fusion (SNF) with an adaptive weighting mechanism. This package is intended for use in computational biology, machine learning, and data science tasks that involve multi-view data, such as genomics, imaging, and clinical data.

The package helps integrate heterogeneous data sources into a single, unified similarity network, which can be used for predictive modeling and analysis.
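To illustrate the underlying idea (not AWNF's actual implementation, which adds adaptive feature and modality weighting), the sketch below builds a sample-by-sample affinity matrix per modality with an RBF kernel and combines them with fixed modality weights; the data, `sigma`, and weights are arbitrary choices for the example:

```python
import numpy as np
from sklearn.metrics import pairwise_distances

rng = np.random.default_rng(0)
# Two toy "modalities" measured on the same 20 samples
view1 = rng.normal(size=(20, 5))
view2 = rng.normal(size=(20, 8))

def rbf_affinity(X, sigma=1.0):
    """Sample-by-sample similarity matrix from an RBF kernel."""
    D = pairwise_distances(X)
    return np.exp(-(D ** 2) / (2 * sigma ** 2))

A1 = rbf_affinity(view1)
A2 = rbf_affinity(view2)

# Weighted combination into one fused network (weights sum to 1)
weights = [0.6, 0.4]
fused = weights[0] * A1 + weights[1] * A2

print(fused.shape)  # one unified 20 x 20 sample-similarity network
```

AWNF replaces the fixed `weights` above with weights derived from feature importance, as shown in the usage example below.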


Installation

To install the awnf package, use pip.

Install via PyPI

pip install awnf

Project links

The source code is available on GitHub.

Usage

Once installed, import the package's functions into your Python scripts. Below is a usage example:

Example

# Import necessary functions from the 'awnf' package and other libraries
from awnf import feature_selection, make_affinity_with_weight, SNF_modality_weights, process_feature_weights_and_mad
import pandas as pd
import numpy as np
from sklearn.datasets import make_classification

# Example input data for affinity matrix calculation
# First, generate a synthetic dataset to demonstrate feature selection and affinity matrix calculation

# Generate a classification dataset with 100 samples, 50 features, and 2 informative features
# The dataset will have 2 classes (target variable)
X, y = make_classification(n_samples=100, n_features=50, n_informative=2, n_classes=2, random_state=42)

# Convert the data into a pandas DataFrame for easier manipulation and inspection
X_df1 = pd.DataFrame(X, columns=[f'Feat_mod1_{i+1}' for i in range(X.shape[1])])
y_df1 = pd.DataFrame(y, columns=['Target'])

# Generate another classification dataset with a different set of features
X, y = make_classification(n_samples=100, n_features=60, n_informative=2, n_classes=2, random_state=42)

# Convert the second dataset to a pandas DataFrame for ease of manipulation
X_df2 = pd.DataFrame(X, columns=[f'Feat_mod2_{i+1}' for i in range(X.shape[1])])
y_df2 = pd.DataFrame(y, columns=['Target'])

# Display the first few rows of the first dummy dataset (X_df1 and y_df1)
print(X_df1.head())
print(y_df1.head())

# Perform feature selection using the Boruta algorithm for the first dataset (X_df1)
num_features = 10  # Specify the number of features to select
selected_genes1, feature_ranks1 = feature_selection(X_df1, np.ravel(y_df1), num_features=num_features, n_estimators=100)

# Display the feature ranks for the first dataset
print(feature_ranks1)

# Perform feature selection on the second dataset (X_df2)
num_features = 18  # Specify the number of features to select
selected_genes2, feature_ranks2 = feature_selection(X_df2, np.ravel(y_df2), num_features=num_features, n_estimators=50)

# Display the feature ranks for the second dataset
print(feature_ranks2)

# Update the feature sets for both datasets by selecting only the top-ranked features
X_df1 = X_df1[selected_genes1]
X_df2 = X_df2[selected_genes2]

# Process the feature weights and compute feature importances
# The 'process_feature_weights_and_mad' function derives weights from the
# feature ranks (and, per its name, the median absolute deviation)
sorted_weights = process_feature_weights_and_mad(
    X_v2_list=[X_df1, X_df2],  # List of feature datasets
    feature_ranks_list=[feature_ranks1, feature_ranks2],  # Corresponding feature ranks
    betta=0.5,  # A parameter controlling the weight scaling (assumed)
)

# Display the first set of weights (for the first dataset)
print(sorted_weights[0])

# Generate the similarity (affinity) matrices for each dataset using the feature weights
similarity_view1_w = make_affinity_with_weight(X_df1, weight=sorted_weights[0]['feature_weight'].to_list())
similarity_view2_w = make_affinity_with_weight(X_df2, weight=sorted_weights[1]['feature_weight'].to_list())

# Combine the similarity matrices from both datasets using weighted SNF (Similarity Network Fusion)
# We are assigning different weights to the modalities (views) based on their importance
fused_network = SNF_modality_weights([similarity_view1_w, similarity_view2_w], weight_modality=[0.8, 0.2])

# Print the resulting fused network, which combines information from both datasets
print('fused_network', fused_network)
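The fused network can be passed to downstream methods that accept a precomputed affinity matrix, such as spectral clustering. A minimal sketch, where a synthetic block-structured matrix stands in for an actual AWNF output so the snippet is self-contained:

```python
import numpy as np
from sklearn.cluster import SpectralClustering

# Stand-in for a fused similarity network: a symmetric, non-negative
# affinity matrix over 30 samples, with block structure implying two groups
rng = np.random.default_rng(42)
n = 30
fused = rng.uniform(0.0, 0.2, size=(n, n))
fused[:15, :15] += 0.8
fused[15:, 15:] += 0.8
fused = (fused + fused.T) / 2  # enforce symmetry
np.fill_diagonal(fused, 1.0)

# Cluster directly on the precomputed affinity matrix
labels = SpectralClustering(
    n_clusters=2, affinity="precomputed", random_state=0
).fit_predict(fused)
print(labels)
```

Passing `affinity="precomputed"` tells scikit-learn to treat the input as a similarity matrix rather than raw features, which is exactly the form a fused network takes.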

AWNF builds on the following code repositories:

  1. SNFpy
  2. boruta_py

Authors

Sevinj Yolchuyeva, Venkata Manem
Email: sevinj.yolchuyeva@crchudequebec.ulaval.ca, venkata.manem@crchudequebec.ulaval.ca

Citation

To be published soon.
