Local graph estimation with pathwise feature selection

Local graph estimation is a framework for discovering local graph/network structure around specific variables of interest. Pathwise feature selection (PFS) is an algorithm for performing local graph estimation with pathwise false discovery control.

Associated paper

  • Local graph estimation: Interpretable network discovery for complex data
    In preparation

Installation

pip install localgraph

Usage

from localgraph import pfs, plot_graph

# Load the n-by-p data matrix X (n samples, p features) as a NumPy array

# Specify the target features (list of indices)
target_features = [0, 1]

# Specify the pathwise q-value threshold
qpath_max = 0.2

# Optional: specify the maximum radius of the local graph (default is 3)
max_radius = 3

# Optional: specify the neighborhood FDR threshold at each radius (one value per radius)
fdr_local = [0.2, 0.1, 0.1]

# Run PFS
Q = pfs(X, target_features, qpath_max=qpath_max, max_radius=max_radius, fdr_local=fdr_local)

# Plot the estimated subgraph
plot_graph(graph=Q, target_features=target_features, radius=max_radius)

Outputs

  • Q: Dictionary mapping edges (i,j) to q-values. Edges are undirected, so each edge appears under both keys, (i,j) and (j,i).
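
Because Q stores every undirected edge under both orientations, downstream code often wants one entry per edge. The following is a minimal sketch of that post-processing step. The helper unique_edges is hypothetical (it is not part of localgraph), and Q_example is a hand-written stand-in for real pfs output with invented q-values. The sketch assumes both orientations carry the same q-value and keeps the smaller one if they ever disagree.

```python
def unique_edges(Q, q_max=None):
    """Collapse (i, j)/(j, i) pairs into one entry per undirected edge.

    Hypothetical helper, not part of localgraph. Keeps the smaller q-value
    if the two orientations disagree, and optionally drops edges whose
    q-value exceeds q_max.
    """
    edges = {}
    for (i, j), q in Q.items():
        key = (min(i, j), max(i, j))  # canonical orientation: smaller index first
        edges[key] = min(q, edges.get(key, q))
    if q_max is not None:
        edges = {e: q for e, q in edges.items() if q <= q_max}
    return edges


# Illustrative stand-in for the output of pfs (values are invented).
Q_example = {
    (0, 2): 0.05, (2, 0): 0.05,
    (2, 5): 0.10, (5, 2): 0.10,
    (1, 3): 0.18, (3, 1): 0.18,
}

edges = unique_edges(Q_example)                    # one entry per edge
strong_edges = unique_edges(Q_example, q_max=0.1)  # only edges with q <= 0.1
```

The deduplicated dictionary is convenient for tabulating edges or exporting them, while the original Q can still be passed to plot_graph unchanged.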

What PFS does

  • Expands the local graph outward, layer by layer, starting from target variables.
  • Performs neighborhood selection with FDR control using integrated path stability selection.
  • Controls pathwise false discoveries by summing q-values along candidate paths.
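
The pathwise rule in the last point can be illustrated with a shortest-path computation: if each edge is weighted by its q-value, the smallest sum of q-values along any path from a target to a node is a shortest-path distance. The sketch below illustrates only this idea and should not be read as the package's implementation. The function path_qvalues and the example dictionary are hypothetical, and the q-values are invented.

```python
import heapq


def path_qvalues(Q, targets, qpath_max):
    """Smallest sum of q-values along any path from a target to each node.

    Illustrative only (not localgraph's implementation): runs Dijkstra's
    algorithm with q-values as edge weights and keeps the nodes whose best
    path sum is at most qpath_max.
    """
    # Build an undirected adjacency structure from the edge dictionary.
    nbrs = {}
    for (i, j), q in Q.items():
        nbrs.setdefault(i, {})[j] = q
        nbrs.setdefault(j, {})[i] = q

    best = {t: 0.0 for t in targets}
    heap = [(0.0, t) for t in targets]
    heapq.heapify(heap)
    while heap:
        total, node = heapq.heappop(heap)
        if total > best.get(node, float("inf")):
            continue  # stale heap entry; a shorter path was already found
        for nbr, q in nbrs.get(node, {}).items():
            new_total = total + q
            if new_total <= qpath_max and new_total < best.get(nbr, float("inf")):
                best[nbr] = new_total
                heapq.heappush(heap, (new_total, nbr))
    return best


# Invented chain 0 - 2 - 5 - 7 with q-values 0.05, 0.10, 0.10.
Q_chain = {
    (0, 2): 0.05, (2, 0): 0.05,
    (2, 5): 0.10, (5, 2): 0.10,
    (5, 7): 0.10, (7, 5): 0.10,
}
path_sums = path_qvalues(Q_chain, targets=[0], qpath_max=0.2)
```

In this toy chain, node 7 is excluded because the only path to it from the target accumulates a q-value sum of about 0.25, which exceeds the threshold of 0.2.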

Full list of pfs arguments

Required arguments:

  • X: n-by-p data matrix (NumPy array). Each column is a feature/variable.
  • target_features: Feature index or list of indices to center the graph around.
  • qpath_max: Maximum allowed sum of q-values along any path.

Optional arguments:

  • max_radius: Maximum number of expansion layers around each target (int; default 3).
  • fdr_local: Neighborhood FDR threshold at each radius (list of length max_radius; default [qpath_max]*max_radius).
  • custom_nbhd: Dictionary specifying custom FDR cutoffs for certain features (dict; default None).
  • feature_names: List of feature names; required if custom_nbhd is provided (list of strings).
  • criterion: Rule for resolving multiple edges (default 'min').
  • qvalue_method: Function used to compute q-values (function).
  • method_args: Dictionary of arguments to pass to qvalue_method (dict; default None).
  • verbose: Whether to print progress during selection (bool; default False).
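
The relationship between qpath_max, max_radius, and fdr_local can be made concrete with a small sketch. The helper resolve_fdr_local below is hypothetical and only mirrors the documented default ([qpath_max]*max_radius). Raising an error when the list length does not match max_radius is an assumption made here for illustration; how pfs itself handles a mismatch is not stated in this document.

```python
def resolve_fdr_local(qpath_max, max_radius=3, fdr_local=None):
    """Return one neighborhood FDR threshold per radius.

    Hypothetical helper that mirrors the documented default for fdr_local;
    it is not part of localgraph.
    """
    if fdr_local is None:
        # Documented default: reuse qpath_max at every radius.
        return [qpath_max] * max_radius
    if len(fdr_local) != max_radius:
        # Assumption for this sketch: treat a length mismatch as an error.
        raise ValueError(
            f"fdr_local has {len(fdr_local)} entries but max_radius is {max_radius}"
        )
    return list(fdr_local)


default_thresholds = resolve_fdr_local(0.2)
custom_thresholds = resolve_fdr_local(0.2, max_radius=3, fdr_local=[0.2, 0.1, 0.1])
```

Checking the length before calling pfs makes it obvious which threshold applies at which radius.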

Graph plotting

Use plot_graph to visualize a local graph up to the specified radius around one or more target features.

from localgraph import plot_graph

# Plot local graph around target_features using output Q from pfs
plot_graph(graph=Q, target_features=target_features, radius=3)

Features and customization

plot_graph supports:

  • Flexible input formats: edge dictionary, adjacency matrix, or NetworkX graph
  • Automatic subgraph extraction around the targets
  • Node coloring by distance from the target(s) (default), or user-specified colors (e.g., by variable type)
  • Several layout algorithms ('kamada_kawai', 'spring', 'circular', etc.)
  • Customizable node size, font sizes, and edge thickness
  • Optional display of q-values; edge widths can reflect q-value strength (edge_widths='q_value')
  • False positives shown in red if the true graph is provided
  • Integration with custom plots via ax or pos
  • Optional saving of figures (save_fig) and graphs (save_graph)

For a full list of arguments, see the plot_graph docstring.

Returns

The function returns a dictionary containing:

  • feature_radius_list: List of (feature name, radius) pairs for all nodes in the graph.
  • graph: The NetworkX subgraph used for plotting.
  • positions: Dictionary of node coordinates.
  • figure: The matplotlib figure object (only if the function creates the figure).
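
The feature_radius_list entry lends itself to simple summaries, such as listing the features found at each radius. The sketch below assumes only the documented shape, a list of (feature name, radius) pairs. The helper group_by_radius is hypothetical, and the example list is invented rather than taken from real plot_graph output (including the choice of radius 0 for the target).

```python
from collections import defaultdict


def group_by_radius(feature_radius_list):
    """Group feature names by their radius.

    Hypothetical helper, not part of localgraph. Expects an iterable of
    (feature name, radius) pairs and returns {radius: sorted names}.
    """
    layers = defaultdict(list)
    for name, radius in feature_radius_list:
        layers[radius].append(name)
    return {radius: sorted(names) for radius, names in sorted(layers.items())}


# Invented example in the documented (feature name, radius) format.
example_list = [("x0", 0), ("x2", 1), ("x5", 2), ("x4", 1)]
layers = group_by_radius(example_list)
```

With real output, the same helper would be applied to the 'feature_radius_list' entry of the returned dictionary.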

Further customization

To manually adjust node positions for publication-quality figures, you can export graphs to Gephi, edit them interactively, and re-import the updated layout into Python. See gephi_instructions.md for a full walkthrough.

Examples

The examples/ folder contains scripts that demonstrate end-to-end usage:

  • simple_example.py: Simulate data, run PFS, and visualize the result.

Evaluation tools

The evaluation/ folder contains helper functions for measuring subgraph recovery in simulation settings.

  • The eval.py script contains one function:
    • tp_and_fp: Count true and false positives compared to ground truth

This function is useful for benchmarking PFS and other graph estimation methods when the true graph is known.
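
To show what counting true and false positives involves, here is a minimal sketch that compares an estimated edge set with a ground-truth edge set. It is not the package's tp_and_fp, whose signature is not documented here. The name count_tp_fp, its arguments, and the example data are all hypothetical. Edges are treated as undirected, so (i,j) and (j,i) count once.

```python
def count_tp_fp(estimated_edges, true_edges):
    """Count true and false positive undirected edges.

    Hypothetical sketch, not localgraph's tp_and_fp. Both arguments are
    iterables of (i, j) pairs; orientation is ignored.
    """
    estimated = {frozenset(edge) for edge in estimated_edges}
    truth = {frozenset(edge) for edge in true_edges}
    true_positives = len(estimated & truth)
    false_positives = len(estimated - truth)
    return true_positives, false_positives


# Invented estimate in the same format as the pfs output Q.
Q_estimated = {
    (0, 2): 0.05, (2, 0): 0.05,
    (2, 5): 0.10, (5, 2): 0.10,
}
true_graph_edges = [(0, 2), (5, 7)]

# Iterating over a dictionary yields its keys, i.e. the edges.
tp, fp = count_tp_fp(Q_estimated, true_graph_edges)
```

Here the edge between 0 and 2 is a true positive, and the edge between 2 and 5 is a false positive because it is absent from the ground truth.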
