An implementation of Karger's Algorithm to find similar Genesets

roux-algo-geneweaver-kargers

WARNING: This package is under active development. The API may change at any time. Documentation may be out of date.

Installation

Set Up

It's usually a good idea to install this package in a Python virtual environment.

python3 -m venv $NAME_OF_VENV
source $NAME_OF_VENV/bin/activate
pip install roux-algo-geneweaver-kargers

With Python Poetry

poetry new $NEW_PROJECT_NAME
cd $NEW_PROJECT_NAME
poetry add roux-algo-geneweaver-kargers

Usage

Two data formats are currently supported: "Node and Edges List" and "Adjacency List". Karger's algorithm runs on the "Node and Edges List" format, so data in the "Adjacency List" format must be converted first. Utility methods are included to make this easy.
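To make the difference between the two formats concrete, here is a minimal sketch in plain Python. The exact JSON layout the package expects is not described in this README, so the shapes below are assumptions: the adjacency list is taken to be a dict mapping each node to its neighbours, and the node and edges form is a list of nodes plus a list of pairs. `adjacency_to_nodes_edges` is a hypothetical helper for illustration, not the package's `karger_tf.adj_graph_to_edge_list`.

```python
def adjacency_to_nodes_edges(graph):
    """Convert {node: iterable of neighbours} into (nodes, edges).

    Assumed shapes only; the package's real formats may differ.
    """
    # Collect every node that appears as a key or as a neighbour.
    nodes = sorted(set(graph) | {n for nbrs in graph.values() for n in nbrs})
    # Emit one (u, v) pair per adjacency entry, so an undirected edge
    # listed from both endpoints appears twice.
    edges = [(u, v) for u in graph for v in graph[u]]
    return nodes, edges


adjacency = {1: [2, 3], 2: [1], 3: [1]}
nodes, edges = adjacency_to_nodes_edges(adjacency)
print(nodes)  # [1, 2, 3]
print(edges)  # [(1, 2), (1, 3), (2, 1), (3, 1)]
```

Each undirected edge shows up once per endpoint in this output, which is why a deduplication step normally follows the conversion.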

Load Data

from roux.algo.geneweaver.kargers import *
nodes, edges = karger_io.load_nodes_edges('nodes_edges_file.json')

Load Data from Adjacency List Source

from roux.algo.geneweaver.kargers import *
graph = karger_io.load_graph('graph_file.json')
nodes, edges = karger_tf.adj_graph_to_edge_list(graph)
edges = karger_tf.deduplicate_edge_list(edges)
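The deduplication step matters because an adjacency list records an undirected edge from both endpoints. The following is a rough sketch of what such a step might do, assuming edges are pairs of comparable node IDs. `dedupe_undirected` is a hypothetical name, and dropping self-loops is an assumption made here, not documented behaviour of `karger_tf.deduplicate_edge_list`.

```python
def dedupe_undirected(edges):
    """Keep one copy of each undirected edge, preserving first-seen order."""
    seen = set()
    unique = []
    for u, v in edges:
        if u == v:
            # Assumption: self-loops carry no information for a cut, so skip them.
            continue
        # Normalise the pair so (2, 1) and (1, 2) map to the same key.
        key = (u, v) if u <= v else (v, u)
        if key not in seen:
            seen.add(key)
            unique.append(key)
    return unique


print(dedupe_undirected([(1, 2), (1, 3), (2, 1), (3, 1)]))  # [(1, 2), (1, 3)]
```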

Run Karger's on Data

from roux.algo.geneweaver.kargers import *
nodes, edges = karger_io.load_nodes_edges('nodes_edges_file.json')
k_inst = KargerMinCut(nodes, edges)
min_cut, best_cuts, result = k_inst.min_cut()

...

super_nodes = karger_tf.union_find_to_geneset_list(
   result.roots(),
   result.non_roots()
)
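For readers unfamiliar with the algorithm, the sketch below shows the general idea behind a Karger-style minimum cut: contract randomly chosen edges until two super nodes remain, count the edges that still cross between them, and repeat over many trials. It uses a small union-find structure to track contractions. This is an independent illustration, not the implementation inside `KargerMinCut`; every function name and the `trials` and `seed` parameters are assumptions. It also assumes a connected graph.

```python
import random


def find(parent, x):
    """Return the root of x, halving the path along the way."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def karger_trial(nodes, edges, rng):
    """Run one random contraction pass; return (cut_edges, parent)."""
    parent = {n: n for n in nodes}
    remaining = len(nodes)
    pool = list(edges)
    # Visiting edges in a uniformly random order and skipping those that are
    # already inside one super node is equivalent to repeatedly picking a
    # random surviving edge.
    rng.shuffle(pool)
    for u, v in pool:
        if remaining == 2:
            break
        ru, rv = find(parent, u), find(parent, v)
        if ru != rv:
            parent[ru] = rv  # contract the edge: merge the two super nodes
            remaining -= 1
    cut = [(u, v) for u, v in edges if find(parent, u) != find(parent, v)]
    return cut, parent


def karger_min_cut(nodes, edges, trials=200, seed=0):
    """Return the smallest cut seen over `trials` runs and its node groups."""
    rng = random.Random(seed)
    best_cut, best_parent = None, None
    for _ in range(trials):
        cut, parent = karger_trial(nodes, edges, rng)
        if best_cut is None or len(cut) < len(best_cut):
            best_cut, best_parent = cut, parent
    # Group nodes by their union-find root to recover the super nodes.
    groups = {}
    for n in nodes:
        groups.setdefault(find(best_parent, n), []).append(n)
    return best_cut, list(groups.values())


# Two triangles joined by the single bridge (3, 4).
nodes = [1, 2, 3, 4, 5, 6]
edges = [(1, 2), (2, 3), (1, 3), (4, 5), (5, 6), (4, 6), (3, 4)]
cut, groups = karger_min_cut(nodes, edges)
print(cut)
print(groups)
```

A single trial finds a given minimum cut with probability at least 2 / (n(n - 1)), so the result is only probabilistically correct; more trials raise the chance of finding the true minimum. On the six-node example, 200 trials make a miss extremely unlikely, and the cut found should be the bridge between the two triangles.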

Developer Setup

Base Tools

  1. Install Git
  2. Install Python 3.9
    1. Note: install method varies by operating system
  3. Install poetry

Set up

  1. Clone this repository
  2. Move to cloned directory (e.g. cd cs5800-final-project)
  3. Run poetry install
  4. If you need to connect to the database, create a .env configuration file

You should now be able to use the Python package:

  1. Run poetry shell
  2. Run python3

from roux.algo.geneweaver.kargers import kragers_poc as kpm
from roux.algo.geneweaver.kargers.utils import build_graph as bg
from roux.algo.geneweaver.kargers.db.session import SessionLocal

db = SessionLocal()
graph = bg.build_graph(db, set(range(349100, 349110)))
result = kpm.kragers_poc_1(graph)

Create the Tier 2 Dataset

from roux.algo.geneweaver.kargers.utils import build_graph as bg
from roux.algo.geneweaver.kargers.db.session import SessionLocal

db = SessionLocal()
# The dataset is slightly smaller than 19000 nodes
graph = bg.get_adjacency_exclusive_new(db, 2, 19000)

To save the dataset

from roux.algo.geneweaver.kargers.utils import load_save_graph as ls

# Build the graph above
graph = ...

# Save the graph to a JSON file
ls.save_graph('filename.json', graph)
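The on-disk layout written by `ls.save_graph` is not described here. As a rough illustration of what saving an adjacency structure to JSON involves, the sketch below assumes the graph is a dict mapping integer geneset IDs to sets of neighbouring IDs. JSON has no set type and only allows string keys, so both need converting on the way out and back. `save_adjacency` and `load_adjacency` are hypothetical helpers, not part of the package.

```python
import json
import os
import tempfile


def save_adjacency(path, graph):
    """Write {int: set of int} as JSON, using string keys and sorted lists."""
    serializable = {str(node): sorted(nbrs) for node, nbrs in graph.items()}
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(serializable, fh)


def load_adjacency(path):
    """Read the file back and restore integer keys and set values."""
    with open(path, encoding="utf-8") as fh:
        raw = json.load(fh)
    return {int(node): set(nbrs) for node, nbrs in raw.items()}


graph = {349100: {349101, 349102}, 349101: {349100}, 349102: {349100}}
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "graph.json")
    save_adjacency(path, graph)
    restored = load_adjacency(path)
print(restored == graph)  # True
```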

Get all Tier 2 Genesets

from roux.algo.geneweaver.kargers.utils import build_graph as bg
from roux.algo.geneweaver.kargers.db.session import SessionLocal

db = SessionLocal()
tier_2_genesets = bg.get_all_genesets_by_tier(db, 2)


Download files

Source Distribution

roux-algo-geneweaver-kargers-0.0.3.tar.gz (18.7 kB)
