Methods for filtering for high-scoring genomic intervals

License
- OSI Approved :: MIT License
Operating System
- OS Independent
Programming Language
- Python :: 3

Project description

region-selection

Usage

Importing the module and creating a Selection instance

>>> from region_selection import Selection
>>> s = Selection()

Specify properties

>>> s.method = "pq"
>>> s.input_fn = "/Users/areynolds/Developer/Github/region_selection/tests/windows.fixed.25k.bed"
>>> s.bin_size = 200
>>> s.exclusion_span = 24800

The method property can be one of pq, wis, or maxmean, which select the priority-queue, weighted interval scheduling, or max-mean window sweep method, respectively.
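
To give a feel for what a priority-queue selection does, here is a minimal sketch of one greedy heap-based approach: take the highest-scoring bin, block its neighbours, and repeat. This is an illustration under stated assumptions, not the package's implementation. The function name select_pq, the exclusion_bins parameter, and the rule that a pick blocks every bin within exclusion_bins positions on either side are all hypothetical.

```python
import heapq


def select_pq(scores, exclusion_bins):
    """Greedy selection sketch: best score first, skipping bins near a pick.

    `scores` holds one score per fixed-width bin, in genomic order.
    `exclusion_bins` (hypothetical parameter) is how many bins on each
    side of a selected bin become ineligible.
    """
    # heapq is a min-heap, so negate scores to pop the largest first.
    # Ties are broken by the lower bin index.
    heap = [(-score, idx) for idx, score in enumerate(scores)]
    heapq.heapify(heap)

    blocked = [False] * len(scores)
    selected = []
    while heap:
        _, idx = heapq.heappop(heap)
        if blocked[idx]:
            continue
        selected.append(idx)
        # Block the chosen bin and its neighbours inside the window.
        lo = max(0, idx - exclusion_bins)
        hi = min(len(scores), idx + exclusion_bins + 1)
        for j in range(lo, hi):
            blocked[j] = True

    # Return bin indices in genomic order rather than score order.
    return sorted(selected)


picked = select_pq([0.1, 0.9, 0.8, 0.2, 0.7, 0.3], exclusion_bins=1)
print(picked)  # [1, 4]
```

In this toy input, bin 2 has the second-highest score but is adjacent to bin 1, so it is skipped in favour of bin 4.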

The input_fn property is the path to an input file on the file system. It is optional unless you use the read() method.

The bin_size and exclusion_span properties are integers. They give the size of each element (bin) and the distance required between selected elements (excluding the bin itself).

The default values are 200 and 24800, respectively. This means bins are 200 nt wide, and any two selected bins must be at least 25000 nt apart (the 24800 nt exclusion span plus the 200 nt bin itself).
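
The relationship between the two defaults can be checked with a little arithmetic. The sketch below uses plain integers rather than the package itself. The exclusion_bins value matches the "Exclusion bins" figure the package logs for these settings; reading the 25000 nt figure as exclusion span plus one bin is an interpretation of the text above, and the variable names are illustrative.

```python
bin_size = 200          # width of each bin, in nt
exclusion_span = 24800  # required distance between bins, excluding the bin itself

# Number of whole bins covered by the exclusion span.
exclusion_bins = exclusion_span // bin_size

# Exclusion span plus the bin itself (assumed reading of the 25000 nt figure).
min_spacing = exclusion_span + bin_size

print(exclusion_bins)  # 124
print(min_spacing)     # 25000
```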

Input data

You can read in data from a four-column, tab-delimited text file:

>>> in_df = s.read(s, s.method, s.input_fn)
[region_selection] Reading input file into dataframe...
[region_selection] Read dataframe

Otherwise, you must provide a Pandas dataframe containing four columns, labeled Chromosome, Start, End, and Score.
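
If you build the dataframe yourself, something along these lines should produce the four expected column labels. This is a sketch using only Pandas, not the package's read() method. The three rows are made-up toy data, and the in-memory bed_text string stands in for a real four-column, tab-delimited file.

```python
import io

import pandas as pd

# Column labels the package expects, per the description above.
columns = ["Chromosome", "Start", "End", "Score"]

# Toy four-column, tab-delimited input (illustrative values only).
bed_text = (
    "chr1\t0\t200\t0.12\n"
    "chr1\t200\t400\t0.41\n"
    "chr1\t400\t600\t0.07\n"
)

# header=None because the text has no header row; names= supplies the labels.
in_df = pd.read_csv(io.StringIO(bed_text), sep="\t", header=None, names=columns)
print(in_df)
```

To read from disk instead, pass a file path to pd.read_csv() in place of the io.StringIO object.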

In the above snippet, the input dataframe is called in_df.

Running the selection method

Use run() to run the specified method on the input dataframe (in_df here, or whatever you named yours):

>>> out_df = s.run(s, s.method, in_df)
[region_selection] Bin size (nt): 200
[region_selection] Exclusion span (nt): 24800
[region_selection] Exclusion bins: 124
[region_selection] Method: Priority-Queue (PQ)
[region_selection] Constructing heap
[region_selection] Constructing qualifying bin list from heap
[region_selection] Returning sorted bin list
[region_selection] Method (runtime in sec): 140.50703937999998

The result is returned as a Pandas dataframe. Here it is called out_df, and you can use all the usual Pandas methods and properties on it:

>>> print(out_df.head())
    Chromosome   Start     End  Score
47        chr1    9400   34400   0.41
172       chr1   34400   59400   0.41
304       chr1   60800   85800   0.41
429       chr1   85800  110800   0.41
554       chr1  110800  135800   0.41

Or use the write() method to write the result to standard output:

>>> s.write(out_df)
...

Or write it out with the dataframe's to_csv() method, or similar.
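
As a sketch of the to_csv() route, the following writes a result dataframe as headerless, tab-delimited text. The two rows are copied from the head() output above and stand in for a real out_df. Writing without a header or index column is an assumed preference for BED-like output, not something the package requires.

```python
import io

import pandas as pd

# Stand-in for the dataframe returned by run(); rows taken from the
# head() output shown earlier.
out_df = pd.DataFrame(
    {
        "Chromosome": ["chr1", "chr1"],
        "Start": [9400, 34400],
        "End": [34400, 59400],
        "Score": [0.41, 0.41],
    }
)

# Write to an in-memory buffer; pass a file path instead to write to disk.
buffer = io.StringIO()
out_df.to_csv(buffer, sep="\t", header=False, index=False)

bed_text = buffer.getvalue()
print(bed_text, end="")
```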

Release history

0.1.0 (May 17, 2022)

Download files

Source Distribution

region_selection_apr-0.1.0.tar.gz (8.8 kB)

Uploaded May 17, 2022 Source

Built Distribution

region_selection_apr-0.1.0-py3-none-any.whl (9.0 kB)

Uploaded May 17, 2022 Python 3

File details

Details for the file region_selection_apr-0.1.0.tar.gz.

File metadata

Download URL: region_selection_apr-0.1.0.tar.gz
Upload date: May 17, 2022
Size: 8.8 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/4.0.0 CPython/3.8.3

File hashes

Hashes for region_selection_apr-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`e524774e921f40e423e72a6d9f104d91a008d756fbb32a34fc55c61cb007b376`
MD5	`b3e90ca1081d342d2d13322465f40636`
BLAKE2b-256	`9d04d1996e6ed8967be61e8137c127c8ec5e6e7ca5b33e2c2ee47a50f6feb823`

File details

Details for the file region_selection_apr-0.1.0-py3-none-any.whl.

File metadata

Download URL: region_selection_apr-0.1.0-py3-none-any.whl
Upload date: May 17, 2022
Size: 9.0 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/4.0.0 CPython/3.8.3

File hashes

Hashes for region_selection_apr-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`bd6677776e223b89a99aa367c0fc71aae2ca9e3b587cc8bf08f5b039022c0bad`
MD5	`0b839702f6ea6d62c70ca580542f78a9`
BLAKE2b-256	`f8588b1c38670d08c87389b98a9a748e10d9d09f8ead28a3476772316dbf49a7`
