Skip to main content

Process BBI (bigWig/bigBed) and HiC files

Project description

Installation

pip install gwseq-io-pp

Requires numpy.

Usage

Open bigWig and bigBed files

reader = gwseq_io.open(path, *, zoom_correction)

Parameters:

  • zoom_correction Scaling factor for automatic zoom level selection based on bin size. Only for bigWig files. 1/3 by default.

Attributes for bigWig and bigBed files:

  • main_header General file formatting info.
  • zoom_headers Zooms levels info (reduction level and location).
  • auto_sql BED entries declaration (only in bigBed).
  • total_summary Statistical summary of entire file values (coverage, sums and extremes).
  • chr_sizes Chromosomes IDs and sizes.
  • type Either "bigwig" or "bigbed".

Read bigWig and bigBed signal

values = reader.read_signal(chr_ids, starts, ends)
values = reader.read_signal(chr_ids, starts=starts, span=span)
values = reader.read_signal(chr_ids, ends=ends, span=span)
values = reader.read_signal(chr_ids, centers=centers, span=span)

Parameters:

  • chr_ids starts ends centers Chromosomes ids, starts, ends and centers of locations. Both starts ends or one of starts ends centers (with span) may be specified.
  • span Reading window in bp relative to locations starts ends centers. Only one reference may be specified if specified. Not by default.
  • bin_size Reading bin size in bp. May vary in output if locations have variable spans or bin_count is specified. 1 by default.
  • bin_count Output bin count. Inferred as max location span / bin size by default.
  • bin_mode Method to aggregate bin values. Either "mean", "sum" or "count". "mean" by default.
  • full_bin Extend locations ends to overlapping bins if true. Not by default.
  • def_value Default value to use when no data overlap a bin. 0 by default.
  • zoom BigWig zoom level to use. Use full data if -1. Auto-detect the best level if -2 by selecting the larger level whose bin size is lower than the third of bin_size (may be the full data). Full data by default.

Returns a numpy float32 array of shape (locations, bin count).

Quantify bigWig and bigBed signal

values = reader.quantify(chr_ids, starts, ends)

Parameters:

  • chr_ids starts ends centers span bin_size full_bin def_value zoom Identical to read_signal method.
  • reduce Method to aggregate values over span. Either "mean", "sd", "sem", "sum", "count", "min" or "max". "mean" by default.

Returns a numpy float32 array of shape (locations).

Profile bigWig and bigBed signal

values = reader.profile(chr_ids, starts, ends)

Parameters:

  • chr_ids starts ends centers span bin_size bin_count bin_mode full_bin def_value zoom Identical to read_signal method.
  • reduce Method to aggregate values over locations. Either "mean", "sd", "sem", "sum", "count", "min" or "max". "mean" by default.

Returns a numpy float32 array of shape (bin count).

Read bigBed entries

values = reader.read_entries(chr_ids, starts, ends)

Parameters:

  • chr_ids starts ends centers spans Identical to read_signal method.

Returns a list (locations) of list of entries (dict with at least "chr", "start" and "end" keys).

Convert bigWig to bedGraph or WIG

reader.to_bedgraph(output_path)
reader.to_wig(output_path)

Parameters:

  • output_path Path to output file.
  • chr_ids Only extract data from these chromosomes. All by default.
  • zoom Zoom level to use. Use full data if -1. Full data by default.

Convert bigBed to BED

reader.to_bed(output_path)

Parameters:

  • output_path chr_ids Identical to to_bedgraph and to_wig methods.
  • col_count Only write this number of columns (eg, 3 for chr, start and end). All by default.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

gwseq_io_pp-0.0.2.tar.gz (11.5 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

gwseq_io_pp-0.0.2-py3-none-any.whl (12.1 kB view details)

Uploaded Python 3

File details

Details for the file gwseq_io_pp-0.0.2.tar.gz.

File metadata

  • Download URL: gwseq_io_pp-0.0.2.tar.gz
  • Upload date:
  • Size: 11.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.5

File hashes

Hashes for gwseq_io_pp-0.0.2.tar.gz
Algorithm Hash digest
SHA256 d22f05d9595edb0c2577694098b09d56023c99e16656b589231b329ed6f2a45b
MD5 4944dd0e5d10ae01aaf59f2c5b97e636
BLAKE2b-256 17ed5a5fd18347e1ccfc148e72c7a1b7eb79abe9fbdca110142e3b9827cf4c11

See more details on using hashes here.

File details

Details for the file gwseq_io_pp-0.0.2-py3-none-any.whl.

File metadata

  • Download URL: gwseq_io_pp-0.0.2-py3-none-any.whl
  • Upload date:
  • Size: 12.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.5

File hashes

Hashes for gwseq_io_pp-0.0.2-py3-none-any.whl
Algorithm Hash digest
SHA256 dd3ef68aa2dc272c76e31e603f72b261b32ff309cb1379c7068bd2e41ecd8a65
MD5 33228ff4fa5eb796f4fbd152c617cb30
BLAKE2b-256 eba9adcde1219d6e44daf039dffaebf85eb49f7d71546d432c025ad39ef1967f

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page