Skip to main content

Process BBI (bigWig/bigBed) and HiC files

Project description

Installation

pip install gwseq-io-pp

Requires numpy.

Usage

Open bigWig and bigBed files

reader = gwseq_io.open(path, *, zoom_correction)

Parameters:

  • zoom_correction Scaling factor for automatic zoom level selection based on bin size. Only for bigWig files. 1/3 by default.

Attributes for bigWig and bigBed files:

  • main_header General file formatting info.
  • zoom_headers Zooms levels info (reduction level and location).
  • auto_sql BED entries declaration (only in bigBed).
  • total_summary Statistical summary of entire file values (coverage, sums and extremes).
  • chr_sizes Chromosomes IDs and sizes.
  • type Either "bigwig" or "bigbed".

Read bigWig and bigBed signal

values = reader.read_signal(chr_ids, starts, ends)
values = reader.read_signal(chr_ids, starts=starts, span=span)
values = reader.read_signal(chr_ids, ends=ends, span=span)
values = reader.read_signal(chr_ids, centers=centers, span=span)

Parameters:

  • chr_ids starts ends centers Chromosomes ids, starts, ends and centers of locations. Both starts ends or one of starts ends centers (with span) may be specified.
  • span Reading window in bp relative to locations starts ends centers. Only one reference may be specified if specified. Not by default.
  • bin_size Reading bin size in bp. May vary in output if locations have variable spans or bin_count is specified. 1 by default.
  • bin_count Output bin count. Inferred as max location span / bin size by default.
  • bin_mode Method to aggregate bin values. Either "mean", "sum" or "count". "mean" by default.
  • full_bin Extend locations ends to overlapping bins if true. Not by default.
  • def_value Default value to use when no data overlap a bin. 0 by default.
  • zoom BigWig zoom level to use. Use full data if -1. Auto-detect the best level if -2 by selecting the larger level whose bin size is lower than the third of bin_size (may be the full data). Full data by default.

Returns a numpy float32 array of shape (locations, bin count).

Quantify bigWig and bigBed signal

values = reader.quantify(chr_ids, starts, ends)

Parameters:

  • chr_ids starts ends centers span bin_size full_bin def_value zoom Identical to read_signal method.
  • reduce Method to aggregate values over span. Either "mean", "sd", "sem", "sum", "count", "min" or "max". "mean" by default.

Returns a numpy float32 array of shape (locations).

Profile bigWig and bigBed signal

values = reader.profile(chr_ids, starts, ends)

Parameters:

  • chr_ids starts ends centers span bin_size bin_count bin_mode full_bin def_value zoom Identical to read_signal method.
  • reduce Method to aggregate values over locations. Either "mean", "sd", "sem", "sum", "count", "min" or "max". "mean" by default.

Returns a numpy float32 array of shape (bin count).

Read bigBed entries

values = reader.read_entries(chr_ids, starts, ends)

Parameters:

  • chr_ids starts ends centers spans Identical to read_signal method.

Returns a list (locations) of list of entries (dict with at least "chr", "start" and "end" keys).

Convert bigWig to bedGraph or WIG

reader.to_bedgraph(output_path)
reader.to_wig(output_path)

Parameters:

  • output_path Path to output file.
  • chr_ids Only extract data from these chromosomes. All by default.
  • zoom Zoom level to use. Use full data if -1. Full data by default.

Convert bigBed to BED

reader.to_bed(output_path)

Parameters:

  • output_path chr_ids Identical to to_bedgraph and to_wig methods.
  • col_count Only write this number of columns (eg, 3 for chr, start and end). All by default.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

gwseq_io_pp-0.0.1.tar.gz (11.5 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

gwseq_io_pp-0.0.1-py3-none-any.whl (12.1 kB view details)

Uploaded Python 3

File details

Details for the file gwseq_io_pp-0.0.1.tar.gz.

File metadata

  • Download URL: gwseq_io_pp-0.0.1.tar.gz
  • Upload date:
  • Size: 11.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.5

File hashes

Hashes for gwseq_io_pp-0.0.1.tar.gz
Algorithm Hash digest
SHA256 7d61485ff879fce348e489fab6b3b0972cc3afdb22d0bf563e12b4674d806c06
MD5 6b61725489ab0d3d2e972da1721a2170
BLAKE2b-256 f056e95519640a940b125d670d8ed16107a87428768cf0ae4aa064116e6e2984

See more details on using hashes here.

File details

Details for the file gwseq_io_pp-0.0.1-py3-none-any.whl.

File metadata

  • Download URL: gwseq_io_pp-0.0.1-py3-none-any.whl
  • Upload date:
  • Size: 12.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.5

File hashes

Hashes for gwseq_io_pp-0.0.1-py3-none-any.whl
Algorithm Hash digest
SHA256 8ea4c4edbedbfdc345e1376b5d349d47f4e7da4dbc9f7b105d996ebde7334087
MD5 fca7870e208bdd320f0d13b17f368ff9
BLAKE2b-256 e4edf601aa9a4a093e54c156466de62f8e47bcf01f946ad870c19c9f29167296

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page