Skip to main content

Process BBI (bigWig/bigBed) and HiC files

Project description

Installation

pip install gwseq-io-pp

Requires numpy.

Usage

Open bigWig and bigBed files

reader = gwseq_io.open(path, *, zoom_correction)

Parameters:

  • zoom_correction Scaling factor for automatic zoom level selection based on bin size. Only for bigWig files. 1/3 by default.

Attributes for bigWig and bigBed files:

  • main_header General file formatting info.
  • zoom_headers Zooms levels info (reduction level and location).
  • auto_sql BED entries declaration (only in bigBed).
  • total_summary Statistical summary of entire file values (coverage, sums and extremes).
  • chr_sizes Chromosomes IDs and sizes.
  • type Either "bigwig" or "bigbed".

Read bigWig and bigBed signal

values = reader.read_signal(chr_ids, starts, ends)
values = reader.read_signal(chr_ids, starts=starts, span=span)
values = reader.read_signal(chr_ids, ends=ends, span=span)
values = reader.read_signal(chr_ids, centers=centers, span=span)

Parameters:

  • chr_ids starts ends centers Chromosomes ids, starts, ends and centers of locations. Both starts ends or one of starts ends centers (with span) may be specified.
  • span Reading window in bp relative to locations starts ends centers. Only one reference may be specified if specified. Not by default.
  • bin_size Reading bin size in bp. May vary in output if locations have variable spans or bin_count is specified. 1 by default.
  • bin_count Output bin count. Inferred as max location span / bin size by default.
  • bin_mode Method to aggregate bin values. Either "mean", "sum" or "count". "mean" by default.
  • full_bin Extend locations ends to overlapping bins if true. Not by default.
  • def_value Default value to use when no data overlap a bin. 0 by default.
  • zoom BigWig zoom level to use. Use full data if -1. Auto-detect the best level if -2 by selecting the larger level whose bin size is lower than the third of bin_size (may be the full data). Full data by default.

Returns a numpy float32 array of shape (locations, bin count).

Quantify bigWig and bigBed signal

values = reader.quantify(chr_ids, starts, ends)

Parameters:

  • chr_ids starts ends centers span bin_size full_bin def_value zoom Identical to read_signal method.
  • reduce Method to aggregate values over span. Either "mean", "sd", "sem", "sum", "count", "min" or "max". "mean" by default.

Returns a numpy float32 array of shape (locations).

Profile bigWig and bigBed signal

values = reader.profile(chr_ids, starts, ends)

Parameters:

  • chr_ids starts ends centers span bin_size bin_count bin_mode full_bin def_value zoom Identical to read_signal method.
  • reduce Method to aggregate values over locations. Either "mean", "sd", "sem", "sum", "count", "min" or "max". "mean" by default.

Returns a numpy float32 array of shape (bin count).

Read bigBed entries

values = reader.read_entries(chr_ids, starts, ends)

Parameters:

  • chr_ids starts ends centers spans Identical to read_signal method.

Returns a list (locations) of list of entries (dict with at least "chr", "start" and "end" keys).

Convert bigWig to bedGraph or WIG

reader.to_bedgraph(output_path)
reader.to_wig(output_path)

Parameters:

  • output_path Path to output file.
  • chr_ids Only extract data from these chromosomes. All by default.
  • zoom Zoom level to use. Use full data if -1. Full data by default.

Convert bigBed to BED

reader.to_bed(output_path)

Parameters:

  • output_path chr_ids Identical to to_bedgraph and to_wig methods.
  • col_count Only write this number of columns (eg, 3 for chr, start and end). All by default.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

gwseq_io_pp-0.0.3.tar.gz (11.5 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

gwseq_io_pp-0.0.3-py3-none-any.whl (12.1 kB view details)

Uploaded Python 3

File details

Details for the file gwseq_io_pp-0.0.3.tar.gz.

File metadata

  • Download URL: gwseq_io_pp-0.0.3.tar.gz
  • Upload date:
  • Size: 11.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.5

File hashes

Hashes for gwseq_io_pp-0.0.3.tar.gz
Algorithm Hash digest
SHA256 9bfe7d4170950c4145fb27e2e5c0b83edc39f4c149a45f4e9a142f338e9fea0c
MD5 789c98e1c9bc87b00d5497c3059fd031
BLAKE2b-256 3c2d5509412d6d3aa97deb16ab8514416c0c79906c1ddb7fa8e050c86050cea7

See more details on using hashes here.

File details

Details for the file gwseq_io_pp-0.0.3-py3-none-any.whl.

File metadata

  • Download URL: gwseq_io_pp-0.0.3-py3-none-any.whl
  • Upload date:
  • Size: 12.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.5

File hashes

Hashes for gwseq_io_pp-0.0.3-py3-none-any.whl
Algorithm Hash digest
SHA256 b41866714a548d2c3e2ede74bbb61a66eea2f080586ea0facd191170b4960033
MD5 4a25fc0b44a9c9fc4a4841ea47627037
BLAKE2b-256 f1a8c4c8d2cb8b4e1a5419016a172c82023d69703283c3fe01b947633a11f7eb

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page