gwseq-io

Process BBI (bigWig/bigBed) and HiC files

Project description

Installation

pip install gwseq-io

Requires numpy and pybind11.

Usage

Open bigWig, bigBed and HiC files

reader = gwseq_io.open(path)

Attributes for bigWig and bigBed files:

main_header: General file formatting info.
zoom_headers: Zoom level info (reduction level and location).
auto_sql: Declaration of the BED entry fields (bigBed only).
total_summary: Statistical summary of all values in the file (coverage, sums and extremes).
chr_sizes: Chromosome IDs and sizes.
type: Either "bigwig" or "bigbed".

Attributes for HiC files:

header, footer: General file info.
chr_sizes: Chromosome IDs and sizes.
normalizations: Available normalizations.
units: Available units.
bin_sizes: Available bin sizes.

Read bigWig and bigBed signal

values = reader.read_signal(chr_ids, starts, ends)
values = reader.read_signal(chr_ids, starts=starts, span=span)
values = reader.read_signal(chr_ids, ends=ends, span=span)
values = reader.read_signal(chr_ids, centers=centers, span=span)

Parameters:

chr_ids, starts, ends, centers: Chromosome IDs, starts, ends and centers of the locations. Specify either both starts and ends, or exactly one of starts, ends or centers together with span.
span: Reading window in bp relative to the starts, ends or centers of the locations. If span is specified, exactly one of these references may be given. Not specified by default.
bin_size: Reading bin size in bp. May vary in the output if locations have variable spans or if bin_count is specified. 1 by default.
bin_count: Output bin count. Inferred as max location span / bin_size by default.
bin_mode: Method used to aggregate values within a bin. Either "mean", "sum" or "count". "mean" by default.
full_bin: If true, extend location ends to cover overlapping bins. False by default.
def_value: Value to use when no data overlaps a bin. 0 by default.
zoom: BigWig zoom level to use. Use full data if -1. If 0, auto-detect the best level by selecting the largest level whose bin size is lower than a third of bin_size (may be the full data). Full data by default.
progress: Function called during data extraction. Takes the extracted coverage and the total coverage in bp as parameters. Uses the default callback function if true. None by default.

Returns a numpy float32 array of shape (locations, bin count).
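The zoom auto-detection rule described for the zoom parameter can be sketched in plain Python. This is an illustration only, not the library's implementation: pick_zoom_level is a hypothetical helper, and the list of zoom bin sizes (reduction levels) is assumed to be known, for instance from zoom_headers.

```python
def pick_zoom_level(zoom_bin_sizes, bin_size):
    """Return the bin size of the zoom level to read, or None for full data.

    zoom_bin_sizes: reduction levels (bp per zoom bin) available in the file.
    bin_size: requested output bin size in bp.
    """
    # Keep only levels whose bin size is lower than a third of bin_size.
    candidates = [z for z in zoom_bin_sizes if z < bin_size / 3]
    # The largest candidate is the coarsest level that is still fine enough;
    # with no candidate, fall back to the full-resolution data.
    return max(candidates) if candidates else None


levels = [10, 40, 160, 640]  # made-up reduction levels
print(pick_zoom_level(levels, 1000))  # 160
print(pick_zoom_level(levels, 20))    # None (full data)
```

With a requested bin size of 1000 bp, the threshold is about 333 bp, so the 160 bp level is the coarsest one that qualifies.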

Quantify bigWig and bigBed signal

values = reader.quantify(chr_ids, starts, ends)

Parameters:

chr_ids, starts, ends, centers, span, bin_size, full_bin, def_value, zoom, progress: Identical to the read_signal method.
reduce: Method used to aggregate values over the span. Either "mean", "sd", "sem", "sum", "count", "min" or "max". "mean" by default.

Returns a numpy float32 array of shape (locations).
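The reduce options can be read as ordinary summary statistics over the values of one location. The sketch below uses only the standard library; REDUCERS and reduce_values are hypothetical names, and it assumes that "sd" is the sample standard deviation and "sem" is sd / sqrt(n), which the description above does not state.

```python
import math
import statistics

# Hypothetical mapping from reduce names to plain-Python equivalents.
# Assumption: "sd" uses the sample estimator (n - 1 denominator).
REDUCERS = {
    "mean": statistics.fmean,
    "sd": statistics.stdev,
    "sem": lambda v: statistics.stdev(v) / math.sqrt(len(v)),
    "sum": math.fsum,
    "count": len,
    "min": min,
    "max": max,
}


def reduce_values(values, reduce="mean"):
    """Collapse the values read over one location into a single number."""
    return REDUCERS[reduce](values)


signal = [1.0, 2.0, 3.0, 4.0]  # made-up per-bin values for one location
print(reduce_values(signal, "mean"))  # 2.5
print(reduce_values(signal, "sum"))   # 10.0
```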

Profile bigWig and bigBed signal

values = reader.profile(chr_ids, starts, ends)

Parameters:

chr_ids, starts, ends, centers, span, bin_size, bin_count, bin_mode, full_bin, def_value, zoom, progress: Identical to the read_signal method.
reduce: Method used to aggregate values over locations. Either "mean", "sd", "sem", "sum", "count", "min" or "max". "mean" by default.

Returns a numpy float32 array of shape (bin count).
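Conceptually, a profile aggregates a (locations, bin count) table column by column, which is why the result has one value per bin. A minimal pure-Python sketch of that idea follows; profile_from_signal is a hypothetical helper and the input values are made up.

```python
from statistics import fmean


def profile_from_signal(signal, reducer=fmean):
    """Aggregate a (locations x bins) table column by column.

    signal: list of rows, one per location, all with the same bin count.
    reducer: function applied to the values of each bin across locations.
    """
    # zip(*signal) transposes the rows, so each item holds one bin's values.
    return [reducer(column) for column in zip(*signal)]


signal = [
    [0.0, 1.0, 4.0],  # location 1
    [2.0, 3.0, 0.0],  # location 2
]
print(profile_from_signal(signal))       # [1.0, 2.0, 2.0]
print(profile_from_signal(signal, max))  # [2.0, 3.0, 4.0]
```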

Read bigBed entries

values = reader.read_entries(chr_ids, starts, ends)

Parameters:

chr_ids, starts, ends, centers, span, progress: Identical to the read_signal method.

Returns a list with one item per location; each item is a list of entries, and each entry is a dict with at least "chr", "start" and "end" keys.
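The nested return value can be walked with ordinary list and dict operations. The data below is made up to mimic the structure described above (the "name" key is an assumed extra field), and summarize is a hypothetical helper.

```python
# Made-up result for two query locations: one list of entry dicts each.
entries_per_location = [
    [
        {"chr": "chr1", "start": 1000, "end": 1010, "name": "read#1"},
        {"chr": "chr1", "start": 1005, "end": 1030, "name": "read#2"},
    ],
    [],  # no entry overlaps the second location
]


def summarize(entries_per_location):
    """Return (entry count, summed entry length in bp) for each location."""
    return [
        (len(entries), sum(e["end"] - e["start"] for e in entries))
        for entries in entries_per_location
    ]


print(summarize(entries_per_location))  # [(2, 35), (0, 0)]
```

The summed length counts overlapping bases once per entry, so it is not the same as the covered region.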

Convert bigWig to bedGraph or WIG

reader.to_bedgraph(output_path)
reader.to_wig(output_path)

Parameters:

output_path: Path to the output file.
chr_ids: Only extract data from these chromosomes. All by default.
zoom: Zoom level to use. Full data by default.
progress: Function called during data extraction. Takes the extracted coverage and the total coverage in bp as parameters. None by default.
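A progress callback only needs to accept the two coverage values described above. The sketch below is one possible shape; format_progress and print_progress are hypothetical names, and passing the callback as progress=print_progress is shown in a comment only, since it needs a real file.

```python
def format_progress(extracted_bp, total_bp):
    """Build a one-line progress message from coverage counts in bp."""
    percent = 100.0 * extracted_bp / total_bp if total_bp else 100.0
    return f"{extracted_bp:,} / {total_bp:,} bp ({percent:.1f}%)"


def print_progress(extracted_bp, total_bp):
    """Callback with the two-argument signature described above."""
    print(format_progress(extracted_bp, total_bp), end="\r", flush=True)


# Assumed usage: reader.to_bedgraph(output_path, progress=print_progress)
print(format_progress(2_500_000, 10_000_000))  # 2,500,000 / 10,000,000 bp (25.0%)
```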

Convert bigBed to BED

reader.to_bed(output_path)

Parameters:

output_path, chr_ids, progress: Identical to the to_bedgraph and to_wig methods.
col_count: Only write this number of columns (e.g., 3 for chr, start and end). All by default.

Write bigWig file

writer = gwseq_io.open(path, "w")
writer = gwseq_io.open(path, "w", def_value=0)
writer = gwseq_io.open(path, "w", chr_sizes={"chr1": 1234, "chr2": 1234})
writer.add_entry("chr1", start=1000, end=1010, value=0.1)
writer.add_value("chr1", start=1000, span=10, value=0.1)
writer.add_values("chr1", start=1000, span=10, values=[0.1, 0.1, 0.1, 0.1])

Entries must be grouped by chromosome and sorted by (1) start, then (2) end, with no overlaps.
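The ordering rule above can be checked before writing. The following is a sketch, not part of the library: is_valid_bigwig_order is a hypothetical helper, and it assumes half-open intervals, so an entry may start exactly where the previous one ends.

```python
def is_valid_bigwig_order(entries):
    """Check the ordering rule stated above for bigWig entries.

    entries: iterable of (chr_id, start, end) tuples in writing order.
    Assumption: intervals are half-open, so touching entries do not overlap.
    """
    finished = set()  # chromosomes whose block of entries has ended
    prev = None       # previous (chr_id, start, end)
    for chr_id, start, end in entries:
        if prev is not None and chr_id != prev[0]:
            finished.add(prev[0])
        if chr_id in finished:
            return False  # chromosome seen again after another one
        if prev is not None and chr_id == prev[0]:
            if (start, end) < (prev[1], prev[2]):
                return False  # not sorted by start, then end
            if start < prev[2]:
                return False  # overlaps the previous entry
        prev = (chr_id, start, end)
    return True


print(is_valid_bigwig_order([("chr1", 1000, 1010), ("chr1", 1010, 1020)]))  # True
print(is_valid_bigwig_order([("chr1", 1000, 1010), ("chr1", 1005, 1020)]))  # False
```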

Write bigBed file

writer = gwseq_io.open(path, "w", type="bigbed")
writer = gwseq_io.open(path, "w", type="bigbed", chr_sizes={"chr1": 1234, "chr2": 1234})
writer = gwseq_io.open(path, "w", type="bigbed", fields=["chr", "start", "end", "name"])
writer = gwseq_io.open(path, "w", type="bigbed", fields={"chr": "string", "start": "uint", "end": "uint", "name": "string"})
writer.add_entry("chr1", start=1000, end=1010)
writer.add_entry("chr1", start=1000, end=1010, fields={"name": "read#1"})

Entries must be grouped by chromosome and sorted by (1) start, then (2) end. They may overlap.
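Unsorted entries can be put into the required order with a single sort. In this sketch, order_for_writing is a hypothetical helper, and chromosomes are kept in order of first appearance, which is an assumption; the text above does not say whether a particular chromosome order is required.

```python
def order_for_writing(entries):
    """Return entries grouped by chromosome and sorted by (start, end).

    entries: iterable of (chr_id, start, end) tuples in any order.
    """
    entries = list(entries)
    # Rank chromosomes by first appearance (dicts keep insertion order).
    rank = {}
    for chr_id, _, _ in entries:
        rank.setdefault(chr_id, len(rank))
    # Sorting on (chromosome rank, start, end) groups and orders in one pass.
    return sorted(entries, key=lambda e: (rank[e[0]], e[1], e[2]))


unsorted_entries = [
    ("chr2", 5, 9),
    ("chr1", 1000, 1010),
    ("chr1", 1000, 1005),
    ("chr2", 1, 4),
]
for entry in order_for_writing(unsorted_entries):
    print(entry)
```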

Read HiC signal

values = reader.read_signal(chr_ids, starts, ends)
values = reader.read_signal(chr_ids, starts=starts, span=span)
values = reader.read_signal(chr_ids, ends=ends, span=span)
values = reader.read_signal(chr_ids, centers=centers, span=span)

Parameters:

chr_ids, starts, ends, centers: Chromosome IDs, starts, ends and centers of the 2 locations. Specify either both starts and ends, or exactly one of starts, ends or centers together with span.
span: Reading window in bp relative to the starts, ends or centers of the locations. If span is specified, exactly one of these references may be given. Not specified by default.
bin_size: Reading bin size in bp. May vary in the output if locations have variable spans or if bin_count is specified. 1 by default.
bin_count: Output bin count. Inferred as max location span / bin_size by default.
bin_mode: Method used to aggregate values within a bin. Either "mean", "sum" or "count". "mean" by default.
full_bin: If true, extend location ends to cover overlapping bins. False by default.
def_value: Value to use when no data overlaps a bin. 0 by default.
zoom: BigWig zoom level to use. Use full data if -1. If 0, auto-detect the best level by selecting the largest level whose bin size is lower than a third of bin_size (may be the full data). Full data by default.
progress: Function called during data extraction. Takes the extracted coverage and the total coverage in bp as parameters. Uses the default callback function if true. None by default.

Returns a numpy float32 array of shape (locations, bin count).

reader = gwseq_io.open(path)
values = reader.read_signal(chr_ids, starts, ends)

Parameters:

chr_ids, starts, ends: Chromosome IDs, starts and ends of the 2 locations.
bin_size: Input bin size, or -1 to use the smallest. Must be available in the file. Smallest by default.
bin_count: Max output bin count. If specified, takes precedence over bin_size by selecting the smallest bin size such that the output width and height are not larger than bin_count. Not specified by default.
full_bin: If true, extend location ends to cover overlapping bins. False by default.
def_value: Value to use when no data overlaps a bin. 0 by default.
mode: Either "observed" or "oe" (observed/expected). "observed" by default.
normalization: Either "none" or any normalization available in the file, such as "kr", "vc" or "vc_sqrt". "none" by default.
unit: Either "bp" or "frag". "bp" by default.
triangle: If true, skip symmetrical data. False by default.
max_distance: Max contact distance in bp to report. All if -1. All by default.

Outputs a numpy float32 array of shape (location 1 span//bin_size, location 2 span//bin_size).
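The way bin_count takes precedence over bin_size can be sketched as a search over the bin sizes available in the file. This is an illustration under stated assumptions, not the library's code: pick_bin_size is a hypothetical helper, the output size is approximated as span // bin_size to match the shape given above, and returning None when nothing fits is an arbitrary choice for the sketch.

```python
def pick_bin_size(available, span_1, span_2, bin_count):
    """Smallest available bin size whose output fits within bin_count.

    available: bin sizes present in the file (bp), e.g. reader.bin_sizes.
    span_1, span_2: lengths in bp of the two locations.
    """
    for size in sorted(available):
        # Output height and width are the spans divided by the bin size.
        if max(span_1, span_2) // size <= bin_count:
            return size
    return None  # even the largest bin size yields too many bins


sizes = [5_000, 10_000, 25_000, 100_000]  # made-up available bin sizes
print(pick_bin_size(sizes, 2_000_000, 1_000_000, 100))  # 25000
```

Here 5 kb and 10 kb bins would give 400 and 200 bins for the 2 Mb location, so 25 kb (80 bins) is the first size that fits within 100.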

Read sparse signal

reader = gwseq_io.open(path)
values = reader.read_sparse_signal(chr_ids, starts, ends)

Parameters:

chr_ids, starts, ends, bin_size, bin_count, full_bin, mode, normalization, unit, triangle, max_distance: Identical to the read_signal method.

Returns a COO sparse matrix as a dict with keys:

values: Values as a numpy float32 array.
row: Row indices of the values as a numpy uint32 array.
col: Column indices of the values as a numpy uint32 array.
shape: Shape of the dense array as a tuple.

Convert to a SciPy sparse array in Python with scipy.sparse.csr_array((x["values"], (x["row"], x["col"])), shape=x["shape"]), where x is the returned dict.
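To make the COO layout concrete, the sketch below expands such a dict into a dense table without any dependency. coo_to_dense is a hypothetical helper and the input is made up; plain lists stand in for the numpy arrays, which can be iterated the same way.

```python
def coo_to_dense(coo, def_value=0.0):
    """Expand a COO dict (values, row, col, shape) into a list of rows."""
    n_rows, n_cols = coo["shape"]
    # Start from a table filled with the default value.
    dense = [[def_value] * n_cols for _ in range(n_rows)]
    # Each triplet (value, row, col) places one stored value in the table.
    for value, r, c in zip(coo["values"], coo["row"], coo["col"]):
        dense[r][c] = value
    return dense


coo = {
    "values": [1.5, 2.0, 0.5],
    "row": [0, 0, 2],
    "col": [0, 2, 1],
    "shape": (3, 3),
}
for row in coo_to_dense(coo):
    print(row)
```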

Project details

Release history

0.0.20: Feb 22, 2026
0.0.19: Feb 20, 2026
0.0.18: Feb 20, 2026
0.0.17: Feb 20, 2026
0.0.16: Feb 9, 2026
0.0.15: Feb 9, 2026
0.0.14: Jan 11, 2026
0.0.13: Jan 11, 2026
0.0.12: Dec 10, 2025
0.0.11 (yanked): Dec 10, 2025
0.0.10 (yanked): Dec 9, 2025
0.0.9 (yanked): Nov 18, 2025
0.0.8 (yanked): Nov 17, 2025
0.0.7 (yanked): Nov 16, 2025
0.0.6 (yanked): Nov 6, 2025
0.0.5 (yanked): Nov 6, 2025
0.0.4 (yanked): Nov 6, 2025
0.0.3 (yanked): Nov 6, 2025
0.0.2 (yanked): Oct 31, 2025
0.0.1 (yanked, this version): Oct 26, 2025

Download files


Source Distribution

gwseq_io-0.0.1.tar.gz (49.0 kB)

Uploaded Oct 26, 2025 Source

Built Distribution


gwseq_io-0.0.1-cp313-cp313-macosx_11_0_arm64.whl (450.6 kB)

Uploaded Oct 26, 2025 CPython 3.13, macOS 11.0+ ARM64

File details

Details for the file gwseq_io-0.0.1.tar.gz.

File metadata

Download URL: gwseq_io-0.0.1.tar.gz
Upload date: Oct 26, 2025
Size: 49.0 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.5

File hashes

Hashes for gwseq_io-0.0.1.tar.gz
Algorithm	Hash digest
SHA256	`4b542aa0321b223d68fbfb2091fedcc4fb38f6c52ed9df46598726611a76427b`
MD5	`8303c5f1ad44816998fdc8f333f07363`
BLAKE2b-256	`7971401fb839860f2246f8a86a215ab9f9ce4287f5c16ecf0b29ced96ca7cc56`


File details

Details for the file gwseq_io-0.0.1-cp313-cp313-macosx_11_0_arm64.whl.

File metadata

Download URL: gwseq_io-0.0.1-cp313-cp313-macosx_11_0_arm64.whl
Upload date: Oct 26, 2025
Size: 450.6 kB
Tags: CPython 3.13, macOS 11.0+ ARM64
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.5

File hashes

Hashes for gwseq_io-0.0.1-cp313-cp313-macosx_11_0_arm64.whl
Algorithm	Hash digest
SHA256	`69b4bdc5bffaa990a1408a034501b47e9ec38bffc8fe2b52b2ed90a364876b59`
MD5	`ff46b38a9892adbf1749e26c15f0c0fc`
BLAKE2b-256	`11008700d64c56113a2db5350177a3f78c54e1c3a9202974e3dcec8dcf1de8bc`

