Tools for working with genomic intervals and sequences

viewtools

Supports Python 3.8+.

Tools for rearranging genomic sequences and coordinates using bioframe-style view files.

Features

  • 🧬 Genome Rearrangement: Extract, concatenate, and reverse complement genomic regions
  • 📍 Coordinate Transformation: Remap genomic intervals (BED files) to match rearranged assemblies
  • 🧵 Strand Handling: Automatic strand orientation for both sequences and coordinates
  • 🔄 Flexible I/O: Support for stdin/stdout, gzip compression, and multiple file formats
  • 🐍 Python API: Programmatic access with pandas DataFrames
  • Fast: Built on bioframe for efficient genomic interval operations

Installation

# Using uv (recommended)
uv pip install viewtools

# Using pip
pip install viewtools

Development Installation

git clone https://github.com/phlya/viewtools.git
cd viewtools
uv pip install -e ".[dev]"

Quick Start

Command Line

Rearrange a genome

# Create a view file (TSV)
cat > view.tsv << EOF
chrom	start	end	name	strand	new_chrom
chr1	1000000	2000000	region1	+	custom_chr1
chr2	500000	1500000	region2	-	custom_chr1
EOF

# Rearrange genome
viewtools rearrange-genome genome.fasta --view view.tsv --out custom_genome.fasta

Rearrange BED coordinates

# Rearrange genomic intervals to match the new assembly
viewtools rearrange-bedframe intervals.bed --view view.tsv --out rearranged.bed

# Use with pipes
cat intervals.bed | viewtools rearrange-bedframe --view view.tsv | head

Python API

import pandas as pd
from viewtools.core.utils import read_fastas, read_view, write_fasta
from viewtools.api.rearrange import rearrange_genome, rearrange_bedframe

# Rearrange genome sequences
sequences = read_fastas(["genome.fasta"])
view = read_view("view.tsv")
custom_sequences = rearrange_genome(sequences, view, out_name_col="new_chrom")
write_fasta(custom_sequences, "custom_genome.fasta")

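# Rearrange BED coordinates
# Note: pd.read_csv treats the first line as column names, so this assumes
# intervals.bed has a header row (e.g. chrom, start, end).
bedframe = pd.read_csv("intervals.bed", sep="\t")
rearranged = rearrange_bedframe(bedframe, view, out_name_col="new_chrom")
rearranged.to_csv("rearranged.bed", sep="\t", index=False)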

View File Format

View files are TSV/CSV files that define how to rearrange genomic regions:

Required columns:

  • chrom: Source chromosome name
  • start: Start position (0-based)
  • end: End position (exclusive)
  • new_chrom: Target chromosome name (or custom column via --out-name-col)

Optional columns:

  • name: Region name
  • strand: Orientation (+ or -)

Example:

chrom	start	end	name	strand	new_chrom
chr1	0	1000000	seg1	+	custom1
chr1	2000000	3000000	seg2	-	custom1
chr2	0	1000000	seg3	+	custom2
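The column rules above can be checked before a view file is handed to the tool. The following is a minimal sketch using only the standard library; `validate_view` and `REQUIRED_COLUMNS` are hypothetical names for illustration and are not part of the viewtools API. Treating a missing strand as `+` is an assumption, not documented behavior.

```python
import csv
import io

# Column names taken from the "Required columns" list above.
REQUIRED_COLUMNS = ("chrom", "start", "end", "new_chrom")


def validate_view(text, sep="\t"):
    """Parse view-file text and return its rows as dicts; raise ValueError on problems."""
    reader = csv.DictReader(io.StringIO(text), delimiter=sep)
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing required columns: {missing}")
    rows = []
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        start, end = int(row["start"]), int(row["end"])
        if not 0 <= start < end:
            raise ValueError(f"line {line_no}: need 0 <= start < end")
        strand = row.get("strand") or "+"  # assumption: missing strand means '+'
        if strand not in ("+", "-"):
            raise ValueError(f"line {line_no}: strand must be '+' or '-'")
        rows.append({**row, "start": start, "end": end, "strand": strand})
    return rows


example = (
    "chrom\tstart\tend\tname\tstrand\tnew_chrom\n"
    "chr1\t0\t1000000\tseg1\t+\tcustom1\n"
    "chr1\t2000000\t3000000\tseg2\t-\tcustom1\n"
    "chr2\t0\t1000000\tseg3\t+\tcustom2\n"
)
rows = validate_view(example)
print(len(rows), rows[1]["strand"])  # 3 -
```
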

Commands

rearrange-genome

Build a custom reference FASTA from input FASTA(s) using a bioframe-style view file.

viewtools rearrange-genome [OPTIONS] FASTA...

Options:

  • --view, -v PATH: View table path (required)
  • --out, -o PATH: Output FASTA path, use '-' for stdout (required)
  • --only-modified, -m: Only write contigs mentioned in the view
  • --chroms, -c TEXT: Restrict output to specific chromosomes
  • --sep, -s TEXT: Separator used in view file (default: tab)

Examples:

# Basic usage
viewtools rearrange-genome genome.fasta --view regions.tsv --out custom.fasta

# Multiple input files
viewtools rearrange-genome chr*.fasta --view regions.tsv --out custom.fasta

# Output to stdout and pipe
viewtools rearrange-genome genome.fasta --view regions.tsv --out - | gzip > custom.fasta.gz
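To make the effect of a view file concrete, here is a toy sketch of the extract, reverse complement, and concatenate steps on in-memory strings. It is not viewtools' implementation: `build_contigs` and `reverse_complement` are hypothetical helpers, and joining segments in view row order is an assumption.

```python
# Complement table for unambiguous bases; other characters (e.g. N) pass through.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def build_contigs(sequences, view_rows):
    """Concatenate view segments into new contigs.

    view_rows holds (chrom, start, end, strand, new_chrom) tuples with
    0-based, end-exclusive coordinates, processed in the order given.
    """
    contigs = {}
    for chrom, start, end, strand, new_chrom in view_rows:
        piece = sequences[chrom][start:end]
        if strand == "-":
            piece = reverse_complement(piece)
        contigs[new_chrom] = contigs.get(new_chrom, "") + piece
    return contigs


sequences = {"chr1": "AAAACCCC", "chr2": "GGGGTTTA"}
view_rows = [
    ("chr1", 0, 4, "+", "custom_chr1"),
    ("chr2", 4, 8, "-", "custom_chr1"),
]
print(build_contigs(sequences, view_rows))  # {'custom_chr1': 'AAAATAAA'}
```
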

rearrange-bedframe

Rearrange BED-like coordinates according to a bioframe-style view file.

viewtools rearrange-bedframe [OPTIONS] [BEDFRAME]

Options:

  • --view, -v PATH: View table path (required)
  • --out, -o PATH: Output path, use '-' for stdout (default: stdout)
  • --out-name-col, -n TEXT: Column name for new chromosome names (default: 'new_chrom')
  • --split-overlaps/--no-split-overlaps: Split intervals that overlap multiple view segments (default: --split-overlaps)
  • --sep, -s TEXT: Separator for input and view files (default: tab)

Examples:

# Read from file, write to file
viewtools rearrange-bedframe intervals.bed --view view.tsv --out rearranged.bed

# Use pipes (stdin/stdout)
cat intervals.bed | viewtools rearrange-bedframe --view view.tsv > rearranged.bed

# Don't split overlapping intervals
viewtools rearrange-bedframe intervals.bed --view view.tsv --no-split-overlaps

# Integrate with bedtools
cat intervals.bed | viewtools rearrange-bedframe --view view.tsv | \
    bedtools intersect -a stdin -b features.bed
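The idea behind --split-overlaps can be sketched as clipping an interval to every view segment it touches. This is an illustration under stated assumptions, not the tool's code: `split_by_segments` is a hypothetical helper, and coordinates are taken to be 0-based and end-exclusive as in the view file format.

```python
def split_by_segments(interval, segments):
    """Clip a half-open (chrom, start, end) interval to each segment it overlaps."""
    chrom, start, end = interval
    pieces = []
    for seg_chrom, seg_start, seg_end in segments:
        if seg_chrom != chrom:
            continue
        lo, hi = max(start, seg_start), min(end, seg_end)
        if lo < hi:  # keep only non-empty overlaps
            pieces.append((chrom, lo, hi))
    return pieces


segments = [("chr1", 0, 1000), ("chr1", 2000, 3000)]
print(split_by_segments(("chr1", 900, 2100), segments))
# [('chr1', 900, 1000), ('chr1', 2000, 2100)]
```

The part of the interval between the two segments (1000 to 2000) is not covered by the view, so it has no counterpart in the rearranged assembly and is dropped in this sketch.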

Use Cases

1. Create Custom Reference Genomes

Extract and concatenate specific genomic regions to create custom reference assemblies:

# Extract centromeric regions from multiple chromosomes
viewtools rearrange-genome genome.fasta \
    --view centromeres.tsv \
    --out centromeric_assembly.fasta \
    --only-modified

2. Generate Reverse Complement Sequences

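# Reverse complement specific regions
# (the view file includes the header row described under View File Format)
printf 'chrom\tstart\tend\tname\tstrand\tnew_chrom\n' > reverse.tsv
printf 'chr1\t0\t1000000\trc_region\t-\tchr1_rc\n' >> reverse.tsv
viewtools rearrange-genome genome.fasta --view reverse.tsv --out rc.fasta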

3. Update Genomic Annotations

After rearranging a genome, update BED files, gene annotations, or other interval-based data:

# Rearrange genome
viewtools rearrange-genome genome.fasta --view regions.tsv --out custom.fasta

# Rearrange corresponding gene annotations
viewtools rearrange-bedframe genes.bed --view regions.tsv --out custom_genes.bed

# Rearrange ChIP-seq peaks
viewtools rearrange-bedframe peaks.bed --view regions.tsv --out custom_peaks.bed

4. Strand-Aware Coordinate Transformation

The tool automatically handles strand orientation:

# Input: intervals with strand information
# View: segments with strand orientation
# Output strand: + when the interval and segment strands match, - when they differ
viewtools rearrange-bedframe stranded_intervals.bed \
    --view stranded_view.tsv \
    --out transformed.bed
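The strand rule above (same=+, opposite=-) pairs naturally with a coordinate flip for segments placed on the minus strand. The sketch below shows one common way to do that arithmetic. `remap_interval` and its `offset` parameter (where the segment begins in the new contig) are hypothetical, and the flip formula is an assumption about how such a remapping typically works, not a description of viewtools internals.

```python
def remap_interval(interval, segment, offset):
    """Map an interval lying inside a view segment to new-assembly coordinates.

    interval: (start, end, strand), 0-based and end-exclusive
    segment:  (seg_start, seg_end, seg_strand) from the view
    offset:   position in the new contig where this segment begins
    """
    start, end, strand = interval
    seg_start, seg_end, seg_strand = segment
    if not (seg_start <= start < end <= seg_end):
        raise ValueError("interval must lie inside the segment")
    if seg_strand == "+":
        new_start = offset + (start - seg_start)
        new_end = offset + (end - seg_start)
    else:
        # The segment is reversed, so distances are measured from its end.
        new_start = offset + (seg_end - end)
        new_end = offset + (seg_end - start)
    new_strand = "+" if strand == seg_strand else "-"
    return new_start, new_end, new_strand


# Second segment of the Quick Start view: chr2:500000-1500000 on '-',
# placed after a 1,000,000 bp segment in custom_chr1.
segment = (500000, 1500000, "-")
print(remap_interval((600000, 700000, "+"), segment, offset=1000000))
# (1800000, 1900000, '-')
print(remap_interval((600000, 700000, "-"), segment, offset=1000000))
# (1800000, 1900000, '+')
```
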

Documentation

Full documentation is available at: https://viewtools.readthedocs.io/

Contributing

Contributions are welcome! Please see CONTRIBUTING.md for guidelines.

Running Tests

# Install dev dependencies
uv pip install -e ".[dev]"

# Run tests
pytest

# Run with coverage
pytest --cov=viewtools --cov-report=html

# Run linting
ruff check .
black --check .

License

MIT License - see LICENSE file for details.

Citation

If you use viewtools in your research, please cite this repository.

Acknowledgments

  • Built with bioframe for genomic interval operations
  • Inspired by the need for flexible genome rearrangement in Hi-C and other genomics workflows
