Tools for working with genomic intervals and sequences
viewtools
Tools for rearranging genomic sequences and coordinates using bioframe-style view files.
Features
- 🧬 Genome Rearrangement: Extract, concatenate, and reverse complement genomic regions
- 📍 Coordinate Transformation: Remap genomic intervals (BED files) to match rearranged assemblies
- 🧵 Strand Handling: Automatic strand orientation for both sequences and coordinates
- 🔄 Flexible I/O: Support for stdin/stdout, gzip compression, and multiple file formats
- 🐍 Python API: Programmatic access with pandas DataFrames
- ⚡ Fast: Built on bioframe for efficient genomic interval operations
Installation
# Using uv (recommended)
uv pip install viewtools
# Using pip
pip install viewtools
Development Installation
git clone https://github.com/phlya/viewtools.git
cd viewtools
uv pip install -e ".[dev]"
Quick Start
Command Line
Rearrange a genome
# Create a view file (TSV)
cat > view.tsv << EOF
chrom start end name strand new_chrom
chr1 1000000 2000000 region1 + custom_chr1
chr2 500000 1500000 region2 - custom_chr1
EOF
# Rearrange genome
viewtools rearrange-genome genome.fasta --view view.tsv --out custom_genome.fasta
Rearrange BED coordinates
# Rearrange genomic intervals to match the new assembly
viewtools rearrange-bedframe intervals.bed --view view.tsv --out rearranged.bed
# Use with pipes
cat intervals.bed | viewtools rearrange-bedframe --view view.tsv | head
Python API
import pandas as pd
from viewtools.core.utils import read_fastas, read_view, write_fasta
from viewtools.api.rearrange import rearrange_genome, rearrange_bedframe
# Rearrange genome sequences
sequences = read_fastas(["genome.fasta"])
view = read_view("view.tsv")
custom_sequences = rearrange_genome(sequences, view, out_name_col="new_chrom")
write_fasta(custom_sequences, "custom_genome.fasta")
# Rearrange BED coordinates
bedframe = pd.read_csv("intervals.bed", sep="\t")
rearranged = rearrange_bedframe(bedframe, view, out_name_col="new_chrom")
rearranged.to_csv("rearranged.bed", sep="\t", index=False)
View File Format
View files are TSV/CSV files that define how to rearrange genomic regions:
Required columns:
- chrom: Source chromosome name
- start: Start position (0-based)
- end: End position (exclusive)
- new_chrom: Target chromosome name (or a custom column via --out-name-col)
Optional columns:
- name: Region name
- strand: Orientation (+ or -)
Example:
chrom start end name strand new_chrom
chr1 0 1000000 seg1 + custom1
chr1 2000000 3000000 seg2 - custom1
chr2 0 1000000 seg3 + custom2
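The column rules above can be checked before a view file is passed to the tool. The following is a minimal sketch using only the standard library; `parse_view` is a hypothetical helper written for illustration, not part of the viewtools API, and treating a missing strand as `+` is an assumption rather than documented behaviour.

```python
import csv
import io


def parse_view(text, sep="\t", out_name_col="new_chrom"):
    """Parse a view table and validate it (hypothetical helper, not viewtools code)."""
    reader = csv.DictReader(io.StringIO(text), delimiter=sep)
    required = ("chrom", "start", "end", out_name_col)
    missing = [c for c in required if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"view is missing required columns: {missing}")

    segments = []
    for row in reader:
        start, end = int(row["start"]), int(row["end"])
        # Coordinates are 0-based and half-open, so start must be below end.
        if not 0 <= start < end:
            raise ValueError(f"invalid interval in view: {row}")
        # Assumption: a missing or empty strand is treated as '+'.
        strand = row.get("strand") or "+"
        if strand not in ("+", "-"):
            raise ValueError(f"invalid strand in view: {row}")
        segments.append(
            {
                "chrom": row["chrom"],
                "start": start,
                "end": end,
                "strand": strand,
                "new_chrom": row[out_name_col],
            }
        )
    return segments


# The example view from this section, written with explicit tab separators.
example = (
    "chrom\tstart\tend\tname\tstrand\tnew_chrom\n"
    "chr1\t0\t1000000\tseg1\t+\tcustom1\n"
    "chr1\t2000000\t3000000\tseg2\t-\tcustom1\n"
    "chr2\t0\t1000000\tseg3\t+\tcustom2\n"
)
segments = parse_view(example)
```

Running a check like this first can surface a missing header row or a wrong separator before a long genome rearrangement starts.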
Commands
rearrange-genome
Build a custom reference FASTA from input FASTA(s) using a bioframe-style view file.
viewtools rearrange-genome [OPTIONS] FASTA...
Options:
- --view, -v PATH: View table path (required)
- --out, -o PATH: Output FASTA path, use '-' for stdout (required)
- --only-modified, -m: Only write contigs mentioned in the view
- --chroms, -c TEXT: Restrict output to specific chromosomes
- --sep, -s TEXT: Separator used in view file (default: tab)
Examples:
# Basic usage
viewtools rearrange-genome genome.fasta --view regions.tsv --out custom.fasta
# Multiple input files
viewtools rearrange-genome chr*.fasta --view regions.tsv --out custom.fasta
# Output to stdout and pipe
viewtools rearrange-genome genome.fasta --view regions.tsv --out - | gzip > custom.fasta.gz
rearrange-bedframe
Rearrange BED-like coordinates according to a bioframe-style view file.
viewtools rearrange-bedframe [OPTIONS] [BEDFRAME]
Options:
- --view, -v PATH: View table path (required)
- --out, -o PATH: Output path, use '-' for stdout (default: stdout)
- --out-name-col, -n TEXT: Column name for new chromosome names (default: 'new_chrom')
- --split-overlaps/--no-split-overlaps: Split intervals overlapping multiple segments (default: True)
- --sep, -s TEXT: Separator for input and view files (default: tab)
Examples:
# Read from file, write to file
viewtools rearrange-bedframe intervals.bed --view view.tsv --out rearranged.bed
# Use pipes (stdin/stdout)
cat intervals.bed | viewtools rearrange-bedframe --view view.tsv > rearranged.bed
# Don't split overlapping intervals
viewtools rearrange-bedframe intervals.bed --view view.tsv --no-split-overlaps
# Integrate with bedtools
cat intervals.bed | viewtools rearrange-bedframe --view view.tsv | \
bedtools intersect -a stdin -b features.bed
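To illustrate what splitting an interval across view segments can mean, here is a minimal sketch in plain Python. It shows one plausible reading of --split-overlaps (clip the interval to each segment it overlaps); `split_by_segments` is a hypothetical function and the actual viewtools behaviour may differ in details such as how unsplit intervals are handled.

```python
def split_by_segments(chrom, start, end, segments):
    """Clip a half-open interval to every view segment it overlaps.

    `segments` is a list of (chrom, start, end) tuples. Hypothetical helper
    for illustration only; this is not the viewtools implementation.
    """
    pieces = []
    for seg_chrom, seg_start, seg_end in segments:
        if seg_chrom != chrom:
            continue
        lo, hi = max(start, seg_start), min(end, seg_end)
        if lo < hi:  # keep only non-empty overlaps
            pieces.append((chrom, lo, hi))
    return pieces


view_segments = [("chr1", 0, 1000000), ("chr1", 2000000, 3000000)]
pieces = split_by_segments("chr1", 900000, 2100000, view_segments)
```

In this example the part of the interval between 1000000 and 2000000 falls outside every segment, so it is absent from `pieces`.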
Use Cases
1. Create Custom Reference Genomes
Extract and concatenate specific genomic regions to create custom reference assemblies:
# Extract centromeric regions from multiple chromosomes
viewtools rearrange-genome genome.fasta \
--view centromeres.tsv \
--out centromeric_assembly.fasta \
--only-modified
2. Generate Reverse Complement Sequences
# Reverse complement specific regions
printf 'chrom\tstart\tend\tname\tstrand\tnew_chrom\n' > reverse.tsv
printf 'chr1\t0\t1000000\trc_region\t-\tchr1_rc\n' >> reverse.tsv
viewtools rearrange-genome genome.fasta --view reverse.tsv --out rc.fasta
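For reference, the operation applied to a region on the - strand can be sketched in a few lines of plain Python. This is a generic illustration of reverse complementing, not the viewtools implementation; `reverse_complement` and `extract_region` are hypothetical names, and only A, C, G, T, and N (upper and lower case) are handled.

```python
# Translation table for the basic nucleotide alphabet; other IUPAC codes
# are left unchanged by str.translate in this sketch.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def extract_region(seq, start, end, strand="+"):
    """Extract the 0-based half-open region [start, end), flipping it for '-'."""
    sub = seq[start:end]
    return reverse_complement(sub) if strand == "-" else sub


toy_chrom = "AACGTTTG"
forward = extract_region(toy_chrom, 0, 5, "+")
flipped = extract_region(toy_chrom, 0, 5, "-")
```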
3. Update Genomic Annotations
After rearranging a genome, update BED files, gene annotations, or other interval-based data:
# Rearrange genome
viewtools rearrange-genome genome.fasta --view regions.tsv --out custom.fasta
# Rearrange corresponding gene annotations
viewtools rearrange-bedframe genes.bed --view regions.tsv --out custom_genes.bed
# Rearrange ChIP-seq peaks
viewtools rearrange-bedframe peaks.bed --view regions.tsv --out custom_peaks.bed
4. Strand-Aware Coordinate Transformation
The tool automatically handles strand orientation:
# Input: intervals with strand information
# View: segments with strand orientation
# Output: combined strand (+ if interval and segment strands match, - otherwise)
viewtools rearrange-bedframe stranded_intervals.bed \
--view stranded_view.tsv \
--out transformed.bed
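The strand rule and the coordinate flip can be sketched as follows. This is a minimal illustration under stated assumptions: the interval lies entirely inside one view segment, segments sharing a new chromosome are concatenated in view order, and `offset` is where the segment starts in the new chromosome. `combine_strands` and `remap_interval` are hypothetical names, not viewtools functions.

```python
def combine_strands(interval_strand, segment_strand):
    """Same strands give '+', opposite strands give '-'."""
    return "+" if interval_strand == segment_strand else "-"


def remap_interval(start, end, seg_start, seg_end, seg_strand, offset=0):
    """Map a half-open interval inside a segment onto the rearranged assembly.

    Assumes the segment is placed at `offset` in the new chromosome.
    """
    if not (seg_start <= start < end <= seg_end):
        raise ValueError("interval must lie within the segment")
    if seg_strand == "+":
        return offset + (start - seg_start), offset + (end - seg_start)
    # On a '-' segment the sequence is reverse complemented, so distances
    # are measured from the segment end and start/end swap roles.
    return offset + (seg_end - end), offset + (seg_end - start)


# Quick Start view: region2 (chr2:500000-1500000, '-') follows region1,
# which is 1000000 bp long, so region2 starts at offset 1000000.
new_start, new_end = remap_interval(600000, 700000, 500000, 1500000, "-", offset=1000000)
new_strand = combine_strands("+", "-")
```

The interval keeps its length (100000 bp) after remapping; only its position and strand change.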
Documentation
Full documentation is available at: https://viewtools.readthedocs.io/
Contributing
Contributions are welcome! Please see CONTRIBUTING.md for guidelines.
Running Tests
# Install dev dependencies
uv pip install -e ".[dev]"
# Run tests
pytest
# Run with coverage
pytest --cov=viewtools --cov-report=html
# Run linting
ruff check .
black --check .
License
MIT License - see LICENSE file for details.
Citation
If you use viewtools in your research, please cite this repository.
Acknowledgments
- Built with bioframe for genomic interval operations
- Inspired by the need for flexible genome rearrangement in Hi-C and other genomics workflows