Bioinformatics file access tools

Project description

This is a collection of scripts and modules for bioinformatics file access.

Modules, Classes, and Functions

xzFile, xzopen()

Access to various compressed files. Currently recognizes gzip (.gz), bz2 (.bz2), and bgzip (.bgz, .b.gz) from the samtools package.
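
The extension-based dispatch described here can be illustrated with the standard library alone. This is a minimal sketch, not BioUtil's implementation: the function name open_by_extension is hypothetical, and the actual xzopen() signature may differ. It relies on the fact that bgzip output is a series of gzip members, so Python's gzip module can read it sequentially.

```python
import bz2
import gzip


def open_by_extension(path, mode="rt"):
    """Open a plain or compressed file, picking the codec from the file name.

    Hypothetical helper for illustration; it is not part of BioUtil.
    """
    lower = path.lower()
    if lower.endswith((".gz", ".bgz")):
        # Matches .gz, .bgz and .b.gz. Reading bgzip files works because they
        # are valid multi-member gzip files. Writing through this branch
        # produces ordinary gzip, not indexable bgzip blocks.
        return gzip.open(path, mode)
    if lower.endswith(".bz2"):
        return bz2.open(path, mode)
    return open(path, mode)
```

Callers can then treat every input the same way, for example `with open_by_extension("calls.vcf.gz") as handle:` followed by a loop over lines.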

tsvFile, tsvRecord, tsv

Tab-separated file with named fields. Users can also define preprocessing functions for reading and writing fields.
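
The idea of named fields with per-field preprocessing can be sketched with the standard csv module. This is an illustration under assumptions, not the tsvFile API: the function read_tsv and its converters argument are hypothetical names.

```python
import csv
import io


def read_tsv(handle, converters=None):
    """Yield one dict per row, keyed by the header line.

    `converters` maps a field name to a function applied to that field's
    raw string value. Hypothetical helper, not part of BioUtil.
    """
    converters = converters or {}
    # QUOTE_NONE keeps quote characters in fields untouched, which is the
    # usual expectation for genomics TSV files.
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        for field, convert in converters.items():
            if field in row:
                row[field] = convert(row[field])
        yield dict(row)


# Example: parse positions and depths as integers while reading.
sample = io.StringIO("chrom\tpos\tdepth\nchr1\t100\t35\nchr2\t250\t12\n")
rows = list(read_tsv(sample, {"pos": int, "depth": int}))
```

Keeping the converters in a dictionary means the same mechanism could be mirrored for writing, with a second mapping that turns values back into strings.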

vcfFile, vcf

VCF file access. Depends on PyVCF, yet provides a convenient and flexible interface.
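
For readers unfamiliar with the format, the sketch below shows what parsing the eight fixed columns of a VCF data line involves. It uses only the standard library and is neither PyVCF's nor BioUtil's API; the function name parse_vcf_line is hypothetical, and genotype columns are ignored.

```python
def parse_vcf_line(line):
    """Parse the eight fixed columns of one VCF data line into a dict.

    Hypothetical helper for illustration; sample columns are not handled.
    """
    fields = line.rstrip("\n").split("\t")
    chrom, pos, vid, ref, alt, qual, flt, info = fields[:8]

    info_dict = {}
    if info != ".":
        for item in info.split(";"):
            key, sep, value = item.partition("=")
            # Keys without "=" are flags; store them as True.
            info_dict[key] = value if sep else True

    return {
        "CHROM": chrom,
        "POS": int(pos),  # 1-based position, as written in the file
        "ID": None if vid == "." else vid,
        "REF": ref,
        "ALT": [] if alt == "." else alt.split(","),
        "QUAL": None if qual == "." else float(qual),
        "FILTER": flt,
        "INFO": info_dict,
    }


record = parse_vcf_line("chr1\t100\trs1\tA\tG,T\t50\tPASS\tDP=10;DB\n")
```

A dedicated parser such as PyVCF additionally reads the header to type INFO values and decode genotypes, which this sketch leaves as strings.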

samFile, sam

SAM file access, based on pysam. pysam also provides an interface for tabix (random access to TSV files with genome positions), which can be accessed from BioUtil.sam.

fastqFile, fastaFile

FASTA/FASTQ file I/O, based on lh3's readfq.
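
The sketch below shows the general shape of a generator that reads both formats from one handle. It is a simplified illustration written for this description, not lh3's readfq and not the fastaFile/fastqFile API: the name read_sequences is hypothetical, and FASTQ records are assumed to use exactly four lines each.

```python
import io


def read_sequences(handle):
    """Yield (name, sequence, quality) tuples from a text handle.

    Quality is None for FASTA records. Assumes four-line FASTQ records.
    Hypothetical helper for illustration only.
    """
    line = handle.readline()
    while line:
        line = line.rstrip("\n")
        if line.startswith(">"):
            name = line[1:].split(" ", 1)[0]
            chunks = []
            line = handle.readline()
            # A FASTA sequence may span several lines; collect them until
            # the next header or the end of the file.
            while line and not line.startswith((">", "@")):
                chunks.append(line.strip())
                line = handle.readline()
            yield name, "".join(chunks), None
        elif line.startswith("@"):
            name = line[1:].split(" ", 1)[0]
            seq = handle.readline().strip()
            handle.readline()  # skip the "+" separator line
            qual = handle.readline().strip()
            yield name, seq, qual
            line = handle.readline()
        else:
            # Skip blank or unexpected lines between records.
            line = handle.readline()


fasta_records = list(read_sequences(io.StringIO(">seq1 demo\nACGT\nTTGA\n>seq2\nGG\n")))
fastq_records = list(read_sequences(io.StringIO("@r1\nACGT\n+\nIIII\n")))
```

Returning the same tuple shape for both formats lets downstream code ignore the file type and simply check whether the quality is None.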

cachedFasta

Fetch region sequences from a large FASTA file. This module is based on faidx through pysam's pysam.FastaFile. From v0.1.2, the old name fastaReader is deprecated because it is easily confused with the fastaFile reader.
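
To show why a faidx-style index makes region fetching cheap, here is a standard-library sketch of the underlying arithmetic. It is not cachedFasta or pysam code: build_index and fetch are hypothetical names, the index is built in memory rather than read from a .fai file, and every sequence is assumed to use a uniform line length (the same requirement faidx has). Coordinates are 0-based and half-open, matching the convention of pysam's fetch.

```python
import io


def build_index(handle):
    """Map each sequence name to [length, offset, line_bases, line_width].

    `handle` must be opened in binary mode. Hypothetical helper.
    """
    index = {}
    name = None
    pos = 0
    for raw in iter(handle.readline, b""):
        next_pos = pos + len(raw)
        if raw.startswith(b">"):
            name = (raw[1:].split() or [b""])[0].decode()
            # The sequence starts at the byte right after the header line.
            index[name] = [0, next_pos, 0, 0]
        elif name is not None and raw.strip():
            entry = index[name]
            bases = len(raw.strip())
            if entry[2] == 0:
                entry[2], entry[3] = bases, len(raw)
            entry[0] += bases
        pos = next_pos
    return index


def fetch(handle, index, name, start, end):
    """Return bases [start, end) of `name` by seeking, not scanning."""
    length, offset, line_bases, line_width = index[name]
    start, end = max(start, 0), min(end, length)
    if start >= end:
        return ""
    # Each full line before a base adds line_width bytes (bases + newline).
    first = offset + (start // line_bases) * line_width + start % line_bases
    last = offset + (end // line_bases) * line_width + end % line_bases
    handle.seek(first)
    chunk = handle.read(last - first)
    return chunk.replace(b"\n", b"").replace(b"\r", b"").decode()


genome = io.BytesIO(b">chr1 test\nACGT\nACGT\nAC\n>chr2\nTTTTGG\n")
fai = build_index(genome)
region = fetch(genome, fai, "chr1", 2, 9)
```

Because only the requested bytes are read, the cost of a fetch depends on the region size rather than on the size of the FASTA file.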

faidx

Experimental interface to pyfaidx.

Dependency

Change Log

v0.4

add logger class

v0.3

change fasta/fastq writer methods

v0.2

add fastqFile, rename fastaReader to cachedFasta

v0.1.1

add fastaReader

v0.1.0

initial release, supports xzFile, tsvFile, vcfFile, samFile and faidx

Authors

Yu XU <xuyu@genomics.cn>

License

This module is licensed under GPLv2.
