sjcab_peak2anno_db
Gene-backed peak-to-annotation BED resources for SJ/CAB workflows.
To keep the distribution small, the package bundles gene BED files and derives
further annotations from them. It can generate these derived annotations from
each bundled gene version:
- tss: 1 bp TSS intervals, strand-aware
- tes: 1 bp TES intervals, strand-aware
- deduplong: longest isoform per gene name, using column 4 as gene name and column 5 as isoform length
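The strand-aware 1 bp TSS and TES intervals can be sketched as follows. This is a minimal illustration, not the package's implementation: it assumes standard BED6 input (0-based, half-open coordinates, with strand in column 6), and the helper names `tss_interval` and `tes_interval` are hypothetical.

```python
def _parse(fields):
    """Pull chrom, start, end, and strand out of BED6-style fields (assumed layout)."""
    return fields[0], int(fields[1]), int(fields[2]), fields[5]


def tss_interval(fields):
    """Return the fields with start/end replaced by a 1 bp TSS interval."""
    chrom, start, end, strand = _parse(fields)
    # On the minus strand the transcript starts at the last base of the interval.
    pos = end - 1 if strand == "-" else start
    return [chrom, str(pos), str(pos + 1)] + list(fields[3:])


def tes_interval(fields):
    """Return the fields with start/end replaced by a 1 bp TES interval."""
    chrom, start, end, strand = _parse(fields)
    # On the minus strand the transcript ends at the first base of the interval.
    pos = start if strand == "-" else end - 1
    return [chrom, str(pos), str(pos + 1)] + list(fields[3:])


# Illustrative records only; the gene names and coordinates are made up.
gene_plus = ["chr1", "100", "200", "GENE_A", "100", "+"]
gene_minus = ["chr1", "300", "450", "GENE_B", "150", "-"]

for gene in (gene_plus, gene_minus):
    print("\t".join(tss_interval(gene)))
    print("\t".join(tes_interval(gene)))
```

The remaining columns are passed through unchanged, so the gene name and strand stay attached to each 1 bp interval.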
The package also bundles blacklist BED files. During install, each blacklist is
written under ~/.sjcab_peak2anno_db/blacklists as a dated
*.bed.20230411 file, with the current *.bed name refreshed as a symlink.
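The dated-file-plus-symlink pattern described above might look roughly like this. It is a sketch under assumptions: the function name `install_blacklist` and the blacklist name `hg38-blacklist` are hypothetical, and the package may handle existing files differently.

```python
import tempfile
from pathlib import Path


def install_blacklist(content, name, blacklist_dir, date_tag="20230411"):
    """Write a dated blacklist file and refresh the undated name as a symlink."""
    blacklist_dir = Path(blacklist_dir)
    blacklist_dir.mkdir(parents=True, exist_ok=True)

    dated = blacklist_dir / f"{name}.bed.{date_tag}"
    dated.write_text(content)

    link = blacklist_dir / f"{name}.bed"
    # Remove a stale symlink (even a dangling one) or file before re-creating it.
    if link.is_symlink() or link.exists():
        link.unlink()
    # A relative target keeps the link valid if the whole directory is moved.
    link.symlink_to(dated.name)
    return link


# Demonstrate in a throwaway directory rather than the real cache.
with tempfile.TemporaryDirectory() as tmp:
    link = install_blacklist("chr1\t0\t100\n", "hg38-blacklist", tmp)
    resolved_name = link.resolve().name
    link_text = link.read_text()
    print(link.name, "->", resolved_name)
```

Tools that read the stable `*.bed` name then follow the symlink to whichever dated file is current.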
CpG island (CGI) BED files for hg38, hg19, mm10, mm9, and mm39 are
also bundled and installed into ~/.sjcab_peak2anno_db/cgi.
Supported species:
- hg19
- hg38
- mm10
- mm9
- sacCer3
Install
From TestPyPI:
python -m pip install -i https://test.pypi.org/simple/ sjcab_peak2anno_db
From a local checkout:
python -m pip install .
Generate User Data
After package installation, generate all available versions under
~/.sjcab_peak2anno_db:
sjcab-peak2anno-db install
That command installs the packaged gene BEDs, blacklists, and CGI BEDs into the cache directory.
To use a different data directory, set SJCAB_PEAK2ANNO_DB_PATH:
export SJCAB_PEAK2ANNO_DB_PATH=/path/to/sjcab_peak2anno_db
sjcab-peak2anno-db install
You can also pass --data-dir for one command:
sjcab-peak2anno-db install --data-dir /path/to/sjcab_peak2anno_db
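One plausible way to resolve the data directory from these three sources is sketched below. The precedence shown (explicit `--data-dir` value, then `SJCAB_PEAK2ANNO_DB_PATH`, then `~/.sjcab_peak2anno_db`) is an assumption based on the description above, and `resolve_data_dir` is a hypothetical helper, not a documented function of the package.

```python
import os
from pathlib import Path

ENV_VAR = "SJCAB_PEAK2ANNO_DB_PATH"
DEFAULT_DIR = Path.home() / ".sjcab_peak2anno_db"


def resolve_data_dir(cli_data_dir=None, environ=None):
    """Pick the data directory: CLI value, then environment, then default."""
    environ = os.environ if environ is None else environ
    if cli_data_dir:
        # Assumed: an explicit per-command value overrides everything else.
        return Path(cli_data_dir).expanduser()
    env_value = environ.get(ENV_VAR)
    if env_value:
        return Path(env_value).expanduser()
    return DEFAULT_DIR


print(resolve_data_dir("/tmp/a", {ENV_VAR: "/tmp/b"}))
print(resolve_data_dir(None, {ENV_VAR: "/tmp/b"}))
print(resolve_data_dir(None, {}))
```

Passing the environment in as a plain mapping keeps the helper easy to check without touching the real shell environment.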
Generated files are written as:
{data_dir}/{species}/{annotation}/{version}.bed
{data_dir}/{species}/{annotation}/default.bed
{data_dir}/blacklists/{name}.bed.20230411
{data_dir}/blacklists/{name}.bed
default.bed points to the latest parsed version for that species and
annotation.
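The layout above can be expressed as small path helpers. This is only a sketch: `annotation_path` and `blacklist_path` are hypothetical names, the version string `v1` and blacklist name `example` are placeholders, and for lookups inside the package `db.path(...)` is presumably the better choice.

```python
from pathlib import Path


def annotation_path(data_dir, species, annotation, version=None):
    """Build {data_dir}/{species}/{annotation}/{version}.bed (default.bed if no version)."""
    name = "default" if version is None else version
    return Path(data_dir) / species / annotation / f"{name}.bed"


def blacklist_path(data_dir, name, date_tag=None):
    """Build the blacklist path, with the dated suffix when a date tag is given."""
    suffix = f".{date_tag}" if date_tag else ""
    return Path(data_dir) / "blacklists" / f"{name}.bed{suffix}"


print(annotation_path("/data", "hg38", "tss").as_posix())
print(annotation_path("/data", "hg38", "tss", "v1").as_posix())
print(blacklist_path("/data", "example", "20230411").as_posix())
```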
Download UCSC CpG island BED files for hg38, hg19, mm10, mm9, and
mm39:
sjcab-peak2anno-db download-cgi
CGI files are written as:
{data_dir}/cgi/{species}_cgi.bed
Python Usage
import sjcab_peak2anno_db as db
print(db.supported_species())
print(db.versions("hg38", "gene"))
print(db.default_version("hg19", "tss"))
db.install_data()
print(db.path("hg38", "tss"))
Generate one derived file manually:
import sjcab_peak2anno_db as db
db.write_tss("genes.bed", "genes.tss.bed")
db.write_tes("genes.bed", "genes.tes.bed")
db.write_deduplong("genes.bed", "genes.deduplong.bed")
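To illustrate what a deduplong-style pass does, here is a minimal sketch that keeps the longest isoform per gene name, reading the gene name from column 4 and the isoform length from column 5 as described earlier. The function `dedup_longest` is hypothetical; how the package breaks ties or orders its output is not stated, so this version simply keeps the first row seen on a tie and preserves first-seen gene order.

```python
def dedup_longest(rows):
    """Keep one row per gene name (column 4), preferring the largest column 5."""
    best = {}
    for row in rows:
        name, length = row[3], int(row[4])
        # Strict '>' means the earliest row wins when lengths are equal.
        if name not in best or length > int(best[name][4]):
            best[name] = row
    # Dicts preserve insertion order, so genes come out in first-seen order.
    return list(best.values())


# Illustrative records only; the names and coordinates are made up.
rows = [
    ["chr1", "100", "500", "GENE_A", "400", "+"],
    ["chr1", "100", "900", "GENE_A", "800", "+"],
    ["chr2", "50", "250", "GENE_B", "200", "-"],
]

for row in dedup_longest(rows):
    print("\t".join(row))
```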
Command Line
sjcab-peak2anno-db list
sjcab-peak2anno-db install
sjcab-peak2anno-db install-blacklists
sjcab-peak2anno-db download-cgi
sjcab-peak2anno-db update
sjcab-peak2anno-db path hg38 gene
sjcab-peak2anno-db path hg38 tss --install
File details
Details for the file sjcab_peak2anno_db-0.1.1.tar.gz.
File metadata
- Download URL: sjcab_peak2anno_db-0.1.1.tar.gz
- Upload date:
- Size: 22.1 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.10.20
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 4afc865f30f565e3fdd5958c9878f29ccfba1422473f4f5f7afb3799059469fa |
| MD5 | 8d138ef3fda112b1b8ca850c50530390 |
| BLAKE2b-256 | 2db4b4520f7a72e4333515f9b5fb9ec254aafec4901f4d8b25bf81470e93d39b |
File details
Details for the file sjcab_peak2anno_db-0.1.1-py3-none-any.whl.
File metadata
- Download URL: sjcab_peak2anno_db-0.1.1-py3-none-any.whl
- Upload date:
- Size: 22.4 MB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.10.20
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | fa1aa8623a68ae6fff4f4e3ce143028e0b07a002663bee661a2a7c6a951421b4 |
| MD5 | 8150e8fc1e579bdcb3bbf0720de41d0a |
| BLAKE2b-256 | caa5faabd0069ae0fedf40d881f418a903169c222628ce014f003d7d48ebade9 |