Genotations - python library to work with genomes and primers
Genotations
Python library for working with genomes and annotations, mostly Ensembl genomes. It also supports visualization of transcript and gene features, and primer selection. Because pandas and polars are everyday tools for many Python developers, this library focuses on representing annotations as dataframes.
The library allows:
- downloading Ensembl annotations and genomes (uses genomepy under the hood)
- working with genomic annotations as polars dataframes
- getting sequences for selected genes
- visualizing gene features
- designing primers for selected transcripts with the Primer3 Python wrapper
Usage
Install with pip:
pip install genotations
Now you can start using it, for example:
from genotations import ensembl
human = ensembl.human # getting human genome
mouse = ensembl.mouse # getting mouse genome
mouse.annotations.exons().annotations_df # getting exons as DataFrame
mouse.annotations.protein_coding().exons().annotations_df # getting exons of protein coding genes
mouse.annotations.transcript_gene_names_df # getting transcript gene names
mouse.annotations.with_gene_name_contains("Foxo1").protein_coding().transcripts() # getting only coding Foxo1 transcripts
mouse.annotations.with_gene_name_contains("Foxo1").genes_visual(mouse.genome)[0].plot() # plotting features of the Foxo1 gene
cow_assemblies = ensembl.search_assemblies("Bos taurus") # you can also search genomes by species name if it exists in Ensembl
cow1 = ensembl.SpeciesInfo("Cow", cow_assemblies[-1][0]) # selecting one of several cow assemblies
cow1.annotations.annotations_df # getting annotations as dataframe
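Ensembl distributes annotations as GTF files, where each line has nine tab-separated fields and the last field packs key/value attributes such as gene_name. The following is a minimal, standard-library sketch of how one such line can be flattened into a record with one entry per column. It is not the library's own parser; the helper names (`parse_gtf_attributes`, `parse_gtf_line`) and the sample line are made up for illustration, and the assumption that `annotations_df` columns mirror these fields is not confirmed by this README.

```python
# Minimal sketch: parse one GTF line into a flat record (stdlib only).
# The sample line is invented for illustration; it is not real Ensembl data.

GTF_COLUMNS = ["seqname", "source", "feature", "start", "end",
               "score", "strand", "frame", "attribute"]


def parse_gtf_attributes(attribute_field: str) -> dict:
    """Turn 'key "value"; key2 "value2";' into a dict.

    Simplification: values containing ';' and repeated keys are not handled.
    """
    attributes = {}
    for chunk in attribute_field.strip().split(";"):
        chunk = chunk.strip()
        if not chunk:
            continue
        key, _, value = chunk.partition(" ")
        attributes[key] = value.strip().strip('"')
    return attributes


def parse_gtf_line(line: str) -> dict:
    """Split the nine tab-separated GTF fields and expand the attributes."""
    fields = line.rstrip("\n").split("\t")
    record = dict(zip(GTF_COLUMNS[:-1], fields[:8]))
    record["start"] = int(record["start"])  # GTF coordinates are 1-based
    record["end"] = int(record["end"])      # and the end is inclusive
    record.update(parse_gtf_attributes(fields[8]))
    return record


example_line = (
    "1\tensembl\texon\t100\t250\t.\t+\t.\t"
    'gene_id "GENE0001"; transcript_id "TX0001"; '
    'gene_name "Foxo1"; gene_biotype "protein_coding";'
)
record = parse_gtf_line(example_line)
exon_length = record["end"] - record["start"] + 1
print(record["feature"], record["gene_name"], exon_length)
```

Once the attributes are expanded into columns like this, filters such as "exons of protein coding genes" reduce to ordinary row selections on `feature` and `gene_biotype`.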
You can also use the library to annotate existing gene expression data with gene and transcript symbols and features. For example:
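from pathlib import Path
import polars as pl
from genotations.quantification import *
from genotations import ensembl
base = Path(".") # a Path (not a str) is needed for the "/" joins below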
examples = base / "examples"
data = examples / "data"
expressions = pl.read_parquet(str(data / "PRJNA543661_transcripts.parquet"))
with_expressions_summaries(expressions, min_avg_value = 1)
expressions_ext = ensembl.mouse.annotations.extend_with_annotations_and_sequences(expressions, ensembl.mouse.genome) # extend expression data with annotations and sequences
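The README does not spell out what `min_avg_value` does. One plausible reading is that rows whose average expression across samples falls below the threshold are dropped. The sketch below illustrates that idea with plain Python on invented data; the function `add_average_and_filter`, the `avg` column name, and the choice of `>=` for the comparison are all assumptions, not the behavior of `with_expressions_summaries`.

```python
from statistics import mean

# Invented expression table: one row per transcript, one value per sample.
expressions = [
    {"transcript": "TX0001", "sample_a": 0.0, "sample_b": 0.5},
    {"transcript": "TX0002", "sample_a": 3.0, "sample_b": 5.0},
    {"transcript": "TX0003", "sample_a": 1.0, "sample_b": 1.0},
]


def add_average_and_filter(rows, id_column="transcript", min_avg_value=1.0):
    """Add an 'avg' entry per row and keep rows with avg >= min_avg_value.

    Hypothetical helper: every column except id_column is treated as a sample.
    """
    kept_rows = []
    for row in rows:
        sample_values = [value for key, value in row.items() if key != id_column]
        average = mean(sample_values)
        if average >= min_avg_value:
            kept_rows.append({**row, "avg": average})
    return kept_rows


kept = add_average_and_filter(expressions, min_avg_value=1)
for row in kept:
    print(row["transcript"], row["avg"])
```

Filtering on an average before joining annotations and sequences keeps the extended table smaller, which matters when sequences are attached to every remaining row.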
For more examples, see the example notebook, which demonstrates the usage and the API.
Working with the library code
Use micromamba (or conda) and environment.yaml to install the dependencies:
micromamba create -f environment.yaml
micromamba activate genotations