pyranges_plot
Gene visualization package for dataframe objects generated with PyRanges.
Overview
pyranges_plot plots a series of genes contained in a dataframe obtained from a PyRanges object. Each gene is drawn in the subplot of its corresponding chromosome. The user can choose whether the plot is based on Matplotlib or Plotly by setting the engine. Not every gene in the dataframe is necessarily plotted: by default only 25 genes are shown, but this number can be customized. The order of the genes in the dataframe is maintained in the plot.
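As a rough illustration of what "showing the first N genes while keeping their order" means, the selection can be sketched in plain Python. This is not the package's implementation; the helper name first_n_genes and the list-of-dicts input are assumptions made only for this example.

```python
def first_n_genes(rows, max_ngenes=25, id_key="gene_id"):
    """Keep the rows of the first `max_ngenes` distinct gene IDs, in input order.

    Hypothetical helper for illustration only; not part of pyranges_plot.
    """
    kept_ids = set()
    selected = []
    for row in rows:
        gene = row[id_key]
        if gene not in kept_ids:
            if len(kept_ids) >= max_ngenes:
                # Gene limit reached: skip rows of any further, unseen gene.
                continue
            kept_ids.add(gene)
        # Rows of already accepted genes are always kept (e.g. several exons).
        selected.append(row)
    return selected


rows = [
    {"gene_id": "g1", "Start": 10, "End": 50},
    {"gene_id": "g2", "Start": 80, "End": 120},
    {"gene_id": "g1", "Start": 60, "End": 70},
    {"gene_id": "g3", "Start": 200, "End": 260},
]
subset = first_n_genes(rows, max_ngenes=2)
```

Here subset keeps both rows of g1 and the row of g2, in their original order, and drops g3.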
pyranges_plot offers flexible coloring options. By default genes are colored by gene ID, but the data feature (column) used for coloring, the “color column”, can be selected manually. Colors can be left as the default colormap or provided as dictionaries, lists, or color objects from either Matplotlib or Plotly, regardless of the chosen engine. When a colormap or list of colors is given, the colors are cycled over the values of the color column. When explicit assignments are given, such as a dictionary, genes are colored accordingly, while genes not covered by it should fall back to black.
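The two coloring modes described above can be sketched with the standard library alone. This is a minimal sketch under stated assumptions, not the package's code: the function name assign_colors is hypothetical, and the black fallback for values missing from a dictionary is taken from the description above rather than verified against the library.

```python
from itertools import cycle


def assign_colors(values, colors, fallback="black"):
    """Map each distinct value of the color column to a color.

    Hypothetical helper for illustration only; not part of pyranges_plot.
    """
    # Distinct values in first-seen order.
    unique_values = list(dict.fromkeys(values))
    if isinstance(colors, dict):
        # Explicit assignments: unlisted values get the fallback color.
        return {value: colors.get(value, fallback) for value in unique_values}
    # List of colors: cycle through them, repeating when they run out.
    return dict(zip(unique_values, cycle(colors)))


by_dict = assign_colors(["+", "-", "+"], {"+": "green"})
by_list = assign_colors(["a", "b", "c"], ["red", "blue"])
```

With a dictionary, the unlisted "-" value receives the fallback color; with a two-color list, the third value wraps around to the first color again.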
Installation
PyRanges-Plot can be installed using pip:
pip install pyranges-plot
Examples
Next we will test the pyranges_plot visualization options using data provided in the PyRanges tutorial. Download and unpack the tutorial data with:
curl -O https://mariottigenomicslab.bio.ub.edu/pyranges_data/pyranges_tutorial_data.tar.gz
tar zxf pyranges_tutorial_data.tar.gz
Once the data files are in the working directory, start Python to load the data and subset it into a dataframe.
import pyranges as pr
import pyranges_plot as prplot
import pandas as pd
ann = pr.read_gff3('Dgyro_annotation.gff')
ann = ann[['ID']]  # keep the coordinates plus the ID column
df = ann.df
df = df.rename(columns={'ID': 'gene_id'})  # rename the ID column to the standard name
With some example data in the variable df, we can start exploring the pyranges_plot options. We can get our plot in a single line:
prplot.plot_exons(df, engine='plt')
This Matplotlib plot shows the first 25 genes in the dataframe; we only need to provide the data and the engine. The engine can also be set beforehand, so that it no longer needs to be specified when plotting:
# Use 'plotly' or 'ply' for Plotly plots and 'matplotlib' or 'plt' for Matplotlib plots
prplot.set_engine('plotly')
prplot.plot_exons(df)
The plot looks the same as the previous one, but in this case it is a Plotly plot. Note that both libraries provide interactive zoom options.
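The engine names listed in the comment above ('plotly' or 'ply', 'matplotlib' or 'plt') can be thought of as aliases for two engines. The following sketch shows one way such aliases could be resolved; the function normalize_engine and the error message are hypothetical and do not describe how pyranges_plot handles this internally.

```python
# Alias table built from the engine names mentioned in this README.
ENGINE_ALIASES = {
    "plt": "matplotlib",
    "matplotlib": "matplotlib",
    "ply": "plotly",
    "plotly": "plotly",
}


def normalize_engine(name):
    """Return the canonical engine name for an alias (hypothetical helper)."""
    try:
        return ENGINE_ALIASES[name.strip().lower()]
    except KeyError:
        raise ValueError(
            f"Unknown engine {name!r}; expected one of {sorted(ENGINE_ALIASES)}"
        ) from None


engine = normalize_engine("plt")
```

Resolving the alias once, when the engine is set, means the plotting code only has to distinguish two canonical names.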
Next we color the genes by strand, providing a dictionary for the colors. To do so, we subset the dataframe to 15 genes from each strand. To see all of those genes we set max_ngenes to 30; since this is more than 25 genes, a warning will appear:
df2 = pd.concat([df.loc[df['Strand'] == '+'].head(15), df.loc[df['Strand'] == '-'].head(15)])
prplot.plot_exons(df2, max_ngenes=30, color_column='Strand', colormap={'+': 'green', '-': 'red'})
Some appearance features can also be customized. Default variables are changed with the set_default function. The background color, the plot border color, or the title color can be customized as follows:
prplot.set_default('plot_background', 'black')
prplot.set_default('plot_border', 'lightblue')
prplot.set_default('title_dict_ply.color', 'magenta')
prplot.plot_exons(df)
Coming soon
- Bases will be displayed along coordinates
- Colorblind friendly