pyranges_plot

Gene visualization package for dataframe objects generated with PyRanges.

Overview

The goal is to plot a series of genes contained in a dataframe obtained from a PyRanges object. Each gene is displayed in the subplot of its corresponding chromosome. The user can choose whether the plot is based on Matplotlib or Plotly by setting the engine. The plot does not necessarily contain all the genes in the dataframe: by default it shows 25 genes, but this number can be customized. The order of the genes in the dataframe is maintained.
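The gene limit and order preservation described above can be pictured with a small pure-Python sketch. This is only an illustration of the idea, not the package's actual code; the helper name `select_first_genes` and the example gene IDs are hypothetical.

```python
def select_first_genes(gene_ids, max_ngenes=25):
    """Return up to `max_ngenes` distinct gene IDs, in order of first appearance.

    Hypothetical helper for illustration; not part of pyranges_plot.
    """
    # dict.fromkeys keeps insertion order and drops repeated IDs
    # (a gene usually spans several rows, for example one per exon).
    unique_in_order = list(dict.fromkeys(gene_ids))
    return unique_in_order[:max_ngenes]


# Made-up input where gene IDs repeat across rows.
rows = ["geneB", "geneB", "geneA", "geneC", "geneA"]
print(select_first_genes(rows, max_ngenes=2))  # ['geneB', 'geneA']
```

Genes are not sorted here; the ones that appear first in the input are the ones kept.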

Pyranges plot is versatile in its coloring options. By default, genes are colored according to the gene ID, but this “color column” can be selected manually. Colors can be left as the default colormap or be provided as dictionaries, lists, or color objects from either Matplotlib or Plotly, regardless of the chosen engine. When a colormap or a list of colors is specified, the gene colors iterate over the given colors following the pattern of the color column. When explicit color instructions such as a dictionary are given, the genes are colored accordingly, while the unspecified ones are colored black (??).
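As a rough illustration of the two coloring modes described above, the following sketch maps the distinct values of a color column to colors. It is an assumption-laden model, not the package's implementation: `assign_colors` is a hypothetical helper, and the black fallback simply mirrors the tentative statement above.

```python
from itertools import cycle


def assign_colors(values, colors, fallback="black"):
    """Map each distinct value of a color column to a color.

    Hypothetical helper for illustration; not part of pyranges_plot.
    - list or tuple: colors are cycled over the distinct values,
      in order of first appearance
    - dict: used directly; values missing from it get `fallback`
      (assumed to be black here)
    """
    distinct = list(dict.fromkeys(values))
    if isinstance(colors, dict):
        return {v: colors.get(v, fallback) for v in distinct}
    palette = cycle(colors)
    return {v: next(palette) for v in distinct}


strands = ["+", "-", "+", "."]
print(assign_colors(strands, ["green", "red"]))
# {'+': 'green', '-': 'red', '.': 'green'}
print(assign_colors(strands, {"+": "green", "-": "red"}))
# {'+': 'green', '-': 'red', '.': 'black'}
```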

Installation

PyRanges-Plot can be installed using pip:

pip install pyranges-plot

Examples

Next we will test the visualization options of pyranges_plot using some data provided in the PyRanges tutorial. Download and unpack the tutorial data with:

curl -O https://mariottigenomicslab.bio.ub.edu/pyranges_data/pyranges_tutorial_data.tar.gz
tar zxf pyranges_tutorial_data.tar.gz

Once the data files are in our working directory, we start Python to load the data and subset it into a dataframe.

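import pyranges as pr
import pyranges_plot as prplot
import pandas as pd
# Load the tutorial annotation and keep only the ID column
ann = pr.read_gff3('Dgyro_annotation.gff')
ann = ann[['ID']]
# Convert to a pandas dataframe
df = ann.df
# Rename the ID column to the standard gene_id name
df = df.rename(columns={'ID': 'gene_id'})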

With some example data in the variable df, we can start exploring the pyranges_plot options. We can get our plot in a single line:

prplot.plot_exons(df, engine='plt')

This plot shows the top 25 genes of the dataframe in a Matplotlib plot; we only need to provide the data and the engine. However, the engine can be set beforehand so that it does not need to be specified in each plotting call:

# Use 'plotly' or 'ply' for Plotly plots and 'matplotlib' or 'plt' for Matplotlib plots
prplot.set_engine('plotly')
prplot.plot_exons(df)

The plot looks the same as the previous one, but in this case it is a Plotly plot. Note that both libraries offer interactive zoom options.
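The engine names accepted above ('plotly' or 'ply', 'matplotlib' or 'plt') can be thought of as aliases for two backends. The sketch below shows one way such aliases could be resolved. It is a hypothetical model for illustration, not the package's code: `ENGINE_ALIASES` and `normalize_engine` are invented names, and the case-insensitive matching is an added assumption.

```python
# Alias table taken from the engine names listed in this README.
ENGINE_ALIASES = {
    "plotly": "plotly",
    "ply": "plotly",
    "matplotlib": "matplotlib",
    "plt": "matplotlib",
}


def normalize_engine(name):
    """Return the canonical engine name for an alias (hypothetical helper)."""
    if not isinstance(name, str):
        # Passing a bare name such as plt instead of 'plt' ends up here
        # (or raises NameError earlier if the name is undefined).
        raise TypeError("engine must be given as a string, e.g. 'plt'")
    try:
        return ENGINE_ALIASES[name.lower()]
    except KeyError:
        valid = ", ".join(sorted(ENGINE_ALIASES))
        raise ValueError(f"unknown engine {name!r}; expected one of: {valid}") from None


print(normalize_engine("ply"))  # plotly
print(normalize_engine("plt"))  # matplotlib
```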

We can also color the genes according to strand by providing a dictionary of colors. For that, we subset the dataframe to keep 15 genes from each strand. To see all of those genes we set max_ngenes to 30; since this is more than 25 genes, a warning will appear:

df2 = pd.concat([df.loc[df['Strand'] == '+'].head(15), df.loc[df['Strand'] == '-'].head(15)])
prplot.plot_exons(df2, max_ngenes=30, color_column='Strand', colormap={'+': 'green', '-': 'red'})

Some appearance features can also be customized. Default values are changed with the set_default function. For example, the background color, the plot border color, and the title color can be customized as follows:

prplot.set_default('plot_background', 'black')
prplot.set_default('plot_border', 'lightblue')
prplot.set_default('title_dict_ply.color', 'magenta')
prplot.plot_exons(df)
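The dotted name 'title_dict_ply.color' suggests that some defaults are nested, with the part after the dot naming an entry inside a dictionary. How the package stores its defaults is not documented here, so the following is only a sketch of that reading: `set_nested_default` and the starting values in `defaults` are hypothetical.

```python
def set_nested_default(defaults, key, value):
    """Set `value` at a possibly dotted `key` inside a nested dict.

    Hypothetical helper for illustration; not part of pyranges_plot.
    """
    *parents, leaf = key.split(".")
    target = defaults
    for name in parents:
        # Walk down one level per dotted component, creating it if missing.
        target = target.setdefault(name, {})
    target[leaf] = value


# Made-up starting values, for illustration only.
defaults = {
    "plot_background": "white",
    "title_dict_ply": {"color": "black", "size": 18},
}
set_nested_default(defaults, "plot_background", "black")
set_nested_default(defaults, "title_dict_ply.color", "magenta")
print(defaults)
# {'plot_background': 'black', 'title_dict_ply': {'color': 'magenta', 'size': 18}}
```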

Coming soon

  • Bases will be displayed along coordinates
  • Colorblind friendly
