
Plot features from DNA sequences (e.g. GenBank) with Python

Dna Features Viewer is a Python library to (wait for it…) visualize DNA features, e.g. from GenBank or GFF files, using the plotting library Matplotlib:

https://raw.githubusercontent.com/Edinburgh-Genome-Foundry/DnaFeaturesViewer/master/examples/by_hand.png

Dna Features Viewer is fairly minimal (<200 lines of code) but can display sequences with lots of overlapping features and long labels, without getting too messy. The plots can be output to many different formats (PNG, JPEG, SVG, PDF).

License

Dna Features Viewer is open-source software originally written at the Edinburgh Genome Foundry by Zulko and released under the MIT licence. Everyone is welcome to contribute!

Installation

Dna Features Viewer can be installed by unzipping the source code into a directory and running this command from that directory:

sudo python setup.py install

A pip install is coming soon!

Examples of use

Defining the features by hand

In this first example we define features “by hand”:

from dna_features_viewer import GraphicFeature, GraphicRecord
features=[
    GraphicFeature(start=0, end=20, strand=+1, color="#ffd700",
                   label="Small feature"),
    GraphicFeature(start=20, end=500, strand=+1, color="#ffcccc",
                   label="Gene 1 with a very long name"),
    GraphicFeature(start=400, end=700, strand=-1, color="#cffccc",
                   label="Gene 2"),
    GraphicFeature(start=600, end=900, strand=+1, color="#ccccff",
                   label="Gene 3")
]
record = GraphicRecord(sequence_length=1000, features=features)
record.plot(fig_width=5)
https://raw.githubusercontent.com/Edinburgh-Genome-Foundry/DnaFeaturesViewer/master/examples/by_hand.png
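
In the example above, "Gene 2" overlaps both "Gene 1 with a very long name" and "Gene 3", so the three cannot all be drawn on the same line. The sketch below illustrates one common way to stack overlapping intervals on separate levels. It is not taken from the library's source: `assign_levels` is a hypothetical helper, and it ignores label widths, which a real layout also has to account for.

```python
def assign_levels(intervals):
    """Greedily place each (start, end) interval on the lowest free level.

    Hypothetical helper for illustration only; not part of dna_features_viewer.
    """
    level_ends = []  # level_ends[k] = end of the last interval placed on level k
    levels = {}
    for start, end in sorted(intervals):
        for level, last_end in enumerate(level_ends):
            if last_end <= start:  # no overlap with anything on this level
                level_ends[level] = end
                levels[(start, end)] = level
                break
        else:
            # Every existing level is occupied at this position: open a new one.
            level_ends.append(end)
            levels[(start, end)] = len(level_ends) - 1
    return levels


# Same coordinates as the hand-defined features above.
levels = assign_levels([(0, 20), (20, 500), (400, 700), (600, 900)])
```

With these coordinates, only the (400, 700) interval is pushed to a second level; the other three fit on the first one.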

Reading the features from a GenBank file

DnaFeaturesViewer plays nice with BioPython. As a result it is super easy to plot the content of a GenBank file:

from dna_features_viewer import GraphicRecord
from Bio import SeqIO
with open("./plasmid.gb", "r") as f:
    record = SeqIO.read(f, "genbank")
graphic_record = GraphicRecord.from_biopython_record(record)
graphic_record.plot(fig_width=10)
https://raw.githubusercontent.com/Edinburgh-Genome-Foundry/DnaFeaturesViewer/master/examples/from_genbank.png

Displaying the features along with other plots

As it uses Matplotlib, Dna Features Viewer can display the features on top of other sequence statistics, such as the local GC content:

import matplotlib.pyplot as plt
from dna_features_viewer import GraphicRecord
from Bio import SeqIO
import numpy as np

figure_width = 10
fig, (ax1, ax2) = plt.subplots(2,1, figsize=(figure_width,5), sharex=True)

# Parse the genbank file, plot annotations
with open("./plasmid.gb", "r") as f:
    record = SeqIO.read(f, "genbank")
graphic_record = GraphicRecord.from_biopython_record(record)
_, max_y = graphic_record.plot(ax=ax1, with_ruler=False)

# Plot the local GC content
def plot_local_gc_content(record, window_size, ax):
    gc_content = lambda s: 1.0*len([c for c in s if c in "GC"]) / len(s)
    yy = [gc_content(record.seq[i:i+window_size])
          for i in range(len(record.seq)-window_size)]
    # Shift x by half a window so each value is plotted at its window's center
    xx = np.arange(len(record.seq)-window_size) + window_size // 2
    ax.fill_between(xx, yy, alpha=0.3)
plot_local_gc_content(record, window_size=50, ax=ax2)

# Resize the figure
fig.set_size_inches(figure_width, 2 + 0.4*(max_y+2))
https://raw.githubusercontent.com/Edinburgh-Genome-Foundry/DnaFeaturesViewer/master/examples/with_plot.png
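
To make the sliding-window computation above easier to check by hand, here is a minimal pure-Python sketch run on a short sequence. The names `local_gc_content` and `window_centers` are illustrative, not part of the library. Unlike the plotting code above, this version also includes the last full window (hence the `+ 1` in the range), and it assumes Python 3 division.

```python
def local_gc_content(sequence, window_size):
    """Return the GC fraction of every full window of `window_size` bases."""
    sequence = sequence.upper()
    if window_size <= 0 or window_size > len(sequence):
        raise ValueError("window_size must be between 1 and len(sequence)")
    return [
        sum(base in "GC" for base in sequence[i:i + window_size]) / window_size
        for i in range(len(sequence) - window_size + 1)
    ]


def window_centers(sequence_length, window_size):
    """Return the x coordinate of the center of each full window."""
    return [i + window_size / 2
            for i in range(sequence_length - window_size + 1)]


profile = local_gc_content("ATGCGCAT", window_size=4)
centers = window_centers(len("ATGCGCAT"), window_size=4)
# profile → [0.5, 0.75, 1.0, 0.75, 0.5]
# centers → [2.0, 3.0, 4.0, 5.0, 6.0]
```

Plotting `profile` against `centers` puts each value at the middle of its window, which is what the offset of 25 does for `window_size=50` in the example above.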

Dna Features Viewer is pretty minimal in terms of features but easily extensible since it uses Matplotlib as a backend.

Bonus

As a bonus, here is what to expect when you feed it a pathologically annotated GenBank file:

https://raw.githubusercontent.com/Edinburgh-Genome-Foundry/DnaFeaturesViewer/master/examples/example_overloaded.png
