Make simple, pretty Sankey Diagrams as a matplotlib object

pySankey2

Uses matplotlib to create simple Sankey diagrams flowing from left to right.

A fork of a fork of pySankey.

License: GPL v3

Requirements

Requires python-tk (for Python 2.7) or python3-tk (for Python 3.x). Install it with apt-get or your package manager.

You can install the other requirements with:

    pip install -r requirements.txt

Examples

With fruits.txt:

     true       predicted
0    blueberry  orange
1    lime       orange
2    blueberry  lime
3    apple      orange
...  ...        ...
996  lime       orange
997  blueberry  orange
998  orange     banana
999  apple      lime

1000 rows × 2 columns

You can generate a Sankey diagram with this code (colorDict is optional):

import pandas as pd
import matplotlib.pyplot as plt

from pysankey import sankey

df = pd.read_csv(
    'pysankey/tests/fruits.txt', sep=' ', names=['true', 'predicted']
)
colorDict = {
    'apple':'#f71b1b',
    'blueberry':'#1b7ef7',
    'banana':'#f3f71b',
    'lime':'#12e23f',
    'orange':'#f78c1b',
    'kiwi':'#9BD937'
}

ax = sankey(
    df['true'], df['predicted'], aspect=20, colorDict=colorDict,
    leftLabels=['banana','orange','blueberry','apple','lime'],
    rightLabels=['orange','banana','blueberry','apple','lime','kiwi'],
    fontsize=12
)

plt.savefig('fruit.png', bbox_inches='tight') # to save; do this before plt.show(), which can leave the figure empty
plt.show() # to display
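
It may help to see what the diagram encodes before plotting it. The sketch below uses only the standard library and the eight rows visible in the fruits.txt preview (not the full file). It counts each (true, predicted) pair, which is conceptually the quantity a strip's width represents when no weights are given, and it lists labels that have no entry in the color dictionary. The helper `missing_colors` is hypothetical and not part of pysankey; whether the library itself requires a color for every label is an assumption to verify.

```python
from collections import Counter

# The eight rows visible in the fruits.txt preview (not the full 1000 rows).
rows = [
    ("blueberry", "orange"),
    ("lime", "orange"),
    ("blueberry", "lime"),
    ("apple", "orange"),
    ("lime", "orange"),
    ("blueberry", "orange"),
    ("orange", "banana"),
    ("apple", "lime"),
]

color_dict = {
    "apple": "#f71b1b",
    "blueberry": "#1b7ef7",
    "banana": "#f3f71b",
    "lime": "#12e23f",
    "orange": "#f78c1b",
}

# One entry per (left, right) pair; the count stands in for the strip width.
flows = Counter(rows)


def missing_colors(pairs, colors):
    """Hypothetical helper: labels used in the data that have no color."""
    labels = {label for pair in pairs for label in pair}
    return sorted(labels - set(colors))


for (left, right), count in flows.most_common():
    print(f"{left:>9} -> {right:<6} {count}")
print("labels without a color:", missing_colors(rows, color_dict))
```

Running a check like this before calling sankey() makes a missing or misspelled label easy to spot.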

With customers-goods.csv:

,customer,good,revenue
0,John,fruit,5.5
1,Mike,meat,11.0
2,Betty,drinks,7.0
3,Ben,fruit,4.0
4,Betty,bread,2.0
5,John,bread,2.5
6,John,drinks,8.0
7,Ben,bread,2.0
8,Mike,bread,3.5
9,John,meat,13.0

You can also weight the flows, here by revenue:

import pandas as pd
import matplotlib.pyplot as plt

from pysankey import sankey

df = pd.read_csv(
    'pysankey/tests/customers-goods.csv', sep=',',
    names=['id', 'customer', 'good', 'revenue']
)
weight = df['revenue'].values[1:].astype(float)

ax = sankey(
      left=df['customer'].values[1:], right=df['good'].values[1:],
      rightWeight=weight, leftWeight=weight, aspect=20, fontsize=20
)

plt.savefig('customers-goods.png', bbox_inches='tight') # to save; do this before plt.show(), which can leave the figure empty
plt.show() # to display
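
With weights, a strip reflects revenue rather than a row count. The following standard-library sketch shows that aggregation on the customers-goods.csv rows listed earlier, inlined here so it runs on its own. It assumes the weights are simply summed per (customer, good) pair, which is how a Sankey diagram is normally read but is not taken from the pysankey source. Reading the header with `csv.DictReader` also avoids slicing off the first row by hand.

```python
import csv
import io
from collections import defaultdict

# Same rows as the customers-goods.csv listing, inlined for the sketch.
CSV_TEXT = """\
,customer,good,revenue
0,John,fruit,5.5
1,Mike,meat,11.0
2,Betty,drinks,7.0
3,Ben,fruit,4.0
4,Betty,bread,2.0
5,John,bread,2.5
6,John,drinks,8.0
7,Ben,bread,2.0
8,Mike,bread,3.5
9,John,meat,13.0
"""

# DictReader consumes the header row and uses it as the keys of each row.
reader = csv.DictReader(io.StringIO(CSV_TEXT))

flow = defaultdict(float)          # (customer, good) -> summed revenue
per_customer = defaultdict(float)  # total on the left side of the diagram
per_good = defaultdict(float)      # total on the right side of the diagram

for row in reader:
    revenue = float(row["revenue"])
    flow[(row["customer"], row["good"])] += revenue
    per_customer[row["customer"]] += revenue
    per_good[row["good"]] += revenue

print(dict(per_customer))
print(dict(per_good))
```

Both sides sum to the same total, which is why the same weight array is passed as leftWeight and rightWeight in the example above.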

Similar to seaborn, you can pass a matplotlib Axes to the sankey function with the keyword ax=:

import pandas as pd
import matplotlib.pyplot as plt

from pysankey import sankey

df = pd.read_csv(
        'pysankey/tests/fruits.txt',
        sep=' ', names=['true', 'predicted']
)
colorDict = {
    'apple': '#f71b1b',
    'blueberry': '#1b7ef7',
    'banana': '#f3f71b',
    'lime': '#12e23f',
    'orange': '#f78c1b'
}

ax1 = plt.axes()

ax1 = sankey(
      df['true'], df['predicted'], aspect=20, colorDict=colorDict,
      fontsize=12, ax=ax1
)

plt.show()

Important information

The figureName, closePlot and figSize arguments of sankey() have been removed. This lets matplotlib be used more transparently, as suggested in this issue (https://github.com/anazalea/pySankey/issues/26#issue-429312025) on the original GitHub repo.

sankey() now does less of the customization itself and returns a matplotlib Axes object, so users can customize the diagram to their liking. Through the Axes they also have access to the Figure, and they can then choose what to do with it, such as showing it or saving it, with much more flexibility.

Recommended changes to your code from pySankey

  • To save a figure, call this after sankey():

      plt.savefig("<figureName>.png", bbox_inches="tight", dpi=150)

  • To display the diagram, call plt.show() after sankey() (and after any plt.savefig() call).

  • To change the size of the diagram, resize the matplotlib figure:

      plt.gcf().set_size_inches(figSize)

  • To change the diagram's font, add lines such as the following before calling sankey():

      plt.rc("text", usetex=False)
      plt.rc("font", family="serif")
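
The recommendations above can be combined into one sequence. This is a minimal sketch using only matplotlib calls. A plain Axes stands in for the one returned by sankey(), on the assumption that the returned object behaves like any other Axes. The Agg backend and the in-memory buffer are chosen so the sketch runs without a display or a file on disk. The figure size of 8 by 6 inches is an arbitrary illustration.

```python
import io

import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no window is needed
import matplotlib.pyplot as plt

# Stand-in for `ax = sankey(...)`; assumed to behave like any other Axes.
ax = plt.axes()
ax.plot([0, 1], [0, 1])

# The Figure that owns the Axes; resizing it takes the place of figSize.
fig = ax.figure
fig.set_size_inches(8, 6)

# Save first. A path such as "diagram.png" works in place of the buffer.
buffer = io.BytesIO()
fig.savefig(buffer, format="png", bbox_inches="tight", dpi=150)
png_bytes = buffer.getvalue()

# With an interactive backend, plt.show() would go here, after saving.

# Release the figure once done; presumably what closePlot used to handle.
plt.close(fig)
```

Working through `ax.figure` rather than plt.gcf() keeps the code correct when several figures are open at once.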
    

Package development

Lint

pylint pysankey

Testing

python -m unittest

Coverage

coverage run -m unittest
coverage html
# Open htmlcov/index.html in a browser

