
floweaver extension to handle the path visualization

floweaver-path

Description

Demo

This library (floweaver-path) is an extension of floweaver that visualizes the paths passing through a selected node.

We focus on the visualization of longitudinal data.

The idea behind our visualization is based on pathSankey, which is an extension of d3-sankey.

Paths that pass through the selected node are colored yellow-green (highlighted); all other paths are gray.

You can select a node interactively using dropdowns in a Jupyter notebook.

We have two technical contributions to the field of visualization using Sankey diagrams.

One is to extend the layer number:

  • Ordinary Sankey diagrams can only visualize paths between 2 layers.
  • pathSankey can only visualize paths between 3 layers.
  • floweaver-path can compare paths across the two layers before and the two layers after the selected node (up to 5 layers in total).

The other is to create a notebook that can interact with users. We integrate several functions of ipywidgets into floweaver.
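To make the highlighting rule and the five-layer window concrete, here is a minimal sketch in plain Python. It is not the library's implementation: the function name split_paths, the layout of records, and the reading of "two layers before and after" as a window of two layers on each side of the selected one are all assumptions made for illustration.

```python
from collections import Counter


def split_paths(records, target_date, target_value, span=2):
    """Count highlighted and background paths around a selected node.

    records maps a user id to {date: value}. A path counts as highlighted
    when the user's value on target_date equals target_value.
    (Hypothetical helper, not part of floweaver_path.)
    """
    dates = sorted({d for visits in records.values() for d in visits})
    pos = dates.index(target_date)
    # Keep at most `span` layers on each side of the selected layer.
    window = dates[max(0, pos - span): pos + span + 1]

    highlighted, background = Counter(), Counter()
    for visits in records.values():
        path = tuple(visits.get(d) for d in window)
        passes = visits.get(target_date) == target_value
        (highlighted if passes else background)[path] += 1
    return window, highlighted, background


# Hypothetical longitudinal data: user id -> {date: value}
records = {
    "u1": {"2016/04": "A", "2016/10": "B", "2017/04": "B"},
    "u2": {"2016/04": "A", "2016/10": "A", "2017/04": "B"},
    "u3": {"2016/04": "C", "2016/10": "B", "2017/04": "A"},
}
window, highlighted, background = split_paths(records, "2016/10", "B")
print(window)
print(dict(highlighted))
print(dict(background))
```

With the node ("2016/10", "B") selected, the paths of u1 and u3 fall into the highlighted group and the path of u2 into the background group, which corresponds to the yellow-green and gray flows in the diagram.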

Requirement

installation using docker

  • docker (the image installs two libraries: floweaver==2.0.0a5 and ipysankeywidget==0.2.5)
  • input file (*.csv, *.pickle or *.xlsx; put it in the interaction/data directory unless you specify another data directory)

installation using pip

  • pip
  • Python (3.6)
  • input file (*.csv, *.pickle or *.xlsx; put it in the interaction/data directory unless you specify another data directory)

Two libraries (floweaver>=2.0.0a5, ipysankeywidget>=0.2.5) will be installed.

Setup (when you install using docker)

build

scripts/build

run notebook

scripts/run-notebook

This script runs the docker container and connects the interaction directory to it as the working directory.

Data and notebooks are shared between the docker container and your local system.

use notebooks in browser

Open a new browser tab and go to localhost:10001.

Copy and paste the token to use the notebooks. The token is displayed in your terminal as follows:

http://(<id> or 127.0.0.1):8888/?token=<token>

Setup (when you install using pip)

You can install floweaver_path with the usual pip command:

pip install floweaver_path

You might also need to run the following commands to install and enable the notebook extensions:

jupyter nbextension install --py widgetsnbextension --user
jupyter nbextension install --py ipysankeywidget --user
jupyter nbextension enable widgetsnbextension --user --py
jupyter nbextension enable ipysankeywidget --user --py

Usage

prepare data

If you installed floweaver_path using docker, you need to put your data file under the interaction directory. You can either upload it through the Jupyter notebook interface or copy it into the directory directly.

If you installed floweaver_path using pip, you do not need to move your file because you can specify any local path.

We focus on longitudinal data. The format of your file should be as follows:

   index        date  value1  value2
0      1  2016/04/01       1       3
1      2  2016/10/01       3       2
2      1  2016/04/01       4       1

  • index: This variable is handled as the user id.
  • date: This variable is handled as the date and shown on the x-axis (a layer, in Sankey diagram terms). Each (index, date) pair should appear at most once.
  • value[n]: These variables are handled as target variables. One of them is shown on the y-axis (a node in each layer, in Sankey diagram terms).

Column names may differ between files. You select which columns to use interactively.

Note that three file extensions are supported: .csv, .xlsx and .pickle. Please check the details of the data format by loading data/template_data.csv.
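Because each (index, date) pair is expected to be unique, it can help to check a file before loading it into the visualizer. The following is a minimal sketch using only the Python standard library; the helper find_duplicate_pairs and the sample rows are hypothetical and are not part of floweaver_path or of the bundled template data.

```python
import csv
import io
from collections import Counter


def find_duplicate_pairs(rows, index_col="index", date_col="date"):
    """Return the (index, date) pairs that occur more than once in rows."""
    counts = Counter((row[index_col], row[date_col]) for row in rows)
    return sorted(pair for pair, n in counts.items() if n > 1)


# Hypothetical sample in the documented layout.
SAMPLE_CSV = """index,date,value1,value2
1,2016/04/01,1,3
2,2016/04/01,3,2
1,2016/10/01,4,1
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
duplicates = find_duplicate_pairs(rows)
print(duplicates)  # an empty list means every (index, date) pair is unique
```

For a real file, replace io.StringIO(SAMPLE_CSV) with an opened file handle and pass your own column names if they differ from index and date.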

launch a working notebook

The template notebook (template.ipynb) should not be changed. We recommend duplicating the template notebook and working on the copy.
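The copy can be made from the Jupyter file browser or with a few lines of Python. The sketch below is one way to do it; the helper duplicate_notebook and the name my_analysis.ipynb are hypothetical, and the template path assumes you run it from the repository root.

```python
import shutil
from pathlib import Path


def duplicate_notebook(template, copy_name):
    """Copy a notebook next to the original without overwriting an existing file."""
    template = Path(template)
    target = template.with_name(copy_name)
    if target.exists():
        raise FileExistsError(f"{target} already exists")
    shutil.copy(template, target)
    return target


# Assumed location of the template, relative to the repository root.
template_path = Path("interaction/template.ipynb")
if template_path.exists():
    print(duplicate_notebook(template_path, "my_analysis.ipynb"))
```

The template itself is left untouched, and an existing copy is never overwritten.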

call the visualizer

You can import the visualizer and call it as follows.

from floweaver_path import visualizer
visualizer()

The visualizer function has 5 arguments:

  • data_dir (default='./data'): the directory that contains your local data files
  • width (default=1070): the width of the visualized figures
  • height (default=500): the height of the visualized figures
  • target_color (default='yellowgreen'): the color of paths that pass through the selected node
  • base_color (default='gray'): the color of paths that do not pass through the selected node
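As an illustration, a call that overrides the defaults might look like the sketch below. The argument names come from the list above; the values are arbitrary, and it is an assumption that color names other than the defaults (here 'orange' and 'lightgray') are accepted. The import is guarded so the snippet does nothing when floweaver_path is not installed.

```python
# Keyword arguments named in the list above; the values are illustrative only.
options = dict(
    data_dir="./data",       # directory that holds the input files
    width=800,               # figure width
    height=400,              # figure height
    target_color="orange",   # assumed to accept any CSS color name
    base_color="lightgray",  # assumed to accept any CSS color name
)

try:
    from floweaver_path import visualizer
except ImportError:
    visualizer = None  # library not installed; nothing is drawn

if visualizer is not None:
    # Intended to be run inside a Jupyter notebook cell.
    visualizer(**options)
```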

select a target node

The visualizer provides 7 dropdowns for interacting with floweaver.

  • multiple display?: whether the library displays multiple images or not.
  • file path: the data file to be analyzed.
  • index column: the column that contains id information (e.g., user_id).
  • date column: the column that contains date information (handled as a layer and shown on the x-axis).
  • target variable: the column that you want to analyze (handled as a node in each layer and shown on the y-axis).
  • target date: the value of the date column that identifies the layer of the target node.
  • target value: the value of the target variable that identifies the target node.

Dropdowns that depend on another dropdown are updated as soon as you select a value.
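The dependency between dropdowns can be pictured as recomputing the choices of the later dropdowns from the data whenever an earlier one changes. The sketch below shows that idea in plain Python; dependent_options is a hypothetical helper, and restricting the target values to those observed on the chosen date is an assumption about the behavior, not a description of the library's code.

```python
def dependent_options(rows, date_col, target_col, target_date=None):
    """Return (dates, values): choices for the 'target date' and 'target value' dropdowns."""
    dates = sorted({row[date_col] for row in rows})
    if target_date is None and dates:
        target_date = dates[0]  # fall back to the first date
    values = sorted({row[target_col] for row in rows if row[date_col] == target_date})
    return dates, values


# Hypothetical rows in the documented layout.
rows = [
    {"index": 1, "date": "2016/04/01", "value1": 1, "value2": 3},
    {"index": 2, "date": "2016/04/01", "value1": 3, "value2": 2},
    {"index": 1, "date": "2016/10/01", "value1": 4, "value2": 1},
]
dates, values = dependent_options(rows, "date", "value1", target_date="2016/10/01")
print(dates, values)
```

In a notebook, a callback registered with the observe method of an ipywidgets Dropdown could call a function like this and assign the result to the options of the dependent dropdown.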

Authors

  • @fullflu proposed to create this library and prepared basic scripts.
  • @adamist created the Dockerfile and the build and run-notebook scripts.

Contributors

Please feel free to create issues or to contribute to floweaver-path! Contributions to the original floweaver library are also valuable.

License

MIT

Structure

├── Dockerfile
├── LICENSE
├── README.md
├── demo
│   └── floweaver_path_demo.gif
├── interaction
│   ├── data
│   │   └── template_data.csv
│   └── template.ipynb
├── requirements.txt
├── scripts
│   ├── build
│   └── run-notebook
├── setup.py
├── src
│   ├── floweaver_path
│   │   ├── __init__.py
│   │   ├── lib
│   │   │   ├── __init__.py
│   │   │   ├── ts_sankey.py
│   │   │   └── utils.py
│   │   └── visualizer.py
│   └── template
│       ├── data
│       │   └── template.csv
│       └── notebooks
│           └── template.ipynb
└── tests
    ├── test_extract_files.py
    └── test_load_file.py
