Automatically Visualize any dataset, any size with a single line of code

AutoViz

Automatically Visualize any dataset, any size with a single line of code. Now save as HTML files!

AutoViz automatically visualizes any dataset with one line of code. Give it any input file (CSV, txt or JSON format) and AutoViz will visualize it.

AutoViz can now create charts in multiple formats:

  • If chart_format='png', 'svg' or 'jpg': Matplotlib charts are plotted.
    • They can be saved locally or displayed in Jupyter Notebooks.
    • This is the default behavior for AutoViz.
  • If chart_format='bokeh': Bokeh charts are plotted in Jupyter Notebooks.
    • This is the default for AutoViz_Holo.
  • If chart_format='server': dashboards pop up in your browser, one for each kind of chart.
  • If chart_format='html': all charts are silently saved as HTML files in AutoViz_Plots or any directory you specify.

With 'server', 'bokeh' and 'html', all charts are interactive and you can play with them.
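As a rough sketch of how the formats above map onto a call, the helper below checks a chart_format value against the options listed in this README and builds keyword arguments for AV.AutoViz. The helper names (chart_kwargs, is_interactive) and the grouping into two sets are assumptions made for illustration; they are not part of AutoViz. Only the argument names chart_format and save_plot_dir come from this README.

```python
# Formats named in this README, grouped by how the charts are delivered.
STATIC_FORMATS = {"png", "svg", "jpg"}             # Matplotlib charts
INTERACTIVE_FORMATS = {"bokeh", "server", "html"}  # interactive charts


def chart_kwargs(chart_format="svg", save_plot_dir=None):
    """Hypothetical helper: validate chart_format and return kwargs for AV.AutoViz."""
    fmt = chart_format.lower()
    if fmt not in STATIC_FORMATS | INTERACTIVE_FORMATS:
        raise ValueError(f"unsupported chart_format: {chart_format!r}")
    return {"chart_format": fmt, "save_plot_dir": save_plot_dir}


def is_interactive(chart_format):
    """True for the formats this README describes as interactive."""
    return chart_format.lower() in INTERACTIVE_FORMATS
```

With an AutoViz_Class instance named AV, the result could then be unpacked into the call, for example AV.AutoViz(filename, **chart_kwargs("html", save_plot_dir="charts")).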

Install

Prerequisites

Before installing AutoViz, it is best to create a new environment and install the required dependencies in it.

To install from PyPI:

conda create -n <your_env_name> python=3.7 anaconda
conda activate <your_env_name> # older conda versions: `source activate <your_env_name>` on Linux/macOS, `activate <your_env_name>` on Windows
pip install autoviz

To install from source:

cd <AutoViz_Destination>
git clone git@github.com:AutoViML/AutoViz.git
# or download and unzip https://github.com/AutoViML/AutoViz/archive/master.zip
conda create -n <your_env_name> python=3.7 anaconda
conda activate <your_env_name> # older conda versions: `source activate <your_env_name>` on Linux/macOS, `activate <your_env_name>` on Windows
cd AutoViz
pip install -r requirements.txt

Usage

Read this Medium article to learn how to use AutoViz.

In the AutoViz directory, open a Jupyter Notebook and use these lines to import and instantiate the library:

from autoviz.AutoViz_Class import AutoViz_Class

AV = AutoViz_Class()

Load a dataset (any CSV or text file) into a Pandas dataframe, or give the path and filename of the file you want to visualize. If you are passing a dataframe and have no filename, set the filename argument to "" (empty string).

Call AutoViz with the filename (or dataframe), the separator, and the name of the target variable in the input.

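filename = ""  # path to a CSV or text file, or "" when a dataframe is passed via dfte
sep = ","      # column separator used in the file
dft = AV.AutoViz(
    filename,
    sep=sep,
    depVar="",
    dfte=None,
    header=0,
    verbose=0,
    lowess=False,
    chart_format="svg",
    max_rows_analyzed=150000,
    max_cols_analyzed=30,
    save_plot_dir=None
)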

AutoViz will do the rest. You will see charts and plots on your screen.
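To make the filename, sep and header arguments concrete, here is a minimal sketch that writes a tiny semicolon-separated file with the standard library and wraps the AutoViz call in a function. The column names, the file name sample.csv, and the helper names write_sample_csv and visualize_file are made up for illustration. The keyword arguments are the ones shown in the call above; autoviz is imported lazily so the file-writing part runs without it, and a file this small is only meant to show the call shape, not to produce meaningful charts.

```python
import csv
import os
import tempfile


def write_sample_csv(path, sep=";"):
    """Write a tiny delimited file whose first row (row 0) is the header."""
    rows = [
        ("height", "weight", "label"),
        (1.62, 58.0, "a"),
        (1.75, 72.5, "b"),
        (1.80, 81.0, "b"),
    ]
    with open(path, "w", newline="") as handle:
        csv.writer(handle, delimiter=sep).writerows(rows)
    return path


def visualize_file(path, sep=";", target="label"):
    """Call AutoViz on a file; requires `pip install autoviz`."""
    from autoviz.AutoViz_Class import AutoViz_Class  # imported only when called

    AV = AutoViz_Class()
    return AV.AutoViz(
        path,
        sep=sep,            # must match the separator used in the file
        depVar=target,      # "" if there is no target variable
        dfte=None,          # no dataframe, because a filename is given
        header=0,           # header is the first row
        verbose=0,
        lowess=False,
        chart_format="svg",
        max_rows_analyzed=150000,
        max_cols_analyzed=30,
        save_plot_dir=None,
    )


# Create the sample file in a fresh temporary directory.
sample_path = write_sample_csv(os.path.join(tempfile.mkdtemp(), "sample.csv"))
```

In a notebook with autoviz installed, visualize_file(sample_path) would then run the same call as above against the semicolon-separated file.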

AV.AutoViz is the main plotting function in AV.

Notes:

  • AutoViz will visualize a file of any size by using a statistically valid sample.
  • A comma is assumed as the default separator in the file, but you can change it.
  • The first row is assumed to be the header, but you can change it.
  • verbose option
    • if 0, displays minimal information and displays charts in your notebook
    • if 1, prints extra information in the notebook and also displays charts
    • if 2, does not display any charts; it simply saves them on your local machine under the AutoViz_Plots directory
  • chart_format option
    • if 'svg', 'jpg' or 'png', displays all charts or saves them, depending on the verbose option
    • if 'bokeh', plots interactive charts using Bokeh in your Jupyter Notebook
    • if 'server', displays charts in your browser with one chart type in each tab
    • if 'html', saves interactive charts as HTML files
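Because verbose=2 only writes files and shows nothing, it can help to check what was saved and to clear the folder from time to time. The sketch below uses only the standard library. The function names are hypothetical, and since this README does not describe the layout inside AutoViz_Plots, the listing simply walks the directory recursively.

```python
import shutil
from pathlib import Path


def list_saved_charts(plot_dir="AutoViz_Plots"):
    """Return all files found under plot_dir, sorted by path."""
    root = Path(plot_dir)
    if not root.is_dir():
        return []
    return sorted(p for p in root.rglob("*") if p.is_file())


def clear_saved_charts(plot_dir="AutoViz_Plots"):
    """Delete plot_dir and everything in it; return True if it existed."""
    root = Path(plot_dir)
    if not root.is_dir():
        return False
    shutil.rmtree(root)
    return True
```

If you passed a custom save_plot_dir, give the same path to both helpers.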

API

Arguments

  • filename - the path and name of the file to load. Give an empty string ("") if there is no file associated with the data and you want to use a dataframe instead; in that case, pass the dataframe through dfte. Otherwise, fill in the file name and leave dfte at its default (None in the call above). Only one of the two is needed to load the data set.
  • sep - the separator used in the file. It can be a comma, semicolon, tab or any other value that separates the columns in your file.
  • depVar - the target variable in your dataset. You can leave it as an empty string if your data has no target variable.
  • dfte - the input dataframe, in case you want to plot charts from a pandas dataframe. In that case, leave filename as an empty string.
  • header - the row number of the header row in your file. If it is the first row, this must be zero.
  • verbose - accepts 3 values: 0, 1 or 2. With 0, you get all charts but limited information. With 1, you get all charts and more information. With 2, you will not see any charts, but they will be quietly generated and saved in your current directory under the AutoViz_Plots directory, which will be created. Delete this folder periodically; otherwise, many charts will accumulate there if you use verbose=2 often.
  • lowess - useful for small datasets, where it shows regression lines for each continuous variable against the target variable. Don't use it for large data sets (over 100,000 rows).
  • chart_format - this can be 'svg', 'png', 'jpg', 'bokeh', 'server' or 'html'. Charts are generated and saved in these formats if you use the verbose=2 option. The last three are useful for interactive charts.
  • max_rows_analyzed - limits the maximum number of rows used to display charts. If you have a very large data set with millions of rows, use this option to limit the time it takes to generate charts. AutoViz will take a statistically valid sample.
  • max_cols_analyzed - limits the number of continuous variables that can be analyzed.
  • save_plot_dir - the directory where you want the plots saved. The default is None, which means they are saved in the current directory under a sub-folder named AutoViz_Plots.
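The rules in this list can be sketched as a small pre-flight check to run before calling AV.AutoViz. This is a hypothetical helper, not something AutoViz provides. It is also stricter than the README, which only says that one of filename or dfte is needed, and the n_rows parameter is an assumption so the lowess guidance can be checked without loading data.

```python
def check_autoviz_args(filename="", dfte=None, verbose=0, lowess=False, n_rows=None):
    """Hypothetical check of the argument rules above; returns a list of warnings."""
    has_file = bool(filename)
    has_frame = dfte is not None  # avoids truth-testing a dataframe
    if has_file == has_frame:
        raise ValueError("give exactly one of filename or dfte")
    if verbose not in (0, 1, 2):
        raise ValueError("verbose must be 0, 1 or 2")
    warnings = []
    if lowess and n_rows is not None and n_rows > 100_000:
        warnings.append("lowess is not recommended above 100,000 rows")
    return warnings
```

An empty list means none of the documented rules were violated; a non-empty list carries advice rather than errors.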

Maintainers

Contributing

See the contributing file!

PRs accepted.

License

Apache License, Version 2.0

DISCLAIMER

This project is not an official Google project. It is not supported by Google and Google specifically disclaims all warranties as to its quality, merchantability, or fitness for a particular purpose.
