A collection of helpers for table handling and visualization

pandas-plots

a comprehensive python package for enhanced data visualization and analysis with pandas dataframes. provides a high-level api for creating beautiful tables, plots, venn diagrams, and utility functions with minimal code.

🚀 features

  • table utilities (tbl): enhanced table display and description functions
  • plotting functions (pls): comprehensive visualization tools with plotly
  • venn diagrams (ven): easy-to-create venn diagrams for set analysis
  • helper functions (hlp): utility functions for common data operations
  • constants (const): useful constants like color palettes

📦 installation

# using uv (recommended)
uv add -U pandas-plots

🚨 this package relies on an installed ungoogled-chromium for static image conversion. regular chromium is deprecated.

🛠️ usage

from pandas_plots import tbl, pls, ven, hlp, const

# for public examples: load sample dataset from seaborn
import seaborn as sb
df = sb.load_dataset('taxis')

examples

styled table

tbl.pivot_df(
    df[["color", "payment", "fare"]],
    total_mode="sum",
    total_axis="xy",
    data_bar_axis=None,
    total_exclude=True,
    pct_axis="xy",
    precision=0,
    heatmap_axis="xy",
    kpi_mode="rag_abs",
    kpi_rag_list=[1000, 10000],
    swap=True,
    font_size_td=12,
    font_size_th=14,
)
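
for readers who want to see what a summed pivot with totals and percentages looks like underneath, here is a minimal sketch in plain pandas. it is not how pivot_df() is implemented; the tiny dataframe is made up and only mirrors the three taxis columns used above, and the name "total" for the margins is an assumption.

```python
import pandas as pd

# made-up stand-in for the taxis columns used above
df = pd.DataFrame(
    {
        "color": ["yellow", "yellow", "green", "green", "yellow"],
        "payment": ["cash", "credit card", "cash", "credit card", "cash"],
        "fare": [10.0, 20.0, 5.0, 15.0, 30.0],
    }
)

# sum of fare per color/payment, plus row and column totals
pivot = pd.pivot_table(
    df,
    index="color",
    columns="payment",
    values="fare",
    aggfunc="sum",
    margins=True,
    margins_name="total",
)

# share of each cell in the grand total, in percent
pct = (pivot / pivot.loc["total", "total"] * 100).round(1)

print(pivot)
print(pct)
```

pivot_df() adds styling (heatmap, data bars, kpi colors) on top of this kind of table, which plain pandas does not do here.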

table description

tbl.describe_df(
    df,
    caption="taxis",
    top_n_uniques=10,
    top_n_chars_in_columns=10,
    top_n_chars_in_index=15,
)


upset plot

# df_upset is not defined in this snippet; it stands for a dataframe prepared for the upset plot
pls.plot_upset(
    df_upset,
    include_false_subsets=False,
    orientation="horizontal",
)


uml graph

metrics = pls.plot_uml_graph()


set filter

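# con (a duckdb connection) and FILTERS (the cascading filters) are not defined in this snippet
# the result is named tumor_filter to avoid shadowing the python builtin filter()
tumor_filter = hlp.get_duckdb_filter_n(
    con,
    "from Tumor",
    FILTERS,
    # distinct_metric="z_pat_id",
)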
counts: rows
---
n = 3_241_401                                     (100.0%) ██████████████████████████████
└ [2020-2023.07]:                   n = 2_633_644  (81.3%) ░░░░░░████████████████████████
└ [not z_is_dco]:                   n = 2_547_636  (78.6%) ░░░░░░░███████████████████████
└ [keine M1]:                       n = 2_305_215  (71.1%) ░░░░░░░░░█████████████████████
└ [keine Verstorbenen < 180 Tage]:  n = 2_132_064  (65.8%) ░░░░░░░░░░░███████████████████
└ [lympho- und mesoendokr. Tumore]:    n = 27_653   (0.9%) ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░
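
the bars in this output can be reproduced with a few lines of plain python. the sketch below is an assumption about the rendering, not the code of get_duckdb_filter_n(): the helper names text_bar and cascade_lines are hypothetical, the bar width of 30 and the rounding down are inferred from the sample output, and column alignment is left out.

```python
def text_bar(count: int, total: int, width: int = 30) -> str:
    """Render count/total as a bar of `width` characters, filled from the right."""
    filled = count * width // total  # rounds down, e.g. 81.3% of 30 gives 24
    return "░" * (width - filled) + "█" * filled


def cascade_lines(total: int, steps: list[tuple[str, int]]) -> list[str]:
    """Build one line for the full set and one line per cascading filter step."""
    lines = [f"n = {total:_} (100.0%) {text_bar(total, total)}"]
    for label, count in steps:
        pct = count / total * 100
        lines.append(f"└ [{label}]: n = {count:_} ({pct:.1f}%) {text_bar(count, total)}")
    return lines


# row counts taken from the sample output above
steps = [
    ("2020-2023.07", 2_633_644),
    ("not z_is_dco", 2_547_636),
]
for line in cascade_lines(3_241_401, steps):
    print(line)
```

each step is measured against the unfiltered total, which is why the percentages only ever shrink as filters are added.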

sankey diagram

_ = pls.plot_sankey(width=1000)


box plot with violin overlay

_ = pls.plot_box(df['fare'], height=400, violin=True)


box plot with statistics

_ = pls.plot_boxes_large(df[["dropoff_borough","distance"]])

venn diagrams

# show venn diagram for 3 sets
set_a = {'ford','ferrari','mercedes', 'bmw'}
set_b = {'opel','bmw','bentley','audi'}
set_c = {'ferrari','bmw','chrysler','renault','peugeot','fiat'}

_df, _details = ven.show_venn3(
    title="taxis",
    a_set=set_a,
    a_label="cars1",
    b_set=set_b,
    b_label="cars2",
    c_set=set_c,
    c_label="cars3",
    verbose=0,
    size=8,
)
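
the regions such a diagram shows can be checked by hand with python set operations. this is only a sketch of the underlying set logic for the three sets above, not the output format of show_venn3(); the region names are made up for illustration.

```python
set_a = {'ford', 'ferrari', 'mercedes', 'bmw'}
set_b = {'opel', 'bmw', 'bentley', 'audi'}
set_c = {'ferrari', 'bmw', 'chrysler', 'renault', 'peugeot', 'fiat'}

# the seven regions of a 3-set venn diagram
regions = {
    "a only": set_a - set_b - set_c,
    "b only": set_b - set_a - set_c,
    "c only": set_c - set_a - set_b,
    "a & b only": (set_a & set_b) - set_c,
    "a & c only": (set_a & set_c) - set_b,
    "b & c only": (set_b & set_c) - set_a,
    "a & b & c": set_a & set_b & set_c,
}

for name, members in regions.items():
    print(f"{name}: {len(members)} {sorted(members)}")
```

every element lands in exactly one region, so the region sizes add up to the size of the union of the three sets.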


📚 api reference

table utilities (tbl)

function         description
show_num_df()    displays a table in a styled version with additional information
describe_df()    alternative to the pandas describe() function
descr_db()       short description of a duckdb relation
pivot_df()       builds a pivot table from a 3-column dataframe (or 2 columns if no weights are given)
print_summary()  shows statistics for a pandas dataframe or series

plotting functions (pls)

function                   description
plot_box()                 auto-annotated boxplot with violin option
plot_boxes()               multiple boxplots (annotation is experimental)
plot_stacked_bars()        shortcut to stacked bars
plot_bars()                standardized bar plot for a categorical column with confidence intervals
plot_histogram()           histogram for one or more numerical columns
plot_joints()              joint plot for exactly two numerical columns
plot_quadrants()           quickly shows a 2x2 heatmap
plot_facet_stacked_bars()  stacked bars for a facet value as subplots
plot_sankey()              generates a sankey diagram
plot_pie()                 generates a pie chart
plot_box_large()           boxplot for large datasets, using seaborn
plot_boxes_large()         multiple boxplots for large datasets, using seaborn
plot_histogram_large()     histogram for large datasets, using seaborn
plot_upset()               generates an upset plot based on upsetplot
plot_uml_graph()           generates a uml graph based on mermaid for structured data

venn diagrams (ven)

function      description
show_venn2()  displays a venn diagram for 2 sets
show_venn3()  displays a venn diagram for 3 sets

helper functions (hlp)

function                            description
to_series()                         converts a dataframe to a series
mean_confidence_interval()          calculates mean and confidence interval for a series
wrap_text()                         formats strings or lists to a given width
replace_delimiter_outside_quotes()  replaces delimiters only outside of quotes in csv imports
create_barcode_from_url()           creates a barcode from a given url
add_datetime_col()                  adds a datetime column to a dataframe (chainable)
show_package_version()              prints the versions of a list of packages
get_os()                            helps identify and ensure the operating system at runtime
add_bitmask_label()                 adds a column that resolves a bitmask column into human-readable labels
find_cols()                         finds all columns in a list that contain any of the given stubs
add_measures_to_pyg_config()        adds measures to a pygwalker config file
get_tum_details()                   prints details of a specific tumor (requires connection to clinical cancer data)
get_duckdb_filter_n()               prints row counts for cascading filters in duckdb with ansi bars
print_filter()                      prints a filter as a markdown sql code block
is_ipynb()                          detects if code is running in a jupyter notebook
prepend_uv_header()                 prepends a uv header to a .py script to make it executable with the uv run command
create_py_script()                  creates a .py script from a .ipynb file
setup_rendering()                   triggers clean(er) rendering of plots and pandas tables to markdown
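
to make the idea behind mean_confidence_interval() concrete, here is a standard-library sketch of a mean with a confidence interval. it uses a normal approximation, which is an assumption made to keep the example dependency-free; the helper name mean_ci_normal is hypothetical and the library function may compute the interval differently (for example with a t-distribution).

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_ci_normal(values, confidence=0.95):
    """Return (mean, lower, upper) using a normal approximation."""
    n = len(values)
    m = mean(values)
    sem = stdev(values) / sqrt(n)  # standard error of the mean
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    return m, m - z * sem, m + z * sem


m, lower, upper = mean_ci_normal([1, 2, 3, 4, 5])
print(f"mean={m}, ci=({lower:.3f}, {upper:.3f})")
```

for small samples the normal approximation gives intervals that are too narrow, so treat this as an illustration of the concept only.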

⚙️ configuration

environment settings

import os

# * set theme: light / dark
# * this will affect all plots
os.environ['THEME'] = 'light'

# * set renderer: svg / png for static images, '' for interactive plots
# * note: only static images will be rendered in markdown
os.environ['RENDERER'] = 'svg'
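
if these settings are changed in several notebooks, a small wrapper can catch typos early. the helper below is hypothetical and not part of pandas-plots; the allowed values are taken from the comments above, and it is an assumption that no other values are accepted.

```python
import os

# allowed values as listed in the comments above (assumed to be exhaustive)
VALID_THEMES = {"light", "dark"}
VALID_RENDERERS = {"svg", "png", ""}


def configure_plots(theme: str = "light", renderer: str = "svg") -> None:
    """Validate the settings, then store them in the environment."""
    if theme not in VALID_THEMES:
        raise ValueError(f"unknown theme: {theme!r}")
    if renderer not in VALID_RENDERERS:
        raise ValueError(f"unknown renderer: {renderer!r}")
    os.environ["THEME"] = theme
    os.environ["RENDERER"] = renderer


configure_plots(theme="light", renderer="svg")
print(os.environ["THEME"], os.environ["RENDERER"])
```

the source does not say when the package reads these variables, so setting them before the first plot call is the cautious choice.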

🧩 prerequisites

  • python 3.10+: compatible with python versions 3.10 - 3.13
  • uv: uv is recommended for package management

⚠️ this package depends on numpy<2.0.0 because UpSetPlot is still tied to the earlier numpy versions

🤝 contributing

contributions are welcome! please feel free to submit a pull request. for major changes, please open an issue first to discuss what you would like to change.

📄 license

this project is licensed under the MIT license - see the license file for details.

🏷️ tags

#pandas #visualizations #statistics #data-science #data-analysis #python


Download files

Source Distribution

pandas_plots-1.4.6.tar.gz (74.5 kB)

Uploaded Source

Built Distribution

pandas_plots-1.4.6-py3-none-any.whl (101.0 kB)

Uploaded Python 3

File details

Details for the file pandas_plots-1.4.6.tar.gz.

File metadata

  • Download URL: pandas_plots-1.4.6.tar.gz
  • Upload date:
  • Size: 74.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for pandas_plots-1.4.6.tar.gz
Algorithm    Hash digest
SHA256       1eeaabb6084b76142405dd96a36dce80bf31cbe753cac0519646d242c54a7643
MD5          bcf8628eccbe1d8bc0e745e06f7b1ce1
BLAKE2b-256  f732afd3b4968c9bdaa724a3ecd0c529ab423f35807ef567819748bc700d6657

File details

Details for the file pandas_plots-1.4.6-py3-none-any.whl.

File metadata

  • Download URL: pandas_plots-1.4.6-py3-none-any.whl
  • Upload date:
  • Size: 101.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for pandas_plots-1.4.6-py3-none-any.whl
Algorithm    Hash digest
SHA256       ac9658c3fc7cceb55a728678e073a28d84b585d3e33404a62f4ec9fe31c01e48
MD5          0f1f434cea4d9e56ee3f385b0019d38a
BLAKE2b-256  886f57ef31773539c78e5ce51af24dc168b80dce299d56b196e557bd0a151fd9
