Animated plotting extension for Pandas with Matplotlib
Pandas_Alive
Pandas_Alive is intended to provide a plotting backend for animated matplotlib charts for Pandas DataFrames, similar to the already existing Visualization feature of Pandas.
With Pandas_Alive, creating stunning, animated visualisations is as easy as calling:
df.plot_animated()
Table of Contents
- Installation
- Usage
- Future Features
- Inspiration
- Requirements
- Documentation
- Contributing
- Changelog
Installation
Install with pip install pandas_alive
Usage
As this package builds upon bar_chart_race, the example data set is sourced from there.
Start with a pandas DataFrame containing 'wide' data where:
- Every row represents a single period of time
- Each column holds the value for a particular category
- The index contains the time component (optional)
The data below is an example of properly formatted data. It shows total deaths from COVID-19 for the highest 20 countries by date.
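As a rough illustration of the 'wide' layout described above, the sketch below pivots a small 'long' table into one row per period and one column per category. The column names and numbers are made up for the example and are not the package's real data set.

```python
import pandas as pd

# Hypothetical 'long' records: one row per (date, country) observation.
# The values are invented purely for illustration.
long_df = pd.DataFrame(
    {
        "date": ["2020-04-01", "2020-04-01", "2020-04-02", "2020-04-02"],
        "country": ["Italy", "Spain", "Italy", "Spain"],
        "deaths": [100, 80, 120, 95],
    }
)
long_df["date"] = pd.to_datetime(long_df["date"])

# Pivot to 'wide': one row per period, one column per category,
# with the time component in the index.
wide_df = long_df.pivot(index="date", columns="country", values="deaths")
print(wide_df)
```

A frame shaped like wide_df is the kind of input the steps below expect.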
To produce the above visualisation:
- Check Requirements first to ensure you have the tooling installed!
- Call plot_animated() on the DataFrame
  - Either specify a file name to write to with df.plot_animated(filename='example.mp4') or use df.plot_animated().get_html5_video() to return an HTML5 video
- Done!
import pandas_alive
covid_df = pandas_alive.load_dataset()
covid_df.plot_animated(filename='examples/example-barh-chart.gif')
Currently Supported Chart Types
Horizontal Bar Chart Races
import pandas as pd
import pandas_alive
elec_df = pd.read_csv("data/Aus_Elec_Gen_1980_2018.csv",index_col=0,parse_dates=[0],thousands=',')
elec_df.fillna(0).plot_animated('examples/example-electricity-generated-australia.gif',period_fmt="%Y",title='Australian Electricity Generation Sources 1980-2018')
import pandas_alive
covid_df = pandas_alive.load_dataset()
def current_total(values):
    total = values.sum()
    s = f'Total : {int(total)}'
    return {'x': .85, 'y': .2, 's': s, 'ha': 'right', 'size': 11}
covid_df.plot_animated(filename='examples/summary-func-example.gif',period_summary_func=current_total)
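To get a feel for what a period_summary_func callback returns, it can be called directly on a single row. This sketch assumes the callback receives the current period's values as a pandas Series, as the one-argument signature above suggests. The sample Series is made up.

```python
import pandas as pd


def current_total(values):
    # Sum every category for the period and format it as a label.
    total = values.sum()
    s = f'Total : {int(total)}'
    # The keys look like text placement options (position, string, alignment, size).
    return {'x': .85, 'y': .2, 's': s, 'ha': 'right', 'size': 11}


# Hypothetical values for one period, one entry per category.
sample_row = pd.Series({"Italy": 120, "Spain": 95, "France": 60})
label = current_total(sample_row)
print(label["s"])
```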
import pandas as pd
import pandas_alive
elec_df = pd.read_csv("data/Aus_Elec_Gen_1980_2018.csv",index_col=0,parse_dates=[0],thousands=',')
elec_df.fillna(0).plot_animated('examples/fixed-example.gif',period_fmt="%Y",title='Australian Electricity Generation Sources 1980-2018',fixed_max=True,fixed_order=True)
import pandas_alive
covid_df = pandas_alive.load_dataset()
covid_df.plot_animated(filename='examples/perpendicular-example.gif',perpendicular_bar_func='mean')
Vertical Bar Chart Races
import pandas_alive
covid_df = pandas_alive.load_dataset()
covid_df.plot_animated(filename='examples/example-barv-chart.gif',orientation='v')
Line Charts
With as many lines as data columns in the DataFrame.
import pandas_alive
covid_df = pandas_alive.load_dataset()
covid_df.diff().fillna(0).plot_animated(filename='examples/example-line-chart.gif',kind='line',period_label={'x':0.1,'y':0.9})
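The diff().fillna(0) step in the line chart call above can be tried on its own. Assuming the columns hold running totals, diff() turns them into per-period changes, and fillna(0) replaces the undefined first row. The small frame below is invented for the illustration.

```python
import pandas as pd

# Made-up cumulative totals for two categories over three days.
cumulative = pd.DataFrame(
    {"A": [1, 3, 6], "B": [0, 2, 2]},
    index=pd.to_datetime(["2020-03-01", "2020-03-02", "2020-03-03"]),
)

# diff() subtracts the previous row; the first row has no predecessor,
# so it becomes NaN and is filled with 0.
per_period = cumulative.diff().fillna(0)
print(per_period)
```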
Bar Charts
Similar to line charts with time as the x-axis
import pandas_alive
covid_df = pandas_alive.load_dataset()
covid_df.sum(axis=1).fillna(0).plot_animated(filename='examples/example-bar-chart.gif',kind='bar',period_label={'x':0.1,'y':0.9})
Scatter Charts
import pandas as pd
import pandas_alive
max_temp_df = pd.read_csv(
"data/Newcastle_Australia_Max_Temps.csv",
parse_dates={"Timestamp": ["Year", "Month", "Day"]},
)
min_temp_df = pd.read_csv(
"data/Newcastle_Australia_Min_Temps.csv",
parse_dates={"Timestamp": ["Year", "Month", "Day"]},
)
merged_temp_df = pd.merge_asof(max_temp_df, min_temp_df, on="Timestamp")
merged_temp_df.index = pd.to_datetime(merged_temp_df["Timestamp"].dt.strftime('%Y/%m/%d'))
keep_columns = ["Minimum temperature (Degree C)", "Maximum temperature (Degree C)"]
merged_temp_df[keep_columns].resample("Y").mean().plot_animated(filename='examples/example-scatter-chart.gif',kind="scatter",title='Max & Min Temperature Newcastle, Australia')
Pie Charts
import pandas_alive
covid_df = pandas_alive.load_dataset()
covid_df.plot_animated(filename='examples/example-pie-chart.gif',kind="pie",rotatelabels=True,period_label={'x':0,'y':0})
Bubble Charts
Bubble charts are generated from a DataFrame with multi-indexed columns, where the index is the time period (optional). The axes are defined with x_data_label and y_data_label, each of which should be passed a string found in the level 0 column labels.
See an example multi-indexed dataframe at: https://github.com/JackMcKew/pandas_alive/tree/master/data/multi.csv
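For readers without the CSV at hand, a frame with two-level column labels can be assembled in memory. This is a sketch under the assumption that level 0 holds the measure names (Longitude, Latitude, Cases) and level 1 holds the categories, inferred from the header=[0, 1] argument used below. The place names and numbers are made up.

```python
import pandas as pd

dates = pd.to_datetime(["2020-03-01", "2020-03-02"])

# One inner frame per level 0 label; inner columns are the categories.
# All values are invented for illustration.
longitude = pd.DataFrame({"Sydney": [151.2, 151.2], "Newcastle": [151.8, 151.8]}, index=dates)
latitude = pd.DataFrame({"Sydney": [-33.9, -33.9], "Newcastle": [-32.9, -32.9]}, index=dates)
cases = pd.DataFrame({"Sydney": [5, 9], "Newcastle": [1, 2]}, index=dates)

# Concatenating a dict along axis=1 uses the dict keys as level 0 column labels.
multi_index_df = pd.concat(
    {"Longitude": longitude, "Latitude": latitude, "Cases": cases}, axis=1
)
print(multi_index_df.columns.tolist())
```

Selecting multi_index_df["Cases"] returns the per-category frame for that measure.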
import pandas as pd
import pandas_alive
multi_index_df = pd.read_csv("data/multi.csv", header=[0, 1], index_col=0)
multi_index_df.index = pd.to_datetime(multi_index_df.index,dayfirst=True)
map_chart = multi_index_df.plot_animated(
kind="bubble",
filename="examples/example-bubble-chart.gif",
x_data_label="Longitude",
y_data_label="Latitude",
size_data_label="Cases",
)
GeoSpatial Charts
GeoSpatial charts can now be animated easily using geopandas!
If using Windows, Anaconda is the easiest way to install with all GDAL dependencies.
Start with a geopandas GeoDataFrame containing 'wide' data where:
- Every row represents a single geometry (Point or Polygon).
- The index contains the geometry label (optional)
- Each column represents a single period in time.
These can be easily composed by transposing data compatible with the rest of the charts using df = df.T.
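The transposition can be sketched with plain pandas. A real GeoDataFrame would also carry a geometry column, which is left out here, and the postcode labels and counts are hypothetical.

```python
import pandas as pd

# Layout used by the other chart types: rows are periods, columns are categories.
period_df = pd.DataFrame(
    {"2000": [10, 12], "2170": [3, 4]},  # hypothetical postcodes as column labels
    index=pd.to_datetime(["2020-03-01", "2020-03-02"]),
)

# After transposing, each row is one place and each column is one period.
geo_ready = period_df.T
geo_ready.index.name = "postcode"
print(geo_ready)
```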
GeoSpatial Point Charts
import geopandas
import pandas_alive
import contextily
gdf = geopandas.read_file('data/nsw-covid19-cases-by-postcode.gpkg')
gdf.index = gdf.postcode
gdf = gdf.drop('postcode',axis=1)
map_chart = gdf.plot_animated(filename='examples/example-geo-point-chart.gif',basemap_format={'source':contextily.providers.Stamen.Terrain})
Polygon GeoSpatial Charts
Supports GeoDataFrames containing Polygons!
import geopandas
import pandas_alive
import contextily
gdf = geopandas.read_file('data/italy-covid-region.gpkg')
gdf.index = gdf.region
gdf = gdf.drop('region',axis=1)
map_chart = gdf.plot_animated(filename='examples/example-geo-polygon-chart.gif',basemap_format={'source':contextily.providers.Stamen.Terrain})
Multiple Charts
pandas_alive supports multiple animated charts in a single visualisation.
- Create a list of all charts to include in animation
- Use animate_multiple_plots with a filename and the list of charts (this will use matplotlib.subplots)
- Done!
import pandas_alive
covid_df = pandas_alive.load_dataset()
animated_line_chart = covid_df.diff().fillna(0).plot_animated(kind='line',period_label=False)
animated_bar_chart = covid_df.plot_animated(n_visible=10)
pandas_alive.animate_multiple_plots('examples/example-bar-and-line-chart.gif',[animated_bar_chart,animated_line_chart])
Urban Population
import pandas_alive
urban_df = pandas_alive.load_dataset("urban_pop")
animated_line_chart = (
urban_df.sum(axis=1)
.pct_change()
.dropna()
.mul(100)
.plot_animated(kind="line", title="Total % Change in Population",period_label=False)
)
animated_bar_chart = urban_df.plot_animated(n_visible=10,title='Top 10 Populous Countries',period_fmt="%Y")
pandas_alive.animate_multiple_plots('examples/example-bar-and-line-urban-chart.gif',[animated_bar_chart,animated_line_chart],title='Urban Population 1977 - 2018',adjust_subplot_top=0.85)
Life Expectancy in G7 Countries
import pandas_alive
import pandas as pd
data_raw = pd.read_csv(
"https://raw.githubusercontent.com/owid/owid-datasets/master/datasets/Long%20run%20life%20expectancy%20-%20Gapminder%2C%20UN/Long%20run%20life%20expectancy%20-%20Gapminder%2C%20UN.csv"
)
list_G7 = [
"Canada",
"France",
"Germany",
"Italy",
"Japan",
"United Kingdom",
"United States",
]
data_raw = data_raw.pivot(
index="Year", columns="Entity", values="Life expectancy (Gapminder, UN)"
)
data = pd.DataFrame()
data["Year"] = data_raw.reset_index()["Year"]
for country in list_G7:
    data[country] = data_raw[country].values
data = data.ffill()
data = data.fillna(0)
data = data.set_index("Year").loc[1900:].reset_index()
data["Year"] = pd.to_datetime(data.reset_index()["Year"].astype(str))
data = data.set_index("Year")
animated_bar_chart = data.plot_animated(
period_fmt="%Y",perpendicular_bar_func="mean", period_length=200,fixed_max=True
)
animated_line_chart = data.plot_animated(
kind="line", period_fmt="%Y", period_length=200,fixed_max=True
)
pandas_alive.animate_multiple_plots(
"examples/life-expectancy.gif",
plots=[animated_bar_chart, animated_line_chart],
title="Life expectancy in G7 countries up to 2015",
adjust_subplot_left=0.2,
)
HTML 5 Videos
Pandas_Alive supports rendering HTML5 videos through the use of df.plot_animated().get_html5_video(). .get_html5_video() saves the animation as an h264 video, encoded in base64 directly into the HTML5 video tag. This respects the rc parameters for the writer as well as the bitrate. It also uses the interval to control the speed, and the repeat parameter to decide whether to loop.
This is typically used in Jupyter notebooks.
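The idea of encoding a video in base64 directly into a video tag can be illustrated with the standard library alone. This is a generic sketch, not the code Pandas_Alive or matplotlib runs. The embed_video helper is hypothetical and the bytes are a placeholder rather than a real video.

```python
import base64


def embed_video(video_bytes: bytes, mime: str = "video/mp4") -> str:
    """Return an HTML5 video tag with the payload inlined as a data URI."""
    encoded = base64.b64encode(video_bytes).decode("ascii")
    return (
        "<video controls>"
        f'<source type="{mime}" src="data:{mime};base64,{encoded}">'
        "</video>"
    )


# Placeholder payload; a real call would pass the bytes of an encoded video file.
html = embed_video(b"not a real video")
print(html[:60])
```

Because the payload travels inside the markup, the resulting string can be displayed in a notebook without a separate video file.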
import pandas_alive
from IPython.display import HTML
covid_df = pandas_alive.load_dataset()
animated_html = covid_df.plot_animated().get_html5_video()
HTML(animated_html)
Progress Bars!
Generating animations can take some time, so enable progress bars by installing tqdm with pip install tqdm and using the keyword enable_progress_bar=True.
When enabled, Pandas_Alive will create a tqdm progress bar for the number of frames to animate, and update the progress bar after each frame.
import pandas_alive
covid_df = pandas_alive.load_dataset()
covid_df.plot_animated(enable_progress_bar=True)
Example of TQDM in action:
Future Features
Future features that may or may not be developed:
- Geographic charts (currently using OSM export image, potential geopandas)
  - This is currently working at a proof of concept level, stay tuned!
- Loading bar support (potential tqdm or alive-progress)
- Potentially support writing to GIF in memory with https://github.com/maxhumber/gif
- Support custom figures & axes for multiple plots (eg, gridspec)
Some charts built using a development branch of Pandas_Alive:
Inspiration
The inspiration for this project comes from:
Requirements
If you get an error such as TypeError: 'MovieWriterRegistry' object is not an iterator, this signals there isn't a writer library installed on your machine.
This package uses the matplotlib.animation module, which requires a writer library.
Make sure one of the supported writer tools is installed prior to use!
Outputting to GIF file type is only supported by ImageMagick
Pillow is not supported currently, please submit a PR if you can make Pillow work!
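One way to check for a writer before animating is to look for the executables on the PATH. The sketch below uses only the standard library. The executable names (ffmpeg, and magick or convert for ImageMagick) are assumptions about a typical install, and find_writers is a hypothetical helper, not part of Pandas_Alive.

```python
import shutil

# Assumed executable names; adjust them to match your installation.
CANDIDATES = {"ffmpeg": ["ffmpeg"], "imagemagick": ["magick", "convert"]}


def find_writers(candidates=CANDIDATES):
    """Map each tool to the first matching executable path, or None."""
    found = {}
    for tool, names in candidates.items():
        found[tool] = next((p for p in map(shutil.which, names) if p), None)
    return found


available = find_writers()
for tool, path in available.items():
    print(f"{tool}: {path or 'not found on PATH'}")
```

On Windows, convert may resolve to an unrelated system utility, so a match for that name is only a hint.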
Documentation
Documentation is provided at https://jackmckew.github.io/pandas_alive/
Contributing
Pull requests are welcome! Please help to cover more and more chart types!
Changelog