This package should be used to create HTML reports that summarise the results of data analyses.
html-report
Jump to quickstart or example report.
Description
What is it?
A simple package that empowers data enthusiasts to quickly produce interactive HTML reports, containing tables produced with pandas and plots produced with plotly. The package takes care of the grunt work, thereby allowing you to focus your time on what matters: exploring your data.
Main features
The main features of the package are as follows:
- Summarise your data analysis by outputting interactive HTML reports containing tables and plots.
- Complete abstraction of HTML and CSS syntax is achieved by wrapping dominate.
- Isolate related results from the remainder of the report using the section functionality of the package's HTMLReport class. See use of section functionality in example report.
- Separate related results onto different tabs using the package's Tabby class. See use of Tabby in example report.
Project vision and status
Future development will have two aims:
1. Improve ability to draw and share value from data, and
2. Improve automatic visual appearance of reports.
Development will prioritise (1), since the aim of the project is to speed up the journey from analysis to reporting (rather than facilitate aesthetically pleasing output). Manual addition of CSS can improve report appearance if desired.
The package has a lot of scope for improvement. It's been shared in a very basic form, with the assumption that the most desired improvements will become clear as usage increases. Early ideas for improvements are as follows:
- Add auto-formatting of DataFrames.
- Allow addition of plots produced with other plotting libraries (e.g. matplotlib) to report.
- Document class naming structure to allow manual CSS choices.
- Implement Collapsy: collapsible button with similar purpose to Tabby.
Motivation for package use?
See below examples of common approaches adopted when sharing results of data analysis (and their shortcomings).
Approach 1: Sharing static data
Do you share static data - for example, screenshots of DataFrames and Figures from a Jupyter notebook's output - via another medium (e.g. Slack)? If so, consider the following:
- Each time your input data or method of analysis changes, you must reproduce the output, screenshot and share the results again.
- Each time you share the output, you must also provide an explanation of the results.
- A chat / email history containing analysis of many versions of the same data can lead to confusion in the future.
Approach 2: Sharing code
Sharing code (that performs analysis) likely means GitHub or GitLab is in use. Whilst this comes with the ability to roll back a data pull function, static data or the method of analysis to a specific point in time, it also has shortcomings:
- Each new colleague who wishes to view the output must clone the repository and execute the analysis, which is sub-optimal. This could be especially time consuming if the data pull (if required) and analysis-producing code are computationally expensive.
- Non-technical colleagues may fail to understand the output due to their inability to digest the code, and may be incapable of producing the output altogether.
Suggestion: use html-report!
A suggested workflow when using the package is:
- Create a GitHub or GitLab repo.
- Write Python code to perform analysis.
- Embed use of the package in the project's code so that a new report is produced upon code execution.
This workflow has the following advantages over other approaches:
- Use of plotly reaps all the benefits of plotly's javascript (which facilitates interactive plots).
- Use of pandas reaps all the benefits of pandas' Styler object for formatting (see here).
- Use of markdown reaps all the benefits of markdown style formatting.
- Use of HTML and CSS allows users willing to get their hands dirty with CSS to produce visually appealing reports.
- A set of .html files in a local directory can be easily organised and distributed, unlike a set of screenshots in a chat history.
- Non-technical colleagues are highly likely to be familiar with .html files.
- Use of version control brings the same advantages as approach 2 (above).
Usage
Installation
The package will be available for installation from PyPI soon.
Understanding the package
The functionality of the package is intentionally simple: you can only add content to your report. This may lead to the following queries:
1. How do I view the content of my report? Solution: Render using save() and view in browser!
2. How do I delete content from my report? Solution: Comment out / delete a few lines of code and re-run your report building code to create a new instance of HTMLReport!
3. How do I change the order of content in my report? Solution: Swap a few lines of code and re-run your report building code to create a new instance of HTMLReport!
Given the approaches suggested by (2) and (3), it's advisable to separate your report building code and computationally intensive code (e.g. iterating over a large DataFrame), thereby enabling you to create an updated copy of your report fast.
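One way to achieve this separation is to cache the results of the slow step on disk, so that re-running the report building code only reads the cache. The sketch below uses the standard library only; `expensive_analysis`, `load_or_compute` and `build_report` are hypothetical names that are not part of the package, and `build_report` merely stands in for the place where your HTMLReport calls would go.

```python
import json
import tempfile
from pathlib import Path


def expensive_analysis() -> dict:
    """Stand-in for a slow data pull or computation (hypothetical)."""
    values = [1, 2, 3]
    return {"n_rows": len(values), "total": sum(values)}


def load_or_compute(cache_path: Path) -> dict:
    """Return cached results if present; otherwise compute and cache them."""
    if cache_path.exists():
        return json.loads(cache_path.read_text(encoding="utf-8"))
    results = expensive_analysis()
    cache_path.write_text(json.dumps(results), encoding="utf-8")
    return results


def build_report(results: dict) -> str:
    """Cheap step: in a real project, HTMLReport calls would live here."""
    items = "".join(f"<li>{name}: {value}</li>" for name, value in results.items())
    return f"<ul>{items}</ul>"


cache_file = Path(tempfile.mkdtemp()) / "results.json"
first = build_report(load_or_compute(cache_file))   # computes, then caches
second = build_report(load_or_compute(cache_file))  # reads the cache only
```

With this layout, deleting or reordering content only requires editing `build_report` and re-running it; the slow step is skipped until the cache file is removed.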
Quickstart
The package defines two classes: HTMLReport and Tabby.
Quickstart: HTMLReport
The workhorse class that facilitates the production of interactive reports.
Recommendation
It is recommended to separate reports into sections, since this allows readers to digest results faster. Sections are added one at a time using add_section. The order of section creation determines the order of display in the report.
add_section(
    self,
    id: str,
    width: Optional[str] = None
) -> None
Parameters:
- id: Identifier of section.
- width: Width of section. It is advisable to pass width as a percentage (e.g. '80%') of the containing block. See the documentation on the CSS width property for the range of values that can be passed.
Descriptive content
Descriptive content is added through the functions add_header, add_para and add_markdown.
add_header(
    self,
    content: str,
    size: Optional[int] = None,
    sec: Optional[str] = None
) -> None
add_para(
    self,
    content: str,
    sec: Optional[str] = None
) -> None
add_markdown(
    self,
    content: str,
    sec: Optional[str] = None
) -> None
Common parameters:
- content: Paragraph or heading text to be displayed / markdown content to render to HTML.
- sec: Identifier of the section to add the header / paragraph / markdown content to. If not passed, content is added to the main report body.
Additional parameters:
- size (add_header): Size of the header - between 1 (largest) and 6 (smallest). Determines the HTML tag used.
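To make the size-to-tag relationship concrete, here is a minimal sketch of how a size between 1 and 6 could map to the heading tags h1 to h6. The helper `header_tag` is hypothetical and is not the package's code; the fallback to a default when size is omitted is an assumption, based on size being optional and on the default_header_size argument that appears in the usage example.

```python
from typing import Optional


def header_tag(size: Optional[int] = None, default_size: int = 1) -> str:
    """Map a header size to an HTML heading tag name (hypothetical helper)."""
    # Assumption: an omitted size falls back to a report-level default.
    resolved = default_size if size is None else size
    if not 1 <= resolved <= 6:
        raise ValueError(f"size must be between 1 and 6, got {resolved}")
    return f"h{resolved}"


largest = header_tag(1)                      # largest heading
fallback = header_tag(None, default_size=3)  # no size given: use the default
```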
The function add_markdown depends on the markdown package. add_markdown(content) calls markdown.markdown(content, output_format="html5") to render HTML style tags from markdown style input. See function's behaviour here.
Analytical content
HTMLReport's workhorse function, which allows analysis to be added to the report, is add.
add(
    self,
    obj: Union[Tabby, dom_tag, DataFrame, Styler, Figure],
    sec: Optional[str] = None,
) -> None
Parameters:
- obj: Object to add. Add an instance of pandas DataFrame, pandas Styler, plotly Figure or Tabby to the report. You can also add an instance of dominate dom_tag if you wish to interact with the dominate package.
- sec: Identifier of the section to add the object to. If not passed, content is added to the main report body.
Output
The report can be rendered using save() or to_html().
to_html(
    self
) -> str
save(
    self,
    filepath: Union[str, Path] = "unnamed_report.html",
    open_browser: bool = True,
) -> None
Parameters:
- filepath: Location to save report in.
- open_browser: If True, automatically open in browser on save.
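The relationship between the two methods can be pictured as "render to a string, then write that string to disk and optionally open it". The sketch below shows one plausible way to do the second half with the standard library; `save_html` is a hypothetical helper and is not the package's source, although its defaults mirror the save() signature above.

```python
import tempfile
import webbrowser
from pathlib import Path
from typing import Union


def save_html(
    html: str,
    filepath: Union[str, Path] = "unnamed_report.html",
    open_browser: bool = True,
) -> Path:
    """Write an HTML string to disk and optionally open it (hypothetical helper)."""
    path = Path(filepath).resolve()  # absolute path, required by as_uri()
    path.write_text(html, encoding="utf-8")
    if open_browser:
        webbrowser.open(path.as_uri())
    return path


# The string would normally come from rep.to_html(); a literal is used here.
demo_path = save_html(
    "<h1>Demo</h1>",
    Path(tempfile.mkdtemp()) / "demo.html",
    open_browser=False,
)
```

Setting open_browser=False is useful when reports are built on a server or in a scheduled job, where no browser is available.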
Quickstart: Tabby
Supplementary class to HTMLReport. Allows descriptive and analytical content to be displayed on different tabs, thereby making the content within the report more digestible.
When adding content, a key must be passed to specify which tab the content should be added to. This is done through the parameter key, which is similar to the parameter sec found in HTMLReport's methods. Unlike sec, key is not optional.
When adding content to a tab:
- if a previously seen value of key is passed, content is appended to that tab, or
- if a previously unseen value of key is passed, a new tab is automatically created and content is appended to the new tab.
Order of tab creation maps to order of display when rendered. If a specific order of keys is desired, then keys: Optional[List[Any]] = None can be passed to the Tabby constructor. Whether created in the constructor or in a content addition method, each key will be calculated as str(key). Hence, it is recommended to always pass a key as a string.
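The key rules above can be illustrated with a toy model. `TabStore` below is a hypothetical stand-in written for this explanation, not Tabby itself: it only mimics the documented behaviour (keys coerced with str(key), unseen keys creating a new tab, display order following creation order).

```python
from typing import Any, Dict, List, Optional


class TabStore:
    """Toy stand-in for the key handling described above (not Tabby itself)."""

    def __init__(self, keys: Optional[List[Any]] = None) -> None:
        # dicts preserve insertion order, which models "order of tab creation".
        self.tabs: Dict[str, List[str]] = {}
        for key in keys or []:
            self.tabs[str(key)] = []

    def add(self, key: Any, content: str) -> None:
        # setdefault creates the tab on first sight; content is appended either way.
        self.tabs.setdefault(str(key), []).append(content)


store = TabStore(keys=["setosa", 2])
store.add(key="versicolor", content="box plot")  # unseen key: new tab at the end
store.add(key=2, content="paragraph")            # int 2 lands on the tab "2"
store.add(key="2", content="table")              # same tab as the int key
```

Because the int 2 and the string "2" end up on the same tab, mixing key types can merge content unexpectedly, which is why passing strings throughout is the safer habit.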
Descriptive content
Added in similar manner as HTMLReport with functions add_header, add_para and add_markdown.
add_header(
    self,
    key: Any,
    content: str,
    size: int = 1
) -> None
add_para(
    self,
    key: Any,
    content: str
) -> None
add_markdown(
    self,
    key: Any,
    content: str
) -> None
Common parameters:
- key: Key of tab to add content to. Key will be calculated as str(key), hence it is recommended to pass a string to the method.
- content: Paragraph or heading text to be displayed / markdown content to render to HTML.
Additional parameters:
- size (add_header): Size of the header - between 1 (largest) and 6 (smallest). Determines the HTML tag used.
Analytical content
Similarly to HTMLReport, Tabby's workhorse function is add.
add(
    self,
    key: Any,
    obj: Union[dom_tag, DataFrame, Styler, Figure],
) -> None
Parameters:
- key: Key of tab to add content to. Key will be calculated as str(key), hence it is recommended to pass a string to the method.
- obj: Object to add. Add an instance of pandas DataFrame, pandas Styler or plotly Figure to the tab. You can also add an instance of dominate dom_tag if you wish to interact with the dominate package.
Usage example
Download the example here and open in browser to see the rendered version of the report.
from pathlib import Path
import plotly.express as px
import seaborn as sns
from htmlreport import HTMLReport, Tabby
iris = sns.load_dataset("iris")
# section 1: prepare data
data_summary = iris.head(5).style.format(precision=2)
data_summary.set_table_styles(
    [
        {"selector": "th.col_heading", "props": "text-align: center;"},
        {"selector": "th.col_heading.level0", "props": "font-size: 1.5em;"},
        {"selector": "td", "props": "text-align: center; font-weight: bold;"},
    ],
    overwrite=False,
)
data_summary.set_caption("The iris dataset").set_table_styles(
    [
        {
            "selector": "caption",
            "props": "caption-side: bottom; text-align: center; font-size:1.25em;",
        }
    ],
    overwrite=False,
)
keys = [str(ele) for ele in iris["species"].unique()]
tab_spec = Tabby(keys=keys)
var_to_display = ["sepal_length", "sepal_width", "petal_length", "petal_width"]
for spec in keys:
    mask = iris["species"] == spec
    to_plot = iris.loc[mask, var_to_display]
    tab_spec.add(key=spec, obj=px.box(to_plot))
    tab_spec.add_para(
        key=spec,
        content="""Plots produced with Plotly reap all the benefits of Plotly's javascript. For example, check the
responsiveness of plots by resizing your window!""",
    )
# section 2: produce report
rep = HTMLReport(
    title="html-report: Example Report",
    default_header_size=3,
    default_section_width="70%",
)
rep.add_section(id="summ")
rep.add_header(content="Data Overview", sec="summ")
rep.add_markdown(
    content="""Use of `add_section()` creates border around content later added to section.
The heading above, paragraph below and data below are added using `add_header()`, `add_para()` and `add()`,
respectively.""",
    sec="summ",
)
rep.add_para(
    content="The first 5 rows of the data to analyse:",
    sec="summ",
)
rep.add(data_summary, sec="summ")
rep.add_section(id="spec")
rep.add_header(content="Analysis of Species", sec="spec")
rep.add_markdown(
    content="Line breaks and emphasis in the descriptive content below is achieved with markdown style input.",
    sec="spec",
)
rep.add_markdown(
    content=f"""There are 3 species to analyse:
<em>{keys[0]}</em>
<em>{keys[1]}</em>
<em>{keys[2]}</em>""",
    sec="spec",
)
rep.add_para(
    content="See use of Tabby below:",
    sec="spec",
)
rep.add(obj=tab_spec, sec="spec")
output = rep.to_html()  # output as str
rep.save(
    filepath=Path(__file__).parent / "report.html", open_browser=True
)  # output as file