A simple program to automate exploratory data analysis and reporting.

Development Status
- 4 - Beta
Intended Audience
- Science/Research
License
- OSI Approved :: MIT License
Operating System
- OS Independent
Programming Language
- Python :: 3.8

Project description

`eda-report` - Automated Exploratory Data Analysis

A Python program to help automate the exploratory data analysis and reporting process.

Input data is analyzed using pandas and SciPy. Graphs are plotted using matplotlib. The results are then nicely packaged as a Word (.docx) document using python-docx.

Installation

You can install the package from PyPI using:

pip install eda-report

Basic Usage

1. Graphical User Interface

The eda-report command launches a graphical window for selecting and analyzing a CSV or Excel file:

eda-report

[Figure: screencast of the GUI]

You will be prompted to set a report title, an optional group-by variable, a graph color, and an output filename. The contents of the input file are then analyzed, and the results are saved in a Word (.docx) document.

NOTE: For help with Tk-related issues, consider visiting TkDocs.

2. Command Line Interface

To analyze a file named input.csv, just supply its path to the eda-report command:

eda-report -i input.csv

Or even:

eda-report -i input.csv -o output.docx -c cyan --title 'EDA Report'
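
The same command can also be driven from a script, for example to produce one report per file in a folder. The sketch below is only an illustration: the helper names `build_command` and `analyze_folder` are hypothetical, and it uses nothing beyond the flags shown in this section. It defaults to a dry run that prints each command, since actually running it assumes eda-report is installed and on the PATH.

```python
import shlex
import subprocess
from pathlib import Path


def build_command(infile, outfile=None, title=None, color=None, groupby=None):
    """Assemble an eda-report command line from the documented flags."""
    command = ["eda-report", "-i", str(infile)]
    if outfile is not None:
        command += ["-o", str(outfile)]
    if title is not None:
        command += ["--title", title]
    if color is not None:
        command += ["-c", color]
    if groupby is not None:
        command += ["-g", str(groupby)]
    return command


def analyze_folder(folder, dry_run=True):
    """Build (and optionally run) one command per .csv file in `folder`."""
    commands = []
    for csv_path in sorted(Path(folder).glob("*.csv")):
        command = build_command(
            csv_path,
            outfile=csv_path.with_suffix(".docx"),
            title=f"EDA Report: {csv_path.stem}",
        )
        commands.append(command)
        if dry_run:
            # Only show what would be executed.
            print(shlex.join(command))
        else:
            # Assumes the eda-report executable is available on PATH.
            subprocess.run(command, check=True)
    return commands
```

Calling `analyze_folder("data")` prints the commands for every CSV file in a hypothetical `data` folder; pass `dry_run=False` to execute them.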

For more details on the optional arguments, pass the -h or --help flag to view the help message:

eda-report -h

usage: eda-report [-h] [-i INFILE] [-o OUTFILE] [-t TITLE] [-c COLOR]
                  [-g GROUPBY]

Automatically analyze data and generate reports. A graphical user interface
will be launched if none of the optional arguments is specified.

optional arguments:
  -h, --help            show this help message and exit
  -i INFILE, --infile INFILE
                        A .csv or .xlsx file to analyze.
  -o OUTFILE, --outfile OUTFILE
                        The output name for analysis results (default: eda-
                        report.docx)
  -t TITLE, --title TITLE
                        The top level heading for the report (default:
                        Exploratory Data Analysis Report)
  -c COLOR, --color COLOR
                        The color to apply to graphs (default: cyan)
  -g GROUPBY, -T GROUPBY, --groupby GROUPBY, --target GROUPBY
                        The variable to use for grouping plotted values. An
                        integer value is treated as a column index, whereas a
                        string is treated as a column label.
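
For readers curious how an interface like the one in the help message could be declared, here is a minimal argparse sketch. It is not the package's actual source: the function names `groupby_value` and `build_parser` are hypothetical, and the rule "all digits means column index, anything else means column label" is an assumption based on the description of the group-by option above.

```python
import argparse


def groupby_value(text):
    """Treat an all-digit value as a column index, anything else as a label."""
    return int(text) if text.isdigit() else text


def build_parser():
    """Declare options mirroring the help message (illustrative only)."""
    parser = argparse.ArgumentParser(
        prog="eda-report",
        description="Automatically analyze data and generate reports.",
    )
    parser.add_argument("-i", "--infile", help="A .csv or .xlsx file to analyze.")
    parser.add_argument(
        "-o", "--outfile", default="eda-report.docx",
        help="The output name for analysis results.",
    )
    parser.add_argument(
        "-t", "--title", default="Exploratory Data Analysis Report",
        help="The top level heading for the report.",
    )
    parser.add_argument(
        "-c", "--color", default="cyan", help="The color to apply to graphs."
    )
    parser.add_argument(
        "-g", "-T", "--groupby", "--target", dest="groupby",
        type=groupby_value, default=None,
        help="The variable to use for grouping plotted values.",
    )
    return parser


args = build_parser().parse_args(["-i", "input.csv", "-g", "4"])
print(args.infile, args.outfile, args.groupby)
```

In this sketch `-g 4` yields the integer 4 (a column index), while `-g species` yields the string "species" (a column label).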

3. Interactive Mode

3.1 Analyze data

>>> import eda_report
>>> # iris_data is assumed to be a pandas DataFrame holding the iris dataset
>>> eda_report.summarize(iris_data)
                        OVERVIEW
                        ========
Numeric features: sepal_length, sepal_width, petal_length, petal_width
Categorical features: species

          Summary Statistics (Numeric features)
          -------------------------------------
              count    mean     std  min  25%   50%  75%  max  skewness  kurtosis
sepal_length  150.0  5.8433  0.8281  4.3  5.1  5.80  6.4  7.9    0.3149   -0.5521
sepal_width   150.0  3.0573  0.4359  2.0  2.8  3.00  3.3  4.4    0.3190    0.2282
petal_length  150.0  3.7580  1.7653  1.0  1.6  4.35  5.1  6.9   -0.2749   -1.4021
petal_width   150.0  1.1993  0.7622  0.1  0.3  1.30  1.8  2.5   -0.1030   -1.3406

          Summary Statistics (Categorical features)
          -----------------------------------------
        count unique     top freq relative freq
species   150      3  setosa   50        33.33%

          Pearson's Correlation (Top 20)
          ------------------------------
petal_length & petal_width --> very strong positive correlation (0.96)
sepal_length & petal_length --> very strong positive correlation (0.87)
sepal_length & petal_width --> very strong positive correlation (0.82)
sepal_width & petal_length --> moderate negative correlation (-0.43)
sepal_width & petal_width --> weak negative correlation (-0.37)
sepal_length & sepal_width --> very weak negative correlation (-0.12)
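
The verbal labels in the correlation summary can be reproduced with a simple banding rule. The sketch below is an assumption, not the package's documented logic: the thresholds (0.2, 0.4, 0.6, 0.8) are a commonly used scale that happens to agree with the six lines of output above, and the function names `pearson_r` and `describe_correlation` are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson's correlation coefficient for two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def describe_correlation(r):
    """Map a coefficient to a phrase, using assumed strength thresholds."""
    if r == 0:
        return "no correlation (0.00)"
    bands = [(0.8, "very strong"), (0.6, "strong"), (0.4, "moderate"), (0.2, "weak")]
    magnitude = abs(r)
    strength = next(
        (label for threshold, label in bands if magnitude >= threshold),
        "very weak",
    )
    direction = "positive" if r > 0 else "negative"
    return f"{strength} {direction} correlation ({r:.2f})"


for value in (0.96, -0.43, -0.37, -0.12):
    print(describe_correlation(value))
```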

3.2 Plot statistical graphs

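>>> import eda_report.plotting as ep
>>> # mpg_data is assumed to be a pandas DataFrame with these two columns
>>> fig = ep.regression_plot(mpg_data["acceleration"], mpg_data["horsepower"],
...                          labels=("Acceleration", "Horsepower"))
>>> fig.savefig("regression-plot.png")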

3.3 Generate a report

>>> eda_report.get_word_report(iris_data)
Analyze variables:  100%|███████████████████████████████████| 5/5
Plot variables:     100%|███████████████████████████████████| 5/5
Bivariate analysis: 100%|███████████████████████████████████| 6/6 pairs.
[INFO 16:14:53.648] Done. Results saved as 'eda-report.docx'
<eda_report.document.ReportDocument object at 0x7f196753bd60>

Visit the official documentation for more info.

Release history

2.8.1

Aug 19, 2023

2.8.0

Apr 26, 2023

2.7.3 (this version)

Dec 5, 2022

2.7.2

Oct 31, 2022

2.7.1

Oct 24, 2022

2.7.0

Oct 6, 2022

2.6.0

Jul 12, 2022

2.5.1

May 29, 2022

2.5.0

Apr 22, 2022

2.4.1

Mar 16, 2022

2.4.0

Mar 8, 2022

2.3.1

Jan 25, 2022

2.3.0

Jan 18, 2022

2.2.4

Dec 14, 2021

2.2.3

Nov 7, 2021

2.2.2

Oct 5, 2021

2.2.1

Sep 20, 2021

2.2.0

Sep 10, 2021

2.1.0

Aug 20, 2021

2.0.0

Jul 28, 2021

2.0.0rc0 pre-release

Jul 27, 2021

1.6.2

Jun 29, 2021

1.6.1

Jun 22, 2021

1.6.0

Jun 14, 2021

1.5.0

Jun 8, 2021

1.4.0

Jun 4, 2021

1.4.0rc0 pre-release

Jun 4, 2021

1.4.0b0 pre-release

Jun 3, 2021

1.3.2

May 16, 2021

1.3.2rc0 pre-release

May 16, 2021

1.3.1

Apr 26, 2021

1.3.1rc0 pre-release

Apr 25, 2021

1.3.0

Apr 24, 2021

1.3.0rc0 pre-release

Apr 24, 2021

1.3.0b0 pre-release

Apr 24, 2021

1.3.0a0 pre-release

Apr 24, 2021

1.2.0

Apr 2, 2021

1.2.0rc1 pre-release

Apr 2, 2021

1.2.0b1 pre-release

Apr 2, 2021

1.2.0b0 pre-release

Apr 2, 2021

1.1.3

Mar 28, 2021

1.1.3a1 pre-release

Mar 28, 2021

1.1.2

Mar 25, 2021

1.1.2rc1 pre-release

Mar 25, 2021

1.1.2rc0 pre-release

Mar 25, 2021

1.1.1

Mar 22, 2021

1.1.0

Mar 12, 2021

1.0.0

Mar 11, 2021

0.0.6

Mar 9, 2021

0.0.6b0 pre-release

Mar 9, 2021

0.0.6a0 pre-release

Mar 9, 2021

0.0.5

Mar 7, 2021

0.0.5a0 pre-release

Mar 7, 2021

0.0.4

Mar 3, 2021

0.0.3

Feb 28, 2021

0.0.2

Feb 24, 2021

0.0.1

Feb 24, 2021

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

eda_report-2.7.3.tar.gz (42.8 kB)

Uploaded Dec 5, 2022 Source

Built Distribution

eda_report-2.7.3-py3-none-any.whl (43.2 kB)

Uploaded Dec 5, 2022 Python 3

Hashes for eda_report-2.7.3.tar.gz
Algorithm	Hash digest
SHA256	`11a4c2b28b952d73a513c539dd78fad0d57488d07bcf4fc49e13868c6dc8cc91`
MD5	`c55be94488c1f355a35e071ce3cdd910`
BLAKE2b-256	`2a0cdbd6bb934ebdfec8f8d5b18fb28dbd1817c7eba5c78558fd5290f2c8d4c4`

Hashes for eda_report-2.7.3-py3-none-any.whl
Algorithm	Hash digest
SHA256	`8aa0b673c40d331993135216af2d58e3d75f24ee2d4d12bb836eae65e0d8087d`
MD5	`bb5738d6f54550816fe500ef05639981`
BLAKE2b-256	`adb4cae293a9152e9af296d8ba70a3f1f913dfddbd494f0a54b55bc9eece1e91`
