A library for summarizing and analyzing data from various sources, such as CSV files, JSON files, APIs, and URLs.

SheetBuddy

SheetBuddy is a Python library for performing exploratory data analysis (EDA), data summary, and generating comprehensive reports in Excel format. It supports reading data from CSV files, JSON files, and APIs.

Features

  • Data Cleaning and Preprocessing
  • Load data from CSV, JSON, and APIs
  • Generate EDA reports in Excel format
  • Summary statistics, null values, standard deviation, and more
  • Column information, including descriptions (may not be available for all columns)
  • Conditional formatting and styling for Excel sheets
  • Summary Statistics
  • Visualization (Correlation Matrix, Basic Mathematics)
  • Data Export (Excel)

New Features in Version 3.1.0 🚀

Outlier Detection and Visualization 📊

  • Feature: Detect outliers in numerical columns using z-score or IQR methods.
  • Implementation: New methods detect_outliers and add_outliers_plot to identify and visualize outliers with boxplots in the EDA sheet.
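
The two detection rules named above can be sketched in a few lines. This is a minimal illustration using only the Python standard library (3.8+), not SheetBuddy's own implementation: the function names `zscore_outliers` and `iqr_outliers`, the thresholds, and the sample data are all assumptions made for the example.

```python
import statistics


def zscore_outliers(values, threshold=3.0):
    """Return values whose absolute z-score exceeds the threshold."""
    mean = statistics.fmean(values)
    stdev = statistics.pstdev(values)  # population standard deviation
    if stdev == 0:
        return []  # all values identical, so nothing stands out
    return [v for v in values if abs(v - mean) / stdev > threshold]


def iqr_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


# Hypothetical sample: eight ordinary readings and one extreme value.
data = [10, 12, 11, 13, 12, 11, 10, 12, 95]
print(iqr_outliers(data))                    # [95]
print(zscore_outliers(data, threshold=2.0))  # [95]
print(zscore_outliers(data))                 # [] at the default threshold of 3.0
```

On a small sample, a single extreme value inflates the standard deviation, so the z-score rule at a threshold of 3.0 can miss it while the IQR rule still flags it. That is one reason to have both methods available.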

Enhanced EDA Sheet Visualizations 🖼️

  • Feature: Comprehensive visualizations in the EDA sheet:
    • Histograms for numerical columns.
    • Boxplots for visualizing outliers.
    • Correlation heatmaps for understanding relationships.
  • Implementation: Integrated methods to create these visualizations in the EDA sheet.

Custom Text Headings for Visualizations 📝

  • Feature: Descriptive titles for each visualization section to improve readability.
  • Implementation: add_text_heading method to add custom text headings to each visualization.

Structured Dataset Summary 🗂️

  • Feature: New Dataset Info sheet with a summary of the dataset, including name, format, number of rows and columns, description, and data link.
  • Implementation: add_dataset_info method to create a structured summary of the dataset.

Requirements 📦

To use SheetBuddy, ensure you have the following dependencies:

pandas==1.3.3
requests==2.26.0
openpyxl==3.0.9
tqdm==4.62.3
matplotlib==3.4.3
seaborn==0.11.2
scipy==1.7.1
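
To see how an existing environment compares with the list above, a short check along these lines may help. It is a sketch that relies only on the standard-library `importlib.metadata` module (Python 3.8+); the `PINNED` mapping simply restates the versions listed above, and `check_dependencies` is a hypothetical helper, not part of SheetBuddy.

```python
from importlib import metadata

# Versions restated from the requirements list above.
PINNED = {
    "pandas": "1.3.3",
    "requests": "2.26.0",
    "openpyxl": "3.0.9",
    "tqdm": "4.62.3",
    "matplotlib": "3.4.3",
    "seaborn": "0.11.2",
    "scipy": "1.7.1",
}


def check_dependencies(pinned=PINNED):
    """Map each package name to (installed version or None, listed version)."""
    report = {}
    for name, wanted in pinned.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[name] = (installed, wanted)
    return report


for name, (installed, wanted) in check_dependencies().items():
    status = "missing" if installed is None else installed
    print(f"{name}: installed={status}, listed={wanted}")
```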

Note 📝

This library is designed specifically for numerical data analysis. Ensure your datasets are primarily numerical to make the most of SheetBuddy's capabilities.

Enjoy the new features and improvements! 🎉

Python Version Requirements:

  • This version of SheetBuddy requires Python 3.7 or higher.

Upgrade now to leverage these powerful new features and make your data analysis even more insightful! 📈✨

Installation

You can install SheetBuddy using pip:

pip install sheetbuddy

or

pip install sheetbuddy==3.1.0

To upgrade to the latest version:

pip install sheetbuddy --upgrade

Usage

Example 1: Generating an EDA and Data Summary Report from a CSV File at a URL

from sheetbuddy import SheetBuddy 

file_path_or_url = 'https://people.sc.fsu.edu/~jburkardt/data/csv/airtravel.csv'
output_file_name = 'datasummary_report.xlsx'

sb = SheetBuddy(file_path_or_url)
sb.generate_eda_report(output_file_name)

Example 2: Generating a Data Summary and EDA Report from a Local JSON File

from sheetbuddy import SheetBuddy

file_path = 'path/to/your/data.json'
output_file_name = 'enter_your_desired_name.xlsx'

sb = SheetBuddy(file_path)
sb.generate_eda_report(output_file_name)

Example 3: Generating a Data Summary and EDA Report from a Local CSV File

from sheetbuddy import SheetBuddy

filename = 'your_local_path.csv'
outputfile = 'enter_your_desired_name.xlsx'

sb = SheetBuddy(filename)
sb.generate_eda_report(outputfile)
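
If no dataset is at hand, a small all-numeric CSV is enough to try Example 3. The sketch below writes one with the standard library; the file name, column names, and value ranges are invented for illustration, and the SheetBuddy call is left commented out so the snippet runs without the library installed.

```python
import csv
import random


def write_sample_csv(path, rows=50, seed=0):
    """Write a small, all-numeric CSV that ends with one extreme row."""
    rng = random.Random(seed)  # seeded so the file is reproducible
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["height_cm", "weight_kg", "age"])
        for _ in range(rows):
            writer.writerow([
                round(rng.gauss(170, 8), 1),
                round(rng.gauss(70, 10), 1),
                rng.randint(18, 65),
            ])
        # One extreme height so the outlier plots have something to show.
        writer.writerow([250.0, 70.0, 30])
    return path


sample_path = write_sample_csv("sample_numeric.csv")

# Then, as in Example 3:
# from sheetbuddy import SheetBuddy
# SheetBuddy(sample_path).generate_eda_report("sample_report.xlsx")
```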

How It Works:

1. Data Loading: SheetBuddy loads data from the specified source (CSV, JSON, or API).

2. Data Analysis: It performs various data analyses, including summary statistics, null values analysis, and column descriptions.

3. Report Generation: The results are compiled into an Excel file with conditional formatting and styling for easy interpretation.
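
Steps 1 and 2 can be pictured with a rough standard-library sketch. It is not SheetBuddy's internal code: the `summarize` function, the choice of statistics, and the inline sample are assumptions for illustration, and step 3 (writing the styled Excel file) is left out because it depends on openpyxl.

```python
import csv
import io
import statistics


def summarize(csv_text):
    """Return per-column count, null count, mean, and standard deviation."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))  # step 1: load
    if not rows:
        return {}
    summary = {}
    for column in rows[0].keys():  # step 2: analyze each column
        raw = [row[column] for row in rows]
        # Empty or missing cells count as nulls; the rest must be numeric.
        values = [float(v) for v in raw if v is not None and v.strip() != ""]
        summary[column] = {
            "count": len(values),
            "nulls": len(raw) - len(values),
            "mean": statistics.fmean(values) if values else None,
            "std": statistics.stdev(values) if len(values) > 1 else None,
        }
    return summary


# Hypothetical two-column sample with one missing value in column "b".
sample = "a,b\n1,10\n2,\n3,30\n"
for column, stats in summarize(sample).items():
    print(column, stats)
```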

Contributing:

Contributions are welcome! If you have any suggestions, bug reports, or feature requests, please open an issue or submit a pull request on GitHub.

License:

SheetBuddy is licensed under the MIT License. See the LICENSE file for more details.

We hope you enjoy the new features and improvements in SheetBuddy v3.1.0! 🚀
