A library for data summary and analysis from various formats such as CSV, API, URL, etc.
SheetBuddy
SheetBuddy is a Python library for performing exploratory data analysis (EDA), data summary, and generating comprehensive reports in Excel format. It supports reading data from CSV files, JSON files, and APIs.
Features
- Data Cleaning and Preprocessing
- Load data from CSV, JSON, and APIs
- Generate EDA reports in Excel format
- Summary statistics, null values, standard deviation, and more
- Column information, including descriptions (may not be available for all columns)
- Conditional formatting and styling for Excel sheets
- Summary Statistics
- Visualization (Correlation Matrix, Basic Mathematics)
- Data Export (Excel)
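To make the per-column statistics listed above concrete, here is a minimal standard-library sketch of how null counts, null percentages, unique values, the most frequent value, and standard deviation could be computed for a single column. The function name `summarize_column` and the convention that `None` marks a missing cell are assumptions made for illustration. This is not SheetBuddy's own implementation.

```python
import statistics
from collections import Counter


def summarize_column(values):
    """Summarize one column given as a list where None marks a missing cell.

    Hypothetical helper for illustration; not part of SheetBuddy's API.
    """
    present = [v for v in values if v is not None]
    null_count = len(values) - len(present)
    # Treat ints and floats as numeric, but exclude booleans.
    numeric = [
        v for v in present
        if isinstance(v, (int, float)) and not isinstance(v, bool)
    ]
    return {
        "null_count": null_count,
        "null_percent": round(100 * null_count / len(values), 2) if values else 0.0,
        "unique_values": len(set(present)),
        "most_frequent": Counter(present).most_common(1)[0][0] if present else None,
        # Sample standard deviation needs at least two numeric values.
        "std_dev": statistics.stdev(numeric) if len(numeric) >= 2 else None,
    }


print(summarize_column([1, 2, 2, None]))
```

The sketch uses the sample standard deviation (`statistics.stdev`); whether SheetBuddy reports the sample or population value is not stated in this description.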
New Features in SheetBuddy v2.1.0 🎉
We are excited to announce the release of SheetBuddy v2.1.0, which brings several new features and enhancements to improve your data analysis experience:
- Expanded Column Descriptions 📝:
  - Added a comprehensive dictionary for column descriptions.
- Enhanced Data Ingestion 🚀:
  - Improved methods for reading CSV, JSON, and API data.
  - Better error handling for smoother data ingestion.
- Advanced Data Summarization 📊:
  - New get_column_info method for detailed column information.
  - Enhanced summary statistics for all data types.
  - Methods to calculate null values, percentages, standard deviation, unique values, and most frequent values.
  - New get_basic_math method for basic calculations (mean, median, mode, range).
- Improved Excel Formatting and Styling ✨:
  - Consistent formatting and styling for all Excel sheets.
  - New methods for conditional formatting and adding text headings.
- Visualization Enhancements 📈:
  - New methods for histograms, correlation heatmaps, and bar charts in Excel sheets.
- Comprehensive Dataset Info Sheet 🗂️:
  - Summary sheet with dataset name, format, rows, columns, description, and data link.
- Robust Report Generation 📝:
  - Comprehensive EDA report with multiple detailed sheets.
  - Improved progress indicators and logging.
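The release notes mention a get_basic_math method covering mean, median, mode, and range, but they do not show its signature or return value. The sketch below illustrates those four calculations with Python's standard library. The function name `basic_math_sketch` and its dictionary return shape are hypothetical, and it should not be read as SheetBuddy's actual method.

```python
import statistics


def basic_math_sketch(values):
    """Return mean, median, mode(s), and range for a non-empty list of numbers.

    Hypothetical stand-in for illustration; not SheetBuddy's get_basic_math.
    """
    if not values:
        raise ValueError("values must not be empty")
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        # multimode returns every value tied for most frequent, as a list.
        "mode": statistics.multimode(values),
        "range": max(values) - min(values),
    }


print(basic_math_sketch([1, 2, 2, 5]))
```

Returning the mode as a list avoids silently picking one value when several are tied.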
Installation
You can install SheetBuddy using pip:
pip install sheetbuddy
or, to install this specific release:
pip install sheetbuddy==2.1.0
To upgrade to the latest version:
pip install sheetbuddy --upgrade
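After installing, it can help to confirm which version is active in the current environment. The following sketch uses the standard library's `importlib.metadata`; the helper name `installed_version` is an assumption for illustration, and it returns `None` when the package is not installed.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


# Prints the installed SheetBuddy version, or None if it is not installed.
print(installed_version("sheetbuddy"))
```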
Usage
Example 1: Generating an EDA and Data Summary Report from a CSV File at a URL.
from sheetbuddy import SheetBuddy
file_path_or_url = 'https://people.sc.fsu.edu/~jburkardt/data/csv/airtravel.csv'
output_file_name = 'datasummary_report.xlsx'
sb = SheetBuddy(file_path_or_url)
sb.generate_eda_report(output_file_name)
Example 2: Generating a Data Summary and EDA Report from a Local JSON File.
from sheetbuddy import SheetBuddy
file_path = 'path/to/your/data.json'
output_file_name = 'enter_your_desired_name.xlsx'
sb = SheetBuddy(file_path)
sb.generate_eda_report(output_file_name)
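If you do not have a JSON file at hand, a small one can be written with the standard library to try the example above. The sketch below assumes a list-of-records layout (one object per row); the JSON layout SheetBuddy expects is not specified in this description, so treat that as an assumption. The helper name `write_sample_json`, the file name, and the sample values are all made up for illustration.

```python
import json
from pathlib import Path


def write_sample_json(path):
    """Write a tiny list-of-records JSON file and return the records.

    The records layout and the values are illustrative assumptions.
    """
    records = [
        {"city": "Lyon", "visitors": 120},
        {"city": "Porto", "visitors": 95},
        {"city": "Turin", "visitors": None},  # a missing value to exercise null handling
    ]
    Path(path).write_text(json.dumps(records, indent=2), encoding="utf-8")
    return records


write_sample_json("sample_data.json")
```

The resulting `sample_data.json` path can then be passed as `file_path` in the example.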
Example 3: Generating a Data Summary and EDA Report from a Local CSV File.
from sheetbuddy import SheetBuddy
filename = 'your_local_path.csv'
outputfile = 'enter_your_desired_name.xlsx'
sb = SheetBuddy(filename)
sb.generate_eda_report(outputfile)
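When generating reports for several files, it can be convenient to derive the output name from the input name instead of typing it each time. The helper below is a hypothetical convenience, not part of SheetBuddy; only the `SheetBuddy(...)` constructor and `generate_eda_report(...)` call shown in the comments come from the examples above.

```python
from pathlib import Path


def report_name_for(data_path, suffix="_report"):
    """Build an .xlsx file name from the data file's base name.

    Hypothetical helper for illustration; not part of SheetBuddy.
    """
    return f"{Path(data_path).stem}{suffix}.xlsx"


print(report_name_for("data/sales.csv"))

# Usage with SheetBuddy, following the examples above:
# from sheetbuddy import SheetBuddy
# filename = 'your_local_path.csv'
# sb = SheetBuddy(filename)
# sb.generate_eda_report(report_name_for(filename))
```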
How It Works:
1. Data Loading: SheetBuddy loads data from the specified source (CSV, JSON, or API).
2. Data Analysis: It performs various data analyses, including summary statistics, null values analysis, and column descriptions.
3. Report Generation: The results are compiled into an Excel file with conditional formatting and styling for easy interpretation.
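The three stages above can be pictured as a small load, analyze, report pipeline. The sketch below mimics that flow with the standard library only. All function names are hypothetical, the analysis is reduced to counting empty cells, and the final step produces plain rows instead of a styled Excel file. It illustrates the shape of the workflow and is not SheetBuddy's internal code.

```python
import csv
import io
import json


def load_rows(text, fmt):
    """Stage 1: parse CSV or JSON text into a list of dict rows."""
    if fmt == "csv":
        return list(csv.DictReader(io.StringIO(text)))
    if fmt == "json":
        return json.loads(text)
    raise ValueError(f"unsupported format: {fmt}")


def analyze(rows):
    """Stage 2: count empty cells (None or empty string) per column."""
    columns = list(rows[0].keys()) if rows else []
    return {
        col: sum(1 for row in rows if row.get(col) in (None, ""))
        for col in columns
    }


def build_report(rows, null_counts):
    """Stage 3: assemble report rows (header first) for a spreadsheet writer."""
    report = [["column", "rows", "null_count"]]
    for col, nulls in null_counts.items():
        report.append([col, len(rows), nulls])
    return report


csv_text = "name,score\nAda,90\nBob,\n"
rows = load_rows(csv_text, "csv")
report = build_report(rows, analyze(rows))
for line in report:
    print(line)
```

Keeping the stages as separate functions means a new source format only touches the loading step, while the analysis and report steps stay unchanged.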
Contributing:
Contributions are welcome! If you have any suggestions, bug reports, or feature requests, please open an issue or submit a pull request on GitHub.
License:
SheetBuddy is licensed under the MIT License. See the LICENSE file for more details.
We hope you enjoy these new features and improvements in SheetBuddy v2.1.0! 🚀
Hashes for sheetbuddy-2.1.0-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 72f1b1ff9c9019dc78703f105fa974719eaf7e7f3f1a43dc6718dc7c6f404720
MD5 | 2a7a1da91bfc4864cedb4c7df8553199
BLAKE2b-256 | 19fe5a88d0a0b66b125b96ca6160b770030e8ebc9b219b39ab47ce084b190eb8