Lightweight helper module to extract data from Google Sheets and Excel files as list of dictionaries
Project description
Google Sheets Helper
A Python ETL driver for reading and transforming Google Sheets and Excel data from Google Drive. Simplifies the process of extracting spreadsheet data and converting it to database-ready Python lists of dictionaries with comprehensive optimization features.
Features
- Google Sheets & Excel Support: Read Google Sheets and Excel files directly from Google Drive
- Flexible Column Naming: Choose between snake_case or camelCase column conventions, with robust ASCII normalization and automatic removal of unnamed columns (e.g., from Excel/CSV exports)
- Progress Logging for Large Downloads: Shows download progress for large Excel files
- Pure Python Data Cleaning and Transformation: No pandas required; all utilities work with list-of-dictionaries
- Robust Error Handling: Comprehensive error handling with specific exceptions
- Type Hints: Full type hint support for better IDE experience
Installation
pip install google-sheets-helper
Quick Start
1. Set up credentials
Place your Google service account credentials in secrets/client_secret.json.
2. Basic usage
from google_sheets_helper import GoogleSheetsHelper, WorksheetUtils, load_client_secret
client_secret = load_client_secret()
gs_helper = GoogleSheetsHelper(client_secret)
spreadsheet_id = "your_spreadsheet_id"
worksheet_name = "your_worksheet_name"
# Load data as list of dictionaries
data = gs_helper.load_sheet_as_json(spreadsheet_id, worksheet_name)
utils = WorksheetUtils()
data = utils.handle_missing_values(data)
data = utils.clean_text_encoding(data)
data = utils.transform_column_names(data, naming_convention="snake_case")
# Print first row
print(data[0])
# Save to CSV
import csv
with open("output.csv", "w", newline="", encoding="utf-8") as f:
writer = csv.DictWriter(f, fieldnames=data[0].keys())
writer.writeheader()
writer.writerows(data)
Data Cleaning Pipeline
All data cleaning and transformation utilities now work with list-of-dictionaries:
from google_sheets_helper import WorksheetUtils
utils = WorksheetUtils()
data = utils.handle_missing_values(data)
data = utils.clean_text_encoding(data)
data = utils.transform_column_names(data, naming_convention="snake_case")
data = utils.remove_unnamed_and_null_columns(data)
API Reference
GoogleSheetsHelper: Main class for reading and transforming Google Sheets/Excel dataget_drive_file_metadata: Retrieve file name and MIME type from Google Drivelist_files_in_folder: List files in a Google Drive folderload_client_secret: Loads credentials from a JSON fileWorksheetUtils: Pure Python utilities for cleaning and transforming list-of-dictionaries- Exception classes:
AuthenticationError,APIError,ConfigurationError,DataProcessingError,ValidationError
Error Handling
from google_sheets_helper import (
GoogleSheetsHelper,
AuthenticationError,
ValidationError,
APIError,
DataProcessingError,
ConfigurationError
)
try:
data = gs_helper.load_sheet_as_json(spreadsheet_id, worksheet_name)
except AuthenticationError:
# Handle credential issues
pass
except ValidationError:
# Handle input validation errors
pass
except APIError:
# Handle API errors
pass
except DataProcessingError:
# Handle data processing errors
pass
Examples
Check the examples/ directory for comprehensive usage examples:
basic_usage.py- Simple sheet extraction and cleaning
Requirements
- Python 3.10-3.12
- gspread >= 6.0.0
- google-api-python-client >= 2.0.0
- python-calamine >= 0.4.0
Development
For development installation:
git clone https://github.com/machado000/google-sheets-helper
cd google-sheets-helper
pip install -e ".[dev]"
License
MIT License. See LICENSE file for details.
Support
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
Project details
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file google_sheets_helper-2.0.0.tar.gz.
File metadata
- Download URL: google_sheets_helper-2.0.0.tar.gz
- Upload date:
- Size: 11.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/2.1.4 CPython/3.12.3 Linux/5.15.167.4-microsoft-standard-WSL2
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
743a0a74a2f592e056f32b098f45a775d94320b1106ece59e396eaab546aec80
|
|
| MD5 |
309d90db70c1c7760200befedaddda52
|
|
| BLAKE2b-256 |
a2847478b0b9f7a993d858eda13835c7146c5b15d188f5e0cbda8052e9c31569
|
File details
Details for the file google_sheets_helper-2.0.0-py3-none-any.whl.
File metadata
- Download URL: google_sheets_helper-2.0.0-py3-none-any.whl
- Upload date:
- Size: 11.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/2.1.4 CPython/3.12.3 Linux/5.15.167.4-microsoft-standard-WSL2
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
6bbd32fedf671125ab4798d15f7feddd3e48fd10e1655d1363c891399626da24
|
|
| MD5 |
e5929d9b399c995ebcea3797d3a7238c
|
|
| BLAKE2b-256 |
2c2d3e3ff4c8777b28ed0b36998c00829162fcb630fcbad3585c7981facadfb9
|