Skip to main content

Lightweight helper module to extract data from Google Sheets and Excel files as list of dictionaries

Project description

Google Sheets Helper

A Python ETL driver for reading and transforming Google Sheets and Excel data from Google Drive. Simplifies the process of extracting spreadsheet data and converting it to database-ready Python lists of dictionaries with comprehensive optimization features.

PyPI version Issues Last Commit License

Features

  • Google Sheets & Excel Support: Read Google Sheets and Excel files directly from Google Drive
  • Flexible Column Naming: Choose between snake_case or camelCase column conventions, with robust ASCII normalization and automatic removal of unnamed columns (e.g., from Excel/CSV exports)
  • Progress Logging for Large Downloads: Shows download progress for large Excel files
  • Pure Python Data Cleaning and Transformation: No pandas required; all utilities work with list-of-dictionaries
  • Robust Error Handling: Comprehensive error handling with specific exceptions
  • Type Hints: Full type hint support for better IDE experience

Installation

pip install google-sheets-helper

Quick Start

1. Set up credentials

Place your Google service account credentials in secrets/client_secret.json.

2. Basic usage

from google_sheets_helper import GoogleSheetsHelper, WorksheetUtils, load_client_secret

client_secret = load_client_secret()
gs_helper = GoogleSheetsHelper(client_secret)

spreadsheet_id = "your_spreadsheet_id"
worksheet_name = "your_worksheet_name"

# Load data as list of dictionaries
data = gs_helper.load_sheet_as_dict(spreadsheet_id, worksheet_name)

utils = WorksheetUtils()
data = utils.handle_missing_values(data)
data = utils.clean_text_encoding(data)
data = utils.transform_column_names(data, naming_convention="snake_case")

# Print first row
print(data[0])

# Save to CSV
import csv
with open("output.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=data[0].keys())
    writer.writeheader()
    writer.writerows(data)

Data Cleaning Pipeline

All data cleaning and transformation utilities now work with list-of-dictionaries:

from google_sheets_helper import WorksheetUtils

utils = WorksheetUtils()
data = utils.handle_missing_values(data)
data = utils.clean_text_encoding(data)
data = utils.transform_column_names(data, naming_convention="snake_case")
data = utils.remove_unnamed_and_null_columns(data)

API Reference

  • GoogleSheetsHelper: Main class for reading and transforming Google Sheets/Excel data
  • get_drive_file_metadata: Retrieve file name and MIME type from Google Drive
  • list_files_in_folder: List files in a Google Drive folder
  • load_client_secret: Loads credentials from a JSON file
  • WorksheetUtils: Pure Python utilities for cleaning and transforming list-of-dictionaries
  • Exception classes: AuthenticationError, APIError, ConfigurationError, DataProcessingError, ValidationError

Error Handling

from google_sheets_helper import (
    GoogleSheetsHelper,
    AuthenticationError,
    ValidationError,
    APIError,
    DataProcessingError,
    ConfigurationError
)

try:
    data = gs_helper.load_sheet_as_dict(spreadsheet_id, worksheet_name)
except AuthenticationError:
    # Handle credential issues
    pass
except ValidationError:
    # Handle input validation errors
    pass
except APIError:
    # Handle API errors
    pass
except DataProcessingError:
    # Handle data processing errors
    pass

Examples

Check the examples/ directory for comprehensive usage examples:

  • basic_usage.py - Simple sheet extraction and cleaning

Requirements

  • Python 3.11-3.14
  • gspread >= 6.0.0
  • google-api-python-client >= 2.0.0
  • python-calamine >= 0.4.0

Development

For development installation:

git clone https://github.com/machado000/google-sheets-helper
cd google-sheets-helper
pip install -e ".[dev]"

License

MIT License. See LICENSE file for details.

Support

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

google_sheets_helper-2.0.7.tar.gz (11.5 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

google_sheets_helper-2.0.7-py3-none-any.whl (12.1 kB view details)

Uploaded Python 3

File details

Details for the file google_sheets_helper-2.0.7.tar.gz.

File metadata

  • Download URL: google_sheets_helper-2.0.7.tar.gz
  • Upload date:
  • Size: 11.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/2.1.4 CPython/3.12.3 Linux/5.15.167.4-microsoft-standard-WSL2

File hashes

Hashes for google_sheets_helper-2.0.7.tar.gz
Algorithm Hash digest
SHA256 24973f5ed672f87c08f6f217cf46fbe2734da03ae404f081f6afe55268999f6c
MD5 88ed04afc958b08fb447715695f3b90b
BLAKE2b-256 b51afff526d329bb04e8643bd2ac6557364a0546fbdfc5ce42f625b3839ad5df

See more details on using hashes here.

File details

Details for the file google_sheets_helper-2.0.7-py3-none-any.whl.

File metadata

  • Download URL: google_sheets_helper-2.0.7-py3-none-any.whl
  • Upload date:
  • Size: 12.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/2.1.4 CPython/3.12.3 Linux/5.15.167.4-microsoft-standard-WSL2

File hashes

Hashes for google_sheets_helper-2.0.7-py3-none-any.whl
Algorithm Hash digest
SHA256 8e3083279d44c922819c1bcaa13623224059300dcab4bad61cb464aa2be51369
MD5 84fab599f9ba5c1cde8d460f44b95c52
BLAKE2b-256 0544d3441c7afd31603eec50f5d1d5af0f2e413e76f47f8d099966d26d77b539

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page