Google Sheets Parser

GSParse - Google Sheets Parser Library

Library for extracting data from Google Sheets by URL. Supports working with multiple worksheets in a single spreadsheet.

Features

  • Working with multiple worksheets in a single spreadsheet
  • Loading data by Google Sheets URL
  • Parsing CSV and XLSX data from buffer
  • Convenient API for working with cells and ranges
  • Data search by value or regular expression
  • Export to various formats
  • Support for XLSX and CSV formats
  • Automatic retries on loading errors
  • Export data to Python dictionaries

Installation

From PyPI

pip install gsparse

Quick Start

Basic Usage

from gsparse import GSParseClient

# Create client
client = GSParseClient()

# Load spreadsheet by URL (default format is XLSX)
url = "https://docs.google.com/spreadsheets/d/YOUR_SHEET_ID/edit"
spreadsheet = client.load_spreadsheet(url)

# Get first worksheet
worksheet = spreadsheet.get_first_worksheet()

# Work with cells
cell = worksheet.get_cell(1, 1)  # A1
print(f"Cell A1 value: {cell.value}")

# Get range
range_obj = worksheet.get_range(1, 10, 1, 3)  # A1:C10
cells = worksheet.get_cells_in_range(range_obj)

# Export to dictionary
data_dict = worksheet.get_data_as_dict()
for row in data_dict:
    print(row)

Working with Different Formats

# Load spreadsheet in XLSX format (default)
spreadsheet = client.load_spreadsheet(url, format_type="xlsx")

# Load spreadsheet in CSV format
spreadsheet = client.load_spreadsheet(url, format_type="csv")

# Load from CSV string
csv_data = """Name,Age,City
John,25,Moscow
Mary,30,St. Petersburg"""

worksheet = client.load_from_csv_string(csv_data, "My Data")

# Export to dictionary
data_dict = worksheet.get_data_as_dict()
for row in data_dict:
    print(row)
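
To make the shape of the exported data concrete, here is a minimal standard-library sketch of how CSV text can be turned into one dictionary per row. It is not the library's implementation: the helper name csv_to_row_dicts is hypothetical, and treating headers_row as a 1-based row number is an assumption.

```python
import csv
import io


def csv_to_row_dicts(csv_text, headers_row=1):
    """Turn CSV text into a list of dicts keyed by the header row.

    headers_row is assumed to be 1-based (row 1 is the first line).
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    if len(rows) < headers_row:
        return []  # no header row available, so nothing to export
    headers = rows[headers_row - 1]
    # Every row below the header row becomes one dictionary.
    return [dict(zip(headers, row)) for row in rows[headers_row:]]


csv_data = """Name,Age,City
John,25,Moscow
Mary,30,St. Petersburg"""

for row in csv_to_row_dicts(csv_data):
    print(row)
```

All values stay strings in this sketch; whether the library converts numeric cells such as Age is not stated here.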

Data Search

# Search by value
found_cells = client.find_data(url, "Moscow")

# Search by regular expression
pattern_cells = client.find_by_pattern(url, r"^\d+$")  # numbers only
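
The idea behind both searches can be sketched on a plain list of rows. The function below is a hypothetical stand-in (search_grid is not part of the library) and assumes 1-based row and column numbers, as in get_cell(1, 1) for A1.

```python
import re


def search_grid(grid, pattern):
    """Return (row, column, value) for every cell whose text matches pattern.

    Rows and columns are numbered from 1.
    """
    regex = re.compile(pattern)
    matches = []
    for row_idx, row in enumerate(grid, start=1):
        for col_idx, value in enumerate(row, start=1):
            if regex.search(str(value)):
                matches.append((row_idx, col_idx, value))
    return matches


grid = [
    ["Name", "Age", "City"],
    ["John", "25", "Moscow"],
    ["Mary", "30", "St. Petersburg"],
]

digits_only = search_grid(grid, r"^\d+$")              # cells made of digits only
exact_value = search_grid(grid, "^" + re.escape("Moscow") + "$")  # exact-value search
print(digits_only)
print(exact_value)
```

An exact-value search is expressed here as an anchored, escaped regular expression, so one code path serves both use cases.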

API Reference

GSParseClient

Main class for working with the library.

Methods

  • load_spreadsheet(url, format_type="xlsx") - Loads the entire spreadsheet
  • load_worksheet(url, worksheet_name, format_type="xlsx") - Loads a specific worksheet by name
  • load_from_csv_string(csv_string, worksheet_name) - Builds a worksheet from a CSV string
  • find_data(url, value) - Searches cells by value
  • find_by_pattern(url, pattern) - Searches cells by regular expression
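
Loading by URL implies mapping the browser link to a downloadable file. The sketch below shows one way this could work, using Google's /export?format=... endpoint. Whether the library builds its download URL this way is an assumption, and the names extract_sheet_id and build_export_url are hypothetical. This kind of unauthenticated download generally requires the sheet to be readable by anyone with the link.

```python
import re
from typing import Optional
from urllib.parse import urlencode

# The spreadsheet ID is the path segment that follows /spreadsheets/d/.
_SHEET_ID_RE = re.compile(r"/spreadsheets/d/([a-zA-Z0-9_-]+)")


def extract_sheet_id(url: str) -> str:
    """Pull the spreadsheet ID out of a Google Sheets URL."""
    match = _SHEET_ID_RE.search(url)
    if match is None:
        raise ValueError(f"Not a Google Sheets URL: {url!r}")
    return match.group(1)


def build_export_url(url: str, format_type: str = "xlsx",
                     gid: Optional[int] = None) -> str:
    """Build an export URL for the given format ("xlsx" or "csv")."""
    if format_type not in ("xlsx", "csv"):
        raise ValueError(f"Unsupported format: {format_type!r}")
    params = {"format": format_type}
    if gid is not None:
        params["gid"] = str(gid)  # selects a single worksheet tab
    sheet_id = extract_sheet_id(url)
    return f"https://docs.google.com/spreadsheets/d/{sheet_id}/export?{urlencode(params)}"


url = "https://docs.google.com/spreadsheets/d/YOUR_SHEET_ID/edit"
print(build_export_url(url))
print(build_export_url(url, format_type="csv", gid=0))
```

A CSV export carries a single worksheet, which is one plausible reason XLSX is the default format when several worksheets are needed.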

Spreadsheet

Represents a Google Sheets spreadsheet containing one or more worksheets.

Properties

  • title - Spreadsheet title
  • worksheets - List of worksheets
  • worksheet_count - Number of worksheets
  • worksheet_names - List of all worksheet names

Methods

  • get_worksheet(name) - Get worksheet by name
  • get_worksheet_by_index(index) - Get worksheet by index
  • get_first_worksheet() - Get first worksheet
  • export_to_dict(headers_row) - Export all worksheets to dictionary

Worksheet

Represents a single worksheet within a spreadsheet.

Properties

  • name - Worksheet name
  • row_count - Number of rows
  • column_count - Number of columns

Methods

  • get_cell(row, column) - Get cell
  • get_range(start_row, end_row, start_column, end_column) - Get range
  • get_cells_in_range(range_obj) - Get all cells in range
  • get_data_as_dict(headers_row) - Export to dictionary
  • find_cells_by_value(value) - Search cells by value
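
The argument order of get_range (rows first, then columns) is easy to mix up, so here is a small sketch of which coordinates such a range covers. It assumes both bounds are inclusive and 1-based, which matches the Quick Start comment that get_range(1, 10, 1, 3) is A1:C10. The function iter_range is hypothetical and not part of the library.

```python
from typing import Iterator, Tuple


def iter_range(start_row: int, end_row: int,
               start_column: int, end_column: int) -> Iterator[Tuple[int, int]]:
    """Yield (row, column) pairs row by row, with both bounds inclusive."""
    if start_row > end_row or start_column > end_column:
        raise ValueError("range start must not exceed range end")
    for row in range(start_row, end_row + 1):
        for column in range(start_column, end_column + 1):
            yield row, column


coords = list(iter_range(1, 10, 1, 3))  # same arguments as the A1:C10 example
print(len(coords))            # 10 rows x 3 columns
print(coords[0], coords[-1])  # first and last coordinate
```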

Cell

Represents a single cell in a worksheet.

Properties

  • row - Row number (1-based, as in get_cell(1, 1) for A1)
  • column - Column number (1-based)
  • value - Cell value
  • address - Cell address (A1, B2, etc.)
  • is_empty - Whether cell is empty
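
The address property relates numeric coordinates to A1 notation. The conversion can be sketched as follows, assuming 1-based row and column numbers. The helper names column_letter and cell_address are hypothetical, and the library's own implementation may differ.

```python
def column_letter(column: int) -> str:
    """Convert a 1-based column number to letters: 1 -> A, 26 -> Z, 27 -> AA."""
    if column < 1:
        raise ValueError("column numbers start at 1")
    letters = ""
    while column > 0:
        # Shift to 0-based before dividing because there is no "zero" letter.
        column, remainder = divmod(column - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


def cell_address(row: int, column: int) -> str:
    """Build an A1-style address from 1-based row and column numbers."""
    return f"{column_letter(column)}{row}"


print(cell_address(1, 1))    # A1
print(cell_address(10, 3))   # C10
print(cell_address(5, 27))   # AA5
```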

Project Structure

src/gsparse/
├── __init__.py              # Main module
├── client.py                # Main client
├── core/                    # Core entities
│   ├── cell.py              # Cell
│   ├── range.py             # Cell range
│   ├── worksheet.py         # Worksheet
│   └── spreadsheet.py       # Spreadsheet
├── downloaders/             # Downloaders
│   └── google_sheets_downloader.py
├── parsers/                 # Parsers
│   ├── base_parser.py       # Base parser
│   ├── csv_parser.py        # CSV parser
│   └── xlsx_parser.py       # XLSX parser
└── utils/                   # Utilities
    ├── url_utils.py         # URL handling
    └── data_utils.py        # Data processing

Requirements

  • Python >= 3.10
  • requests >= 2.31.0
  • chardet >= 5.2.0
  • openpyxl >= 3.1.2

Examples

The library includes examples and tests:

  • Tests: Located in the tests/ directory
    • test_client.py - Tests for GSParseClient
    • test_core.py - Tests for core entities (Cell, Range, Worksheet, Spreadsheet)
  • Example Usage: See the Quick Start section above for common use cases

Development

# Install development dependencies
pip install -e ".[dev]"

# Run tests
pytest

# Lint the source code
ruff check src/

License

This project is licensed under the MIT License - see the LICENSE file for details.

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Support

If you have any questions or issues, please open an issue on GitHub.

