Google Sheets Parser
GSParse - Google Sheets Parser Library
Library for extracting data from Google Sheets by URL. Supports working with multiple worksheets in a single spreadsheet.
Features
- Working with multiple worksheets in a single spreadsheet
- Loading data by Google Sheets URL
- Parsing CSV and XLSX data from buffer
- Convenient API for working with cells and ranges
- Data search by value or regular expression
- Export to various formats
- Support for XLSX and CSV formats
- Automatic retries on loading errors
- Export data to Python dictionaries
Installation
From PyPI
pip install gsparse
Quick Start
Basic Usage
from gsparse import GSParseClient
# Create client
client = GSParseClient()
# Load spreadsheet by URL (default format is XLSX)
url = "https://docs.google.com/spreadsheets/d/YOUR_SHEET_ID/edit"
spreadsheet = client.load_spreadsheet(url)
# Get first worksheet
worksheet = spreadsheet.get_first_worksheet()
# Work with cells
cell = worksheet.get_cell(1, 1) # A1
print(f"Cell A1 value: {cell.value}")
# Get range
range_obj = worksheet.get_range(1, 10, 1, 3) # A1:C10
cells = worksheet.get_cells_in_range(range_obj)
# Export to dictionary
data_dict = worksheet.get_data_as_dict()
for row in data_dict:
    print(row)
Working with Different Formats
# Load spreadsheet in XLSX format (default)
spreadsheet = client.load_spreadsheet(url, format_type="xlsx")
# Load spreadsheet in CSV format
spreadsheet = client.load_spreadsheet(url, format_type="csv")
# Load from CSV string
csv_data = """Name,Age,City
John,25,Moscow
Mary,30,St. Petersburg"""
worksheet = client.load_from_csv_string(csv_data, "My Data")
# Export to dictionary
data_dict = worksheet.get_data_as_dict()
for row in data_dict:
    print(row)
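Loading by URL with a given format_type presumably comes down to requesting the spreadsheet through Google's export endpoint. The sketch below shows one way to turn an edit URL into an export URL. `build_export_url` is a hypothetical helper written for illustration, not part of the gsparse API, and the library's own URL handling may differ.

```python
import re

# Hypothetical helper for illustration only; not part of the gsparse API.
_SHEET_ID_RE = re.compile(r"/spreadsheets/d/([a-zA-Z0-9_-]+)")


def build_export_url(sheet_url: str, format_type: str = "xlsx") -> str:
    """Turn a Google Sheets URL into an export URL for the given format."""
    if format_type not in ("xlsx", "csv"):
        raise ValueError(f"Unsupported format: {format_type!r}")
    match = _SHEET_ID_RE.search(sheet_url)
    if match is None:
        raise ValueError(f"Not a Google Sheets URL: {sheet_url!r}")
    sheet_id = match.group(1)
    return (
        f"https://docs.google.com/spreadsheets/d/{sheet_id}"
        f"/export?format={format_type}"
    )


print(build_export_url("https://docs.google.com/spreadsheets/d/YOUR_SHEET_ID/edit", "csv"))
```

A request to the resulting URL is likely to succeed without credentials only when the spreadsheet is shared with anyone who has the link.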
Data Search
# Search by value
found_cells = client.find_data(url, "Moscow")
# Search by regular expression
pattern_cells = client.find_by_pattern(url, r"^\d+$") # numbers only
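To make the pattern search concrete, here is a minimal sketch of regex matching over a grid of cell values using only the standard library. `find_in_grid` is a hypothetical function, and whether gsparse matches with `re.search`, `re.match`, or something else is an assumption; the anchored pattern `^\d+$` behaves the same way in either case.

```python
import re


# Hypothetical stand-in for illustration; not the gsparse implementation.
def find_in_grid(rows, pattern):
    """Return (row, column, value) for each cell whose text matches pattern.

    Rows and columns are 1-based, so (1, 1) corresponds to A1.
    """
    regex = re.compile(pattern)
    hits = []
    for r, row in enumerate(rows, start=1):
        for c, value in enumerate(row, start=1):
            if regex.search(str(value)):
                hits.append((r, c, value))
    return hits


rows = [
    ["Name", "Age", "City"],
    ["John", "25", "Moscow"],
    ["Mary", "30", "St. Petersburg"],
]
print(find_in_grid(rows, r"^\d+$"))  # [(2, 2, '25'), (3, 2, '30')]
```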
API Reference
GSParseClient
Main class for working with the library.
Methods
- load_spreadsheet(url, format_type="xlsx") - Loads entire spreadsheet
- load_worksheet(url, worksheet_name, format_type="xlsx") - Loads specific worksheet
- load_from_csv_string(csv_string, worksheet_name) - Loads from CSV string
- find_data(url, value) - Search cells by value
- find_by_pattern(url, pattern) - Search by regular expression
Spreadsheet
Represents a Google Sheets spreadsheet with multiple worksheets.
Properties
- title - Spreadsheet title
- worksheets - List of worksheets
- worksheet_count - Number of worksheets
- worksheet_names - List of all worksheet names
Methods
- get_worksheet(name) - Get worksheet by name
- get_worksheet_by_index(index) - Get worksheet by index
- get_first_worksheet() - Get first worksheet
- export_to_dict(headers_row) - Export all worksheets to dictionary
Worksheet
Represents a worksheet in a spreadsheet.
Properties
- name - Worksheet name
- row_count - Number of rows
- column_count - Number of columns
Methods
- get_cell(row, column) - Get cell
- get_range(start_row, end_row, start_column, end_column) - Get range
- get_cells_in_range(range_obj) - Get all cells in range
- get_data_as_dict(headers_row) - Export to dictionary
- find_cells_by_value(value) - Search cells by value
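The headers_row argument suggests that one row supplies the keys and each following row becomes one record. The sketch below illustrates that idea with the standard csv module. `rows_to_dicts` is a hypothetical helper; the 1-based default and the list-of-dicts return shape are assumptions, not the documented behavior of get_data_as_dict.

```python
import csv
import io


# Hypothetical helper for illustration; not the gsparse implementation.
def rows_to_dicts(rows, headers_row=1):
    """Build one dict per data row, keyed by the (1-based) header row."""
    headers = rows[headers_row - 1]
    # Rows above the header row are skipped; rows below it become records.
    return [dict(zip(headers, row)) for row in rows[headers_row:]]


csv_data = """Name,Age,City
John,25,Moscow
Mary,30,St. Petersburg"""

rows = list(csv.reader(io.StringIO(csv_data)))
for record in rows_to_dicts(rows):
    print(record)
```

Values stay strings here because csv.reader does no type conversion; a real parser may convert numbers and dates.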
Cell
Represents a cell in a spreadsheet.
Properties
- row - Row number
- column - Column number
- value - Cell value
- address - Cell address (A1, B2, etc.)
- is_empty - Whether cell is empty
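The relationship between row, column, and address can be sketched as a conversion from 1-based indices to A1 notation, consistent with get_cell(1, 1) being A1 in the Quick Start. `cell_address` is a hypothetical function shown for illustration and is not taken from the gsparse source.

```python
# Hypothetical function for illustration; not the gsparse implementation.
def cell_address(row: int, column: int) -> str:
    """Convert 1-based (row, column) indices to an A1-style address."""
    if row < 1 or column < 1:
        raise ValueError("row and column are 1-based")
    letters = ""
    while column > 0:
        # Column letters are bijective base 26: 1 -> A, 26 -> Z, 27 -> AA.
        column, remainder = divmod(column - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return f"{letters}{row}"


print(cell_address(1, 1))   # A1
print(cell_address(10, 3))  # C10
print(cell_address(2, 27))  # AA2
```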
Project Structure
src/gsparse/
├── __init__.py                      # Main module
├── client.py                        # Main client
├── core/                            # Core entities
│   ├── cell.py                      # Cell
│   ├── range.py                     # Cell range
│   ├── worksheet.py                 # Worksheet
│   └── spreadsheet.py               # Spreadsheet
├── downloaders/                     # Downloaders
│   └── google_sheets_downloader.py
├── parsers/                         # Parsers
│   ├── base_parser.py               # Base parser
│   ├── csv_parser.py                # CSV parser
│   └── xlsx_parser.py               # XLSX parser
└── utils/                           # Utilities
    ├── url_utils.py                 # URL handling
    └── data_utils.py                # Data processing
Requirements
- Python >= 3.10
- requests >= 2.31.0
- chardet >= 5.2.0
- openpyxl >= 3.1.2
Examples
The library includes comprehensive examples and tests:
- Tests: Located in the tests/ directory
  - test_client.py - Tests for GSParseClient
  - test_core.py - Tests for core entities (Cell, Range, Worksheet, Spreadsheet)
- Example Usage: See the Quick Start section above for common use cases
Development
# Install development dependencies
pip install -e ".[dev]"
# Run tests
pytest
# Code check
ruff check src/
License
This project is licensed under the MIT License - see the LICENSE file for details.
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
Support
If you have any questions or issues, please open an issue on GitHub.