Skip to main content

A Tabular Helper API library that wraps Google Sheets and Docs with a typed, consistent interface.

Project description

tha-google-runner

CI

A Tabular Helper API library that wraps Google Sheets, Docs, Drive, Slides, and Gmail with a typed, consistent interface.

Install

pip install tha-google-runner

Authentication setup

tha-google-runner uses your personal Google account — not a service account. There are two ways to authenticate. Option 1 is recommended for most users since it works with private files and doesn't require any additional tooling.

Cost note: This package is free and open source. The Google APIs it uses are also free for normal scripting workloads — Google provides a generous free tier that the vast majority of users will never exceed. Google Cloud Console may ask for a credit card when you first create a project to verify your identity, but Google does not charge you for the APIs used here. Any billing questions are between you and Google — not this package.

Option 1 — OAuth2 client secrets (recommended)

This is the standard path for accessing your own private Google files.

Step 1 — Create a Google Cloud project

  1. Go to console.cloud.google.com
  2. Click the project dropdown → New Project → give it any name → Create

Step 2 — Enable the required APIs

In your new project, go to APIs & ServicesEnable APIs and Services and enable:

  • Google Sheets API
  • Google Drive API
  • Google Docs API (only needed if you use ThaDocs)
  • Google Slides API (only needed if you use ThaSlides)
  • Gmail API (only needed if you use ThaGmail)

Step 3 — Create OAuth2 credentials

  1. Go to APIs & ServicesCredentialsCreate CredentialsOAuth 2.0 Client ID
  2. If prompted, configure the OAuth consent screen first:
    • User type: External → fill in app name and your email → save
  3. Application type: Desktop app → give it a name → Create
  4. Click Download JSON and save the file (e.g., client_secrets.json)

Step 4 — Run the auth setup CLI

tha-google-init --credentials client_secrets.json

This opens a browser window for you to grant access. After that, the token is cached at ~/.config/tha-google-runner/token.json and no browser is needed on subsequent runs.

You can also pass the credentials file directly at runtime:

sheets = ThaSheets(credentials_file="client_secrets.json")

Re-authentication note: If you add ThaGmail to an existing setup, the Gmail scope was not included in your previous token. Run tha-google-init again (or pass credentials_file= on first use) to grant the new scope and refresh your cached token.

Option 2 — Application Default Credentials (ADC)

Use this only if you're accessing publicly shared files or working in a Google Cloud environment (e.g., Cloud Run, Vertex AI). ADC via gcloud does not grant access to your personal private Drive/Sheets/Docs/Gmail.

gcloud auth application-default login

A browser window opens, credentials are saved to your machine, and any Tha* class works with no arguments. Requires the Google Cloud SDK.


Quick start

ThaSheets

from tha_google_runner import ThaSheets

sheets = ThaSheets()  # uses ADC; or pass credentials_file="client_secrets.json"

# Read all rows (first row is headers)
rows = sheets.read(spreadsheet_id="your-spreadsheet-id")

# Append new rows (writes headers automatically if the sheet is empty)
sheets.append_rows(
    [{"name": "Alice", "score": 95}, {"name": "Bob", "score": 82}],
    spreadsheet_id="your-spreadsheet-id",
)

# Append using raw lists — header row auto-detected and dropped if it matches the sheet
sheets.append_rows(
    [["name", "score"], ["Alice", 95]],
    spreadsheet_id="your-spreadsheet-id",
)

# Overwrite the entire sheet
sheets.update_rows(
    [{"name": "Alice", "score": 95}],
    spreadsheet_id="your-spreadsheet-id",
)

# Upsert by key — inserts new rows, updates existing ones
sheets.upsert_rows(
    [{"id": "1", "name": "Alice", "score": 99}],
    key="id",
    spreadsheet_id="your-spreadsheet-id",
)

# Create a new spreadsheet and get its ID
spreadsheet_id = sheets.create("My Report", rows=[{"col": "val"}])

# Clear a sheet
sheets.clear(spreadsheet_id="your-spreadsheet-id")

Finding your spreadsheet ID: It's the long string in the URL between /d/ and /edit. https://docs.google.com/spreadsheets/d/<spreadsheet-id>/edit

You can also pass url= instead of spreadsheet_id= to any method and the ID will be extracted automatically.

ThaDocs

from tha_google_runner import ThaDocs

docs = ThaDocs()  # uses ADC; or pass credentials_file="client_secrets.json"

# Read all text in a document
text = docs.read(doc_id="your-document-id")

# Append text to the end of a document
docs.append("\nNew paragraph.", doc_id="your-document-id")

# Insert text immediately after a specific string
docs.insert_after("Appendix", after="See also:", doc_id="your-document-id")

# Replace all occurrences of a string
count = docs.replace(old_text="foo", new_text="bar", doc_id="your-document-id")

Finding your document ID: It's the long string in the URL between /d/ and /edit. https://docs.google.com/document/d/<document-id>/edit

You can also pass url= instead of doc_id= to any method.

ThaDrive

from tha_google_runner import ThaDrive

drive = ThaDrive()  # uses ADC; or pass credentials_file="client_secrets.json"

# List all non-trashed files
files = drive.list_files()

# List files in a specific folder
files = drive.list_files(folder_id="folder-id")

# Search by name (contains by default)
results = drive.search("Q1 Report")

# Search by exact name
results = drive.search("Q1 Report.pdf", exact=True)

# Get file metadata
meta = drive.get(file_id="file-id")

# Export a Google Workspace file (Docs, Sheets, Slides) to a local format
pdf_bytes = drive.export(file_id="file-id", mime_type="application/pdf")

# Download a regular uploaded file (PDFs, images, etc.)
content = drive.download(file_id="file-id")
with open("output.pdf", "wb") as f:
    f.write(content)

You can pass url= instead of file_id= to any method.

ThaSlides

from tha_google_runner import ThaSlides

slides = ThaSlides()  # uses ADC; or pass credentials_file="client_secrets.json"

# Read all slides — returns a list of dicts, one per slide
deck = slides.read(presentation_id="your-presentation-id")
for slide in deck:
    print(slide["index"], slide["title"], slide["body"], slide["notes"])

# Get the raw API response
raw = slides.get(presentation_id="your-presentation-id")

Each dict in the returned list has:

Key Type Description
index int 0-based slide position
object_id str Google's internal slide ID
title str Title placeholder text (empty string if none)
body str Body/subtitle placeholder text (empty string if none)
notes str Speaker notes text (empty string if none)

You can also pass url= instead of presentation_id=.

ThaGmail

from tha_google_runner import ThaGmail

gmail = ThaGmail()  # uses ADC; or pass credentials_file="client_secrets.json"

# Send a plain-text email
gmail.send(to="recipient@example.com", subject="Hello", body="Hi there!")

# Send to multiple recipients with CC and BCC
gmail.send(
    to=["alice@example.com", "bob@example.com"],
    subject="Report",
    body="<h1>Done</h1>",
    cc="manager@example.com",
    html=True,
)

# List messages (returns up to 100 by default)
messages = gmail.list_messages()

# Filter with a Gmail query string
messages = gmail.list_messages(query="from:boss@example.com is:unread", max_results=20)

# Read a message by ID
msg = gmail.read(message_id=messages[0]["id"])
print(msg["subject"], msg["from_"], msg["body"])

Each dict returned by read() has:

Key Type Description
id str Message ID
thread_id str Thread ID
subject str Subject header
from_ str From header
to str To header
date str Date header
body str Plain-text body (falls back to HTML if no plain-text part)

Row input formats

All write methods (append_rows, update_rows, upsert_rows, create, add_sheet) accept either format:

list[dict] — keys are column headers:

[{"name": "Alice", "score": 95}, {"name": "Bob", "score": 82}]

list[list] — raw rows with automatic header detection:

[["name", "score"], ["Alice", 95], ["Bob", 82]]

Header detection for list[list] input:

Sheet state First row matches existing headers? Result
Has data Yes Header row dropped, rest appended as data
Has data No All rows treated as data
Empty / being replaced First row always becomes headers

API

ThaSheets(*, credentials_file=None, token_file=None)

ThaSheets(
    credentials_file: str | None = None,  # path to client_secrets.json; None uses ADC
    token_file: str | None = None,         # override token cache path (OAuth2 only)
)

The Google client is built lazily on first use and cached for the lifetime of the instance. After any write, sheets.rows is set to the data rows that were written (as list[dict]).


read(*, spreadsheet_id=None, url=None, sheet_name=None) -> list[dict]

Read all rows. The first row is treated as headers; each subsequent row becomes a dict.

rows = sheets.read(spreadsheet_id="spreadsheet-id")
rows = sheets.read(url="https://docs.google.com/spreadsheets/d/.../edit")
rows = sheets.read(spreadsheet_id="spreadsheet-id", sheet_name="Q1 Data")

append_rows(rows, *, spreadsheet_id=None, url=None, sheet_name=None) -> int

Append rows to an existing sheet. Returns the number of rows appended.

  • If the sheet is empty, the headers are written first.
  • Missing keys in a row are filled with "".
count = sheets.append_rows(
    [{"name": "Alice", "score": 95}],
    spreadsheet_id="spreadsheet-id",
)

update_rows(rows, *, spreadsheet_id=None, url=None, sheet_name=None) -> int

Overwrite all data in a sheet. Clears the sheet first, then writes headers + rows. Returns the number of rows written. Passing an empty list clears the sheet and returns 0.

count = sheets.update_rows(
    [{"name": "Alice", "score": 95}],
    spreadsheet_id="spreadsheet-id",
)

upsert_rows(rows, *, key, spreadsheet_id=None, url=None, sheet_name=None, on_conflict="update_all") -> int

Insert new rows and update existing ones matched by key. Returns the number of rows upserted.

  • key — column name (str) or list of column names for composite keys
  • New columns in incoming rows are appended to the sheet automatically
  • on_conflict controls what happens when multiple existing rows match the same key:
    • "update_all" (default) — update every matching row
    • "update_first" — update only the first match
    • "update_last" — update only the last match
    • "skip" — leave duplicates untouched
    • "raise" — raise GoogleError
count = sheets.upsert_rows(
    [{"id": "1", "name": "Alice", "score": 99}],
    key="id",
    spreadsheet_id="spreadsheet-id",
)

# Composite key
count = sheets.upsert_rows(rows, key=["year", "month"], spreadsheet_id="spreadsheet-id")

create(title, *, rows=None, sheet_name="Sheet1") -> str

Create a new spreadsheet. Returns the new spreadsheet's ID.

sid = sheets.create("My Report")
sid = sheets.create("My Report", rows=[{"col": "val"}], sheet_name="Data")

delete(*, spreadsheet_id=None, url=None) -> None

Permanently delete a spreadsheet.

sheets.delete(spreadsheet_id="spreadsheet-id")

list_sheets(*, spreadsheet_id=None, url=None) -> list[str]

Return the names of all worksheets in a spreadsheet.

names = sheets.list_sheets(spreadsheet_id="spreadsheet-id")
# ["Sheet1", "Q1 Data", "Archive"]

add_sheet(sheet_name, *, spreadsheet_id=None, url=None, rows=None) -> None

Add a new worksheet to an existing spreadsheet. Optionally write initial rows.

sheets.add_sheet("Q2 Data", spreadsheet_id="spreadsheet-id")
sheets.add_sheet("Q2 Data", spreadsheet_id="spreadsheet-id", rows=[{"col": "val"}])

delete_sheet(sheet_name, *, spreadsheet_id=None, url=None) -> None

Delete a worksheet from a spreadsheet.

sheets.delete_sheet("Archive", spreadsheet_id="spreadsheet-id")

share(email, *, spreadsheet_id=None, url=None, role="reader") -> None

Share a spreadsheet with a user. role can be "reader", "writer", or "owner".

sheets.share("colleague@example.com", spreadsheet_id="spreadsheet-id", role="writer")

clear(*, spreadsheet_id=None, url=None, sheet_name=None) -> None

Clear all data in a sheet. Resets sheets.rows to [].

sheets.clear(spreadsheet_id="spreadsheet-id")
sheets.clear(spreadsheet_id="spreadsheet-id", sheet_name="Archive")

ThaDocs API

ThaDocs(*, credentials_file=None, token_file=None)

ThaDocs(
    credentials_file: str | None = None,  # path to client_secrets.json; None uses ADC
    token_file: str | None = None,         # override token cache path (OAuth2 only)
)

The Google Docs client is built lazily on first use and cached for the lifetime of the instance. After a read(), docs.content is set to the full plain text of the document.


read(*, doc_id=None, url=None) -> str

Read the full plain text of a document. Sets docs.content.

text = docs.read(doc_id="document-id")
text = docs.read(url="https://docs.google.com/document/d/.../edit")

append(text, *, doc_id=None, url=None) -> None

Append text to the end of a document.

docs.append("\nNew section.", doc_id="document-id")

insert_after(text, *, after, doc_id=None, url=None) -> None

Insert text immediately after the first occurrence of after in the document. Raises GoogleError if after is not found.

docs.insert_after(" (updated)", after="Section 2", doc_id="document-id")

replace(*, old_text, new_text, doc_id=None, url=None, match_case=True) -> int

Replace all occurrences of old_text with new_text. Returns the number of replacements made.

count = docs.replace(old_text="draft", new_text="final", doc_id="document-id")
count = docs.replace(old_text="Draft", new_text="Final", doc_id="document-id", match_case=False)

ThaDrive API

ThaDrive(*, credentials_file=None, token_file=None)

ThaDrive(
    credentials_file: str | None = None,
    token_file: str | None = None,
)

list_files(*, folder_id=None, query=None) -> list[dict]

List all non-trashed files. Optionally filter by folder or provide a raw Drive query string.

files = drive.list_files()
files = drive.list_files(folder_id="folder-id")
files = drive.list_files(query="mimeType = 'application/pdf'")

search(name, *, exact=False) -> list[dict]

Search files by name. Uses contains by default; pass exact=True for an exact match.

results = drive.search("Q1 Report")
results = drive.search("Q1 Report.pdf", exact=True)

get(*, file_id=None, url=None) -> dict

Get metadata for a single file.

meta = drive.get(file_id="file-id")

export(*, file_id=None, url=None, mime_type="application/pdf") -> bytes

Export a Google Workspace file (Docs, Sheets, Slides) to a different format. Returns the file content as bytes.

pdf = drive.export(file_id="file-id", mime_type="application/pdf")
xlsx = drive.export(file_id="file-id", mime_type="application/vnd.openxmlformats-officedocument.spreadsheetml.sheet")

Use export for Google Workspace files. Use download for regular uploaded files (PDFs, images, etc.).


download(*, file_id=None, url=None) -> bytes

Download a regular (non-Google Workspace) file stored in Drive. Returns the file content as bytes.

content = drive.download(file_id="file-id")
with open("report.pdf", "wb") as f:
    f.write(content)

ThaSlides API

ThaSlides(*, credentials_file=None, token_file=None)

ThaSlides(
    credentials_file: str | None = None,
    token_file: str | None = None,
)

read(*, presentation_id=None, url=None) -> list[dict]

Read all slides. Returns one dict per slide with index, object_id, title, body, and notes.

deck = slides.read(presentation_id="presentation-id")
for slide in deck:
    print(f"Slide {slide['index']}: {slide['title']}")
    print(slide['body'])
    print(slide['notes'])

get(*, presentation_id=None, url=None) -> dict

Return the raw API response for the presentation.

raw = slides.get(presentation_id="presentation-id")

ThaGmail API

ThaGmail(*, credentials_file=None, token_file=None)

ThaGmail(
    credentials_file: str | None = None,
    token_file: str | None = None,
)

send(*, to, subject, body, cc=None, bcc=None, html=False) -> dict

Send an email. to, cc, and bcc each accept a string or a list of strings.

gmail.send(to="recipient@example.com", subject="Hello", body="Hi there!")
gmail.send(
    to=["alice@example.com", "bob@example.com"],
    subject="Report",
    body="<h1>Done</h1>",
    cc="manager@example.com",
    html=True,
)

Returns the sent message metadata dict from the API.


list_messages(*, query=None, max_results=100) -> list[dict]

List messages. Accepts any Gmail query string. Returns up to max_results messages (capped at 500).

messages = gmail.list_messages()
messages = gmail.list_messages(query="from:boss@example.com is:unread", max_results=20)

Each dict has id and threadId.


read(*, message_id) -> dict

Fetch and parse a message by ID.

msg = gmail.read(message_id="message-id")
print(msg["subject"], msg["from_"], msg["body"])

Returns a dict with id, thread_id, subject, from_, to, date, and body.


License

MIT

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

tha_google_runner-0.1.4.tar.gz (76.5 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

tha_google_runner-0.1.4-py3-none-any.whl (21.0 kB view details)

Uploaded Python 3

File details

Details for the file tha_google_runner-0.1.4.tar.gz.

File metadata

  • Download URL: tha_google_runner-0.1.4.tar.gz
  • Upload date:
  • Size: 76.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for tha_google_runner-0.1.4.tar.gz
Algorithm Hash digest
SHA256 d00f325ce18fb983ee067e71ab10846035726091a7b093f28a03edf5360bb495
MD5 1d007b94e8f5c0db2b7d3706d41b210f
BLAKE2b-256 8ddc50765fa9abea65283fd1e731327070fa31b38dcfb8aa07b31196611aaaca

See more details on using hashes here.

Provenance

The following attestation bundles were made for tha_google_runner-0.1.4.tar.gz:

Publisher: publish.yml on tha-guy-nate/tha-google-runner

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file tha_google_runner-0.1.4-py3-none-any.whl.

File metadata

File hashes

Hashes for tha_google_runner-0.1.4-py3-none-any.whl
Algorithm Hash digest
SHA256 865b33910b359c1ef202c333e2d98526b7415d97c178c8c7370c1984983e123f
MD5 1b0dbef3d1ed959a66a44940c7cf9a23
BLAKE2b-256 96c6ab57c961523c8621d7f13291e0424fe86fc358c704ec9f229a9acd39f715

See more details on using hashes here.

Provenance

The following attestation bundles were made for tha_google_runner-0.1.4-py3-none-any.whl:

Publisher: publish.yml on tha-guy-nate/tha-google-runner

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page