Python client for the Gaston API (transcription, translation and sentence search).

Gaston API Client

License: MIT

A small, typed Python client for the Gaston API: transcription, translation and full-text search of sentences within transcribed recordings.

Requires a Gaston account and an API token (see Configuration).

Installation

pip install gaston

Requires Python 3.10+.

For local development from a checkout instead:

pip install -e .

Quick start

from gaston import GastonClient

client = GastonClient(token="gapi-...")

# Who am I + remaining quota
me = client.me()
print(me.email, "files left:", me.usage.files_left)

# Transcribe a local file
result = client.transcribe("interview.mp4", lang="en", title="My interview")
print(result.id, result.state)

# Transcribe from a URL (YouTube or web)
client.transcribe_url("https://youtu.be/dQw4w9WgXcQ", lang="en")

# Translate an existing transcription
client.translate(result.id, target_lang="de")

# Speaker diarization (requires a completed translation in that language)
client.diarize(result.id, lang="de", speakers=2)

# List your media (paginated). Items are Media objects.
page = client.list_media(page=1)
print("total:", page.total, "pages:", page.pages)
for item in page:
    print(item.id, item.title, item.state, item.available_languages)

# Fetch a single media item with its sentences
media = client.get_media(result.id, lang="en")
for sentence in media.sentences:
    print(sentence.id, sentence.text, sentence.speaker)

# Full-text search across the whole library
results = client.search("climate change", max_=20)
print("total matches:", results.total)
for hit in results:
    print(hit["_sentence"]["body"], "->", hit["_highlight"]["body"])

See Search for query syntax and filtering options.
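Because list_media returns one page at a time, walking a whole library means requesting each page in turn. The following is a minimal sketch, assuming page numbers start at 1 and that page.pages is the total page count, as the Quick start suggests. iter_all_media is a hypothetical helper and not part of the gaston package.

```python
from typing import Any, Iterator


def iter_all_media(client: Any) -> Iterator[Any]:
    """Yield every media item by requesting pages 1..page.pages in order.

    `client` is anything with a list_media(page=...) method whose result is
    iterable and exposes `.pages`. Hypothetical helper for illustration only.
    """
    page_number = 1
    while True:
        page = client.list_media(page=page_number)
        yield from page
        # Stop once the last page reported by the server has been consumed.
        if page_number >= page.pages:
            break
        page_number += 1
```

With a real client this could be used as `for item in iter_all_media(client): print(item.id, item.title)`.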

Configuration

Generate an API token in the Gaston app under Settings -> API. Full endpoint documentation is available at https://www.gaston.live/en/api.

The token can be supplied directly or via an environment variable:

Argument   Environment variable   Default
token      GASTON_API_TOKEN       (required)

# Uses GASTON_API_TOKEN from the environment
with GastonClient() as client:
    ...
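It can help to fail early with a clear message when no token is available, before any request is made. The sketch below assumes that an explicit argument takes precedence over GASTON_API_TOKEN. The text above does not state the precedence, so treat it as an assumption. resolve_token is a hypothetical helper; the client performs its own lookup.

```python
import os
from typing import Mapping, Optional


def resolve_token(token: Optional[str] = None,
                  env: Mapping[str, str] = os.environ) -> str:
    """Return the explicit token if given, else GASTON_API_TOKEN from `env`.

    Assumption: an explicit argument wins over the environment variable.
    Hypothetical helper, not part of the gaston package.
    """
    value = token or env.get("GASTON_API_TOKEN")
    if not value:
        raise ValueError("No API token: pass token=... or set GASTON_API_TOKEN")
    return value
```

The result could then be passed on as `GastonClient(token=resolve_token())`.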

Timeouts

Ordinary requests use a 30s timeout. The file upload in transcribe can take minutes for large files, so it uses a separate, more generous upload_timeout (default: 10s to connect, 600s to read).

A timeout may be a single float, a (connect, read) tuple, or None to wait indefinitely.

# Customise the defaults for all calls
client = GastonClient(
    token="gapi-...",
    timeout=30,
    upload_timeout=(10, 1800),   # allow up to 30 min to upload large files
)

# Or override per call (e.g. no read timeout for a very large file)
client.transcribe("huge-recording.mp4", timeout=(10, None))
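To make the three accepted timeout shapes concrete, the sketch below expands each one into an explicit (connect, read) pair. It assumes that a single number applies to both phases, which is the convention in common Python HTTP libraries; the Gaston client may handle this differently internally. split_timeout is a hypothetical name used only for illustration.

```python
from typing import Optional, Tuple, Union

Timeout = Union[None, float, Tuple[Optional[float], Optional[float]]]


def split_timeout(timeout: Timeout) -> Tuple[Optional[float], Optional[float]]:
    """Expand a timeout value into an explicit (connect, read) pair.

    None means no limit on either phase; a single number is assumed to
    apply to both connect and read.
    """
    if timeout is None:
        return (None, None)
    if isinstance(timeout, (int, float)):
        return (float(timeout), float(timeout))
    connect, read = timeout
    return (connect, read)
```

Under these assumptions, `split_timeout((10, None))` keeps the 10s connect limit and waits indefinitely for the response body.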

Directories

folder = client.create_directory("Podcasts")
client.update_directory(folder.id, title="Podcast archive")
client.move_media(media_id="me...", dir_id=folder.id)
tree = client.directory_tree()
client.delete_directory(folder.id)

Search

client.search(query, from_=0, max_=50, dir_ids=None, lang=None) runs a full-text search over every sentence in your transcribed media.

Query syntax

The query supports a subset of the Lucene query_string syntax:

Feature            Example                 Notes
Boolean AND        cats AND dogs           both terms must appear
Boolean OR         cats OR dogs            either term
Boolean NOT        cats NOT dogs           exclude a term
Grouping           (cats OR dogs) AND vet  combine operators with parentheses
Exact phrase       "climate change"        quoted terms match as a phrase
Trailing wildcard  transcri*               matches transcribe, transcription...

Leading wildcards (*tion), field selectors, fuzzy (~), boosts (^) and ranges are not supported and are stripped server-side. Queries must be at least 3 characters.

results = client.search('(invoice OR receipt) AND "due date" NOT draft')
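When queries are assembled from user input, a small builder keeps them within the supported subset. The sketch below uses only AND, OR, NOT, grouping and quoted phrases, and enforces the 3-character minimum on the final string. Checking the length client-side is an assumption about how the server counts it, and build_query is a hypothetical helper that is not part of the package.

```python
from typing import Iterable


def _term(text: str) -> str:
    """Quote multi-word terms so they match as an exact phrase."""
    text = text.strip()
    return f'"{text}"' if " " in text else text


def build_query(all_of: Iterable[str] = (),
                any_of: Iterable[str] = (),
                none_of: Iterable[str] = ()) -> str:
    """Combine terms using only AND / OR / NOT and parenthesised groups."""
    parts = []
    any_terms = [_term(t) for t in any_of]
    if any_terms:
        group = " OR ".join(any_terms)
        # Parenthesise the alternatives so OR does not mix with AND.
        parts.append(f"({group})" if len(any_terms) > 1 else group)
    parts.extend(_term(t) for t in all_of)
    query = " AND ".join(parts)
    for t in none_of:
        query = f"{query} NOT {_term(t)}".strip()
    if len(query) < 3:
        raise ValueError("Gaston queries must be at least 3 characters long")
    return query
```

Calling `build_query(all_of=["due date"], any_of=["invoice", "receipt"], none_of=["draft"])` reproduces the query string used in the example above.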

Filtering and pagination

# Search within a single directory
client.search("budget", dir_ids=[42])

# Search across several directories
client.search("budget", dir_ids=[42, 43, 7])

# Restrict to one language, and page through results
page2 = client.search("budget", from_=50, max_=50, lang="en")
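To collect every match rather than a single page, the offset can be advanced until the reported total is reached. This is a minimal sketch, assuming from_ is a zero-based offset into the hit list (as the from_=50, max_=50 call above suggests) and that .total stays stable between requests. iter_hits is a hypothetical helper.

```python
from typing import Any, Iterator


def iter_hits(client: Any, query: str, page_size: int = 50,
              **filters: Any) -> Iterator[dict]:
    """Yield every hit for `query`, advancing from_ by the hits received.

    Extra keyword arguments (for example dir_ids or lang) are passed
    through to client.search unchanged.
    """
    offset = 0
    while True:
        results = client.search(query, from_=offset, max_=page_size, **filters)
        hits = list(results)
        yield from hits
        offset += len(hits)
        # Stop on an empty page or once the reported total has been reached.
        if not hits or offset >= results.total:
            break
```

For example, `for hit in iter_hits(client, "budget", lang="en"): ...` would walk all English matches 50 at a time.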

Reading results

search() returns a SearchResults object. Iterate it for hits, or read .total for the overall match count. Each hit is a dict with:

  • _sentence - the matched sentence plus its media metadata (id, title, duration, directory, thumbnail, file, originUrl).
  • _highlight - matched fragments with the hit terms wrapped in <hlt>...</hlt> tags.

results = client.search("climate change", max_=20)
print("total matches:", results.total)
for hit in results:
    sentence = hit["_sentence"]
    print(sentence["media"]["title"], "|", hit["_highlight"]["body"])
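The <hlt> tags are not meant for direct display, so a small post-processing step is usually needed. The sketch below extracts the highlighted terms and swaps the tags for plain-text markers. It assumes each fragment is a single string; if hit["_highlight"]["body"] turns out to be a list of fragments, apply the functions to each element. Both function names are hypothetical.

```python
import re

# Non-greedy match so adjacent highlights are captured separately.
_HLT = re.compile(r"<hlt>(.*?)</hlt>", re.DOTALL)


def highlighted_terms(fragment: str) -> list[str]:
    """Return the text of each <hlt>...</hlt> span, in order of appearance."""
    return _HLT.findall(fragment)


def render_highlight(fragment: str, start: str = "**", end: str = "**") -> str:
    """Replace <hlt> tags with plain-text markers (pass "" to strip them)."""
    return _HLT.sub(lambda m: f"{start}{m.group(1)}{end}", fragment)
```

Passing empty strings for start and end removes the tags entirely, which is convenient for logging.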

Error handling

All failures raise a subclass of GastonError:

from gaston import GastonClient, AuthenticationError, RateLimitError, NotFoundError

try:
    client.transcribe("clip.mp4")
except RateLimitError:
    print("File limit reached")
except AuthenticationError:
    print("Bad token / disabled account")
except NotFoundError as e:
    print("Not found:", e.message)

Exception            Trigger
AuthenticationError  HTTP 403, invalid token / disabled user
BadRequestError      HTTP 400, invalid parameters
NotFoundError        HTTP 404, resource not found
RateLimitError       HTTP 429, usage limit exceeded
GastonAPIError       any other API error

Every exception carries .status_code, .message, .details and the raw .payload.
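For logging, the attributes listed above can be folded into a single line. This is a minimal sketch that reads them with getattr so it also tolerates exceptions lacking some fields. describe_error is a hypothetical helper, and the output format is an arbitrary choice.

```python
from typing import Any


def describe_error(exc: Any) -> str:
    """Format .status_code, .message and .details into one log line."""
    status = getattr(exc, "status_code", None)
    # Fall back to str(exc) if the exception has no .message attribute.
    message = getattr(exc, "message", None) or str(exc)
    details = getattr(exc, "details", None)
    line = f"{type(exc).__name__} (HTTP {status}): {message}"
    if details:
        line += f" | details: {details!r}"
    return line
```

Inside an `except GastonError as e:` block this could be used as `print(describe_error(e))`; the raw .payload is left out on purpose because it may be large.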

Supported languages

from gaston import SUPPORTED_LANGUAGES, TRANSLATION_LANGUAGES

SUPPORTED_LANGUAGES lists transcription source languages; TRANSLATION_LANGUAGES lists the available translation targets.
