Client library for communicating with LaBB-CAT servers

nzilbb-labbcat

Client library for communicating with LaBB-CAT servers using Python.

For example:

import labbcat

# Connect to the LaBB-CAT corpus
corpus = labbcat.LabbcatView("https://labbcat.canterbury.ac.nz/demo", "demo", "demo")

# Find all tokens of a word
matches = corpus.getMatches({"orthography":"quake"})

# Get the recordings of those utterances
audio = corpus.getSoundFragments(matches)

# Get Praat TextGrids for the utterances
textgrids = corpus.getFragments(
    matches, ["utterance", "word","segment"],
    "text/praat-textgrid")

LaBB-CAT is a web-based linguistic annotation store that holds audio or video recordings, text transcripts, and other annotations.

Annotations of various types can be automatically generated or manually added.

LaBB-CAT servers are usually password-protected linguistic corpora, and can be accessed manually via a web browser, or programmatically using a client library like this one.
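
Because servers are usually password-protected, it can help to keep credentials out of scripts. The following is a minimal sketch, not part of the library: the environment variable names (LABBCAT_URL, LABBCAT_USER, LABBCAT_PASSWORD) and both helper functions are hypothetical. The constructor call mirrors the three-argument form used in the example above.

```python
import os


def settings_from_env(environ=None):
    """Read connection settings from environment variables.

    The variable names are hypothetical; nzilbb-labbcat does not define them.
    """
    environ = os.environ if environ is None else environ
    url = environ.get("LABBCAT_URL")
    if not url:
        raise ValueError("LABBCAT_URL is not set")
    return {
        "url": url,
        "username": environ.get("LABBCAT_USER"),
        "password": environ.get("LABBCAT_PASSWORD"),
    }


def connect(settings):
    """Open a connection with the settings returned by settings_from_env()."""
    # Imported lazily so settings_from_env() works without the package installed
    import labbcat

    return labbcat.LabbcatView(
        settings["url"], settings["username"], settings["password"])
```

With the three variables exported in the shell, connect(settings_from_env()) would return a connection equivalent to the one created with literal strings.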

The current version of this library requires LaBB-CAT version 20220307.1126.
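
LaBB-CAT version strings like the one above appear to be date and time stamps separated by a dot. Assuming that format, and assuming the requirement means "this version or later", a version check could be sketched as below. Both function names are hypothetical, and how the server's version string is obtained is not shown here.

```python
def parse_version(version):
    """Split a version string such as "20220307.1126" into a tuple of integers."""
    return tuple(int(part) for part in version.strip().split("."))


def meets_minimum(server_version, minimum="20220307.1126"):
    """Return True if server_version is the same as or later than minimum.

    Assumes both strings are dot-separated numeric stamps (date.time).
    """
    return parse_version(server_version) >= parse_version(minimum)
```

Comparing integer tuples rather than raw strings keeps the check correct if the two parts ever differ in length.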

API documentation is available at https://nzilbb.github.io/labbcat-py/

Basic usage

nzilbb-labbcat is available from the Python Package Index (PyPI).

To install the module:

pip install nzilbb-labbcat
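
To confirm that the install succeeded, the installed version can be read with the standard library (Python 3.8 or later). This is a small sketch; the function name is hypothetical, and only the distribution name comes from the install command above.

```python
from importlib import metadata


def installed_version(distribution="nzilbb-labbcat"):
    """Return the installed version string, or None if it is not installed."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None
```

Calling installed_version() in the environment where pip was run should return a version string; None suggests the package went into a different environment.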

The following example shows how to:

  1. upload a transcript to LaBB-CAT,
  2. wait for the automatic annotation tasks to finish,
  3. extract the annotation labels, and
  4. delete the transcript from LaBB-CAT.

import labbcat

# Connect to the LaBB-CAT corpus
corpus = labbcat.LabbcatEdit("http://localhost:8080/labbcat", "labbcat", "labbcat")

# List the corpora on the server
corpora = corpus.getCorpusIds()

# List the transcript types
transcript_type_layer = corpus.getLayer("transcript_type")
transcript_types = transcript_type_layer["validLabels"]

# Upload a transcript
corpus_id = corpora[0]
transcript_type = next(iter(transcript_types))
taskId = corpus.newTranscript(
    "test/labbcat-py.test.txt", None, None, transcript_type, corpus_id, "test")

# wait for the annotation generation to finish
corpus.waitForTask(taskId)
corpus.releaseTask(taskId)

# get the "POS" layer annotations
annotations = corpus.getAnnotations("labbcat-py.test.txt", "pos")
labels = [annotation["label"] for annotation in annotations]

# find all /a/ segments (phones) in the whole corpus
results = corpus.getMatches({ "segment" : "a" })

# get the start/end times of the segments
segments = corpus.getMatchAnnotations(results, "segment", offsetThreshold=50)

# get F1/F2 at the midpoint of each /a/ vowel
formantsAtMidpoint = corpus.processWithPraat(
    labbcat.praatScriptFormants(), 0.025, results, segments)

# delete the transcript from the corpus
corpus.deleteTranscript("labbcat-py.test.txt")
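
Once the labels have been extracted, they can be summarised with ordinary Python. The sketch below relies only on the "label" key used in the example above; the helper name and the sample data are made up for illustration and stand in for what getAnnotations() returns.

```python
from collections import Counter


def label_counts(annotations):
    """Count how often each label occurs in a list of annotation dicts."""
    return Counter(annotation["label"] for annotation in annotations)


# Hand-written stand-ins for annotation dicts; real annotations
# carry more keys than "label".
sample = [{"label": "NN"}, {"label": "VB"}, {"label": "NN"}]
counts = label_counts(sample)
print(counts.most_common(1))  # prints [('NN', 2)]
```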

For batch uploading and other example code, see the examples subdirectory.
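
As a rough idea of what batch uploading involves, the sketch below repeats the upload, wait, and release steps from the example above for every file in a directory. It is not the code from the examples subdirectory. The function name, the `pattern` parameter, and the choice to pass the same final argument (called `episode` here as an assumption) for every file are illustrative.

```python
from pathlib import Path


def batch_upload(corpus, directory, transcript_type, corpus_id, episode,
                 pattern="*.txt"):
    """Upload every matching transcript in directory, one at a time.

    corpus is expected to be a labbcat.LabbcatEdit connection, or any object
    with the same newTranscript/waitForTask/releaseTask methods.
    Returns the names of the uploaded files.
    """
    uploaded = []
    for path in sorted(Path(directory).glob(pattern)):
        # Same argument order as the single-file example:
        # no media file and no track suffix
        task_id = corpus.newTranscript(
            str(path), None, None, transcript_type, corpus_id, episode)
        # Let annotation generation finish before starting the next upload
        corpus.waitForTask(task_id)
        corpus.releaseTask(task_id)
        uploaded.append(path.name)
    return uploaded
```

Uploading sequentially keeps the server from running many annotation tasks at once, at the cost of a longer total run time.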

Developers

Create a virtual environment (once only):

python3 -m venv labbcat-env

Before running any of the commands below:

source labbcat-env/bin/activate

To build, test, release, and document the module, the following prerequisites are required:

  • pip3 install twine
  • pip3 install pathlib
  • pip3 install deprecated
  • pip3 install setuptools
  • sudo apt install python3-sphinx

Unit tests

python3 -m unittest

...or for specific test suites:

python3 -m unittest test.TestLabbcatAdmin

...or for specific tests:

python3 -m unittest test.TestLabbcatEdit.test_generateLayerUtterances
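
The dotted names in these commands are resolved by unittest down to a test class and then a single test method. The sketch below shows that structure with a hypothetical suite; it does not reproduce any of the project's real tests, and the class, method, and helper names are made up.

```python
import unittest


class TestExample(unittest.TestCase):
    """Hypothetical suite, for illustration only."""

    def test_addition(self):
        self.assertEqual(1 + 1, 2)

    def test_label_extraction(self):
        annotations = [{"label": "quake"}]
        self.assertEqual([a["label"] for a in annotations], ["quake"])


def run_single(test_name):
    """Run one test method by name, much as the dotted command-line form does."""
    suite = unittest.TestSuite([TestExample(test_name)])
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Here run_single("test_addition") runs only that method and returns a result object whose wasSuccessful() method reports the outcome.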

Documentation generation

cd docs
make clean
make

Publishing

rm dist/*
python3 setup.py sdist bdist_wheel
twine check dist/*
twine upload dist/*

