
nzilbb-labbcat


Client library for communicating with LaBB-CAT servers using Python.

For example:

import labbcat

# Connect to the LaBB-CAT corpus
corpus = labbcat.LabbcatView("https://labbcat.canterbury.ac.nz/demo", "demo", "demo")

# Find all tokens of a word
matches = corpus.getMatches({"orthography":"quake"})

# Get the recording of that utterance
audio = corpus.getSoundFragments(matches)

# Get Praat TextGrids for the utterances
textgrids = corpus.getFragments(
    matches, ["utterance", "word", "segment"],
    "text/praat-textgrid")

LaBB-CAT is a web-based linguistic annotation store that stores audio or video recordings, text transcripts, and other annotations.

Annotations of various types can be automatically generated or manually added.

LaBB-CAT servers are usually password-protected linguistic corpora, and can be accessed manually via a web browser, or programmatically using a client library like this one.

The current version of this library requires LaBB-CAT version 20220307.1126.

API documentation is available at https://nzilbb.github.io/labbcat-py/

Basic usage

nzilbb-labbcat is available in the Python Package Index (PyPI).

To install the module:

pip install nzilbb-labbcat

The following example shows how to:

  1. upload a transcript to LaBB-CAT,
  2. wait for the automatic annotation tasks to finish,
  3. extract the annotation labels, and
  4. delete the transcript from LaBB-CAT.

import labbcat

# Connect to the LaBB-CAT corpus
corpus = labbcat.LabbcatEdit("http://localhost:8080/labbcat", "labbcat", "labbcat")

# List the corpora on the server
corpora = corpus.getCorpusIds()

# List the transcript types
transcript_type_layer = corpus.getLayer("transcript_type")
transcript_types = transcript_type_layer["validLabels"]

# Upload a transcript
corpus_id = corpora[0]
transcript_type = next(iter(transcript_types))
taskId = corpus.newTranscript(
    "test/labbcat-py.test.txt", None, None, transcript_type, corpus_id, "test")

# Wait for the annotation generation to finish
corpus.waitForTask(taskId)
corpus.releaseTask(taskId)

# Get the "pos" layer annotations
annotations = corpus.getAnnotations("labbcat-py.test.txt", "pos")
labels = [annotation["label"] for annotation in annotations]

# Delete the transcript from the corpus
corpus.deleteTranscript("labbcat-py.test.txt")

For batch uploading and other example code, see the examples subdirectory.

Developers

To build, test, release, and document the module, the following prerequisites must be installed:

  • pip3 install twine
  • pip3 install pathlib
  • pip3 install deprecated
  • sudo apt install python3-sphinx

Unit tests

python3 -m unittest

...or for specific tests:

python3 -m unittest test.TestLabbcatAdmin

Documentation generation

cd docs
make clean
make

Publishing

rm dist/*
python3 setup.py sdist bdist_wheel
twine check dist/*
twine upload dist/*
