
Transcript annotator for speaker-scoped CHAT corpus correction


talk-tag

Adapter-only TalkBank CHAT morphosyntactic error annotator for .cha and .jsonl inputs.

The runtime deployment path is fixed to:

  1. Base model: unsloth/Meta-Llama-3.1-8B-Instruct-bnb-4bit
  2. Adapter: mash-mash/Llama_TalkTag_CHAT_error_annotator_adapter

No merged-model runtime path is used.
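As a rough illustration of what "adapter-only" means, the sketch below loads the base model and attaches the adapter with peft rather than loading a merged checkpoint. It is not talk-tag's actual loader: the function name load_adapter_model is hypothetical, and it assumes that accelerate (for device_map="auto") and bitsandbytes (for the 4-bit base weights) are installed alongside the runtime extras.

```python
BASE_MODEL = "unsloth/Meta-Llama-3.1-8B-Instruct-bnb-4bit"
ADAPTER = "mash-mash/Llama_TalkTag_CHAT_error_annotator_adapter"


def load_adapter_model(base_id: str = BASE_MODEL, adapter_id: str = ADAPTER):
    """Load the base model, then attach the adapter without merging weights.

    Hypothetical helper for illustration; talk-tag's own loader may differ.
    """
    # Imported lazily so this module can be imported without the runtime extras.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    # Assumption: accelerate is available so device_map="auto" can place the model.
    base = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")
    # The adapter weights stay separate from the base weights (no merge step).
    model = PeftModel.from_pretrained(base, adapter_id)
    model.eval()
    return tokenizer, model
```

Calling load_adapter_model() downloads both repositories from the Hub, so it needs the access described under "Hugging Face access".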

Install

Python requirement: >=3.10.

pip install "talk-tag[runtime]"

Runtime extras include torch, transformers, and peft.

Hugging Face access

You need Hub access to both repositories above. Set a token before first run:

export HF_TOKEN=...

If the token is missing or does not grant access, talk-tag doctor and talk-tag model pull report authentication or gated-repo errors.
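A quick way to confirm the variable is visible to Python before running the CLI is sketched below. The helper name hf_token_status is hypothetical, and it only checks that HF_TOKEN is set; it does not verify that the token actually grants access to either repository.

```python
import os


def hf_token_status(env=None) -> str:
    """Return "set" or "missing" for HF_TOKEN (hypothetical helper).

    This checks presence only; it cannot tell whether the token is valid
    or whether it has access to gated repositories.
    """
    env = os.environ if env is None else env
    token = env.get("HF_TOKEN", "").strip()
    return "set" if token else "missing"


if __name__ == "__main__":
    print(f"HF_TOKEN is {hf_token_status()}")
```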

First-run workflow

  1. Check environment:
talk-tag doctor
  2. Pull/warm model assets:
talk-tag model pull --device auto
  3. Run annotation:
talk-tag annotate \
  --input-dir ./input \
  --output-dir ./output \
  --target-speaker "*CHI" \
  --device auto
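The three steps above can also be driven from a script. The sketch below builds the same commands and runs them in order, stopping at the first failure. The function names are hypothetical, and only the flags shown in the workflow above are used.

```python
import subprocess


def first_run_commands(input_dir, output_dir, target_speaker="*CHI", device="auto"):
    """Return the three first-run commands as argument lists (hypothetical helper)."""
    return [
        ["talk-tag", "doctor"],
        ["talk-tag", "model", "pull", "--device", device],
        [
            "talk-tag", "annotate",
            "--input-dir", input_dir,
            "--output-dir", output_dir,
            "--target-speaker", target_speaker,
            "--device", device,
        ],
    ]


def run_first_run(input_dir, output_dir, target_speaker="*CHI", device="auto"):
    """Run each step in order; check=True raises if a step exits non-zero."""
    for cmd in first_run_commands(input_dir, output_dir, target_speaker, device):
        subprocess.run(cmd, check=True)
```

Passing argument lists rather than a shell string means the speaker code "*CHI" is handed to the CLI literally, with no shell glob expansion.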

Inference defaults

  • batch_size = 4
  • max_new_tokens = 128
  • max_seq_length = 512
  • max_context_chars = 1200
  • limit = 0
  • greedy decoding (do_sample = false)
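For readers who want to reproduce these settings in their own code, the defaults above can be collected in a small config object, sketched below. The class name InferenceDefaults is hypothetical and not part of talk-tag. Only max_new_tokens and do_sample correspond to keyword arguments of transformers' generate(); how talk-tag applies the other values internally is not shown in this description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InferenceDefaults:
    """Hypothetical container mirroring the documented defaults."""
    batch_size: int = 4
    max_new_tokens: int = 128
    max_seq_length: int = 512
    max_context_chars: int = 1200
    limit: int = 0
    do_sample: bool = False  # greedy decoding

    def generation_kwargs(self) -> dict:
        """Subset that maps directly onto transformers' generate() arguments."""
        return {"max_new_tokens": self.max_new_tokens, "do_sample": self.do_sample}
```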

Supported runtime inputs

  • .cha
  • .jsonl (requires --speaker-field and --text-field)

Other previously supported formats (.txt, .csv, .json, .xlsx) are rejected in adapter-only deployment mode.
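The sketch below illustrates the input rules above: an extension check for the two accepted formats, and a reader that picks one speaker's utterances out of .jsonl lines using caller-supplied field names. Both function names are hypothetical, and the exact-match comparison on the speaker value is an assumption; talk-tag's own matching logic may differ.

```python
import json
from pathlib import Path

SUPPORTED_SUFFIXES = {".cha", ".jsonl"}


def is_supported_input(path) -> bool:
    """True for .cha and .jsonl; other formats are rejected (hypothetical helper)."""
    return Path(path).suffix.lower() in SUPPORTED_SUFFIXES


def iter_speaker_utterances(lines, speaker_field, text_field, target_speaker):
    """Yield the text of each JSONL record whose speaker equals target_speaker.

    Assumption: speakers are compared by exact string equality.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        record = json.loads(line)
        if record.get(speaker_field) == target_speaker:
            yield record[text_field]
```

For example, with records that store the speaker under "speaker" and the utterance under "text", iter_speaker_utterances(open(path), "speaker", "text", "*CHI") yields only the target child's utterances.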

Colab quickstart

See examples/colab_quickstart.ipynb for a minimal setup flow.
