Analyze, process, and extract features from many types of input data. Highly modular and customizable.

🥔 TATERS: Takes All Things, Extracts Relevant Stuff

Taters is a Python toolkit (and CLI) for getting from raw media to analysis-ready artifacts — fast, repeatable, and with predictable outputs. Point it at video, audio, or text and it helps you build end-to-end workflows: extract WAV from video, diarize and transcribe, compute embeddings, run dictionary/archetype analyses, then gather everything into tidy datasets you can model or visualize.

  • 🥔 Documentation: https://www.taters.wiki
  • 🥔 Status: early but usable; APIs will probably evolve. Pin versions if you need stability.
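If you need that stability, an exact pin in a requirements file is the simplest route (`0.1.93` is the release current at the time of writing; substitute whichever version you have validated):

```
# requirements.txt — pin an exact, tested release of Taters
taters==0.1.93
```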

What Taters is (and is not)

  • Is: A library + CLI with small, composable functions and an optional YAML pipeline runner. Predictable I/O, friendly defaults, and “do not overwrite unless asked.”
  • Is not: A single black-box pipeline. You keep control of each step and can run pieces à la carte or all at once.
  • Is not: Edible.

A tiny taste of Taters

Python

from taters import Taters
t = Taters()

# Pull audio from video
wavs = t.audio.extract_wavs_from_video(input_path="input.mp4")

# Diarize & transcribe (CSV/SRT/TXT)
diar = t.audio.diarize_with_thirdparty(audio_path=wavs[0], device="auto")

# Features (defaults write under ./features/<kind>/)
t.audio.extract_whisper_embeddings(source_wav=wavs[0], transcript_csv=diar["csv"])
t.text.analyze_with_dictionaries(csv_path=diar["csv"], dict_paths=["dictionaries/liwc"])
t.text.analyze_with_archetypes(csv_path=diar["csv"], archetype_csvs=["archetypes/Resilience.csv"])
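Downstream, outputs like these are plain CSVs, so they compose with ordinary tooling. As a minimal, library-agnostic sketch of the "tidy dataset" step (the `speaker`/`score` columns here are invented for illustration, not Taters' actual schema):

```python
import csv
from collections import defaultdict
from io import StringIO

# Toy stand-in for an analysis CSV; real Taters outputs have their own schema.
raw = """speaker,score
A,0.2
B,0.6
A,0.4
"""

# Mean-pool each speaker's scores into one row per speaker.
by_speaker = defaultdict(list)
for row in csv.DictReader(StringIO(raw)):
    by_speaker[row["speaker"]].append(float(row["score"]))

means = {spk: round(sum(vals) / len(vals), 3) for spk, vals in by_speaker.items()}
print(means)  # {'A': 0.3, 'B': 0.6}
```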

CLI

# Whisper embeddings over non-silent spans, then mean-pool
python -m taters.audio.extract_whisper_embeddings \
  --source_wav audio/session.wav --strategy nonsilent --aggregate mean

For more examples (including per-speaker splits, sentence embeddings, and end-to-end pipelines), see the Guides in the docs.


Installation

Use a fresh virtual environment. Then follow the step-by-step install guide (CPU or CUDA, FFmpeg, optional diarization extras): 👉 https://www.taters.wiki/install-guide
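As a minimal sketch of that first step (assuming a POSIX shell and that the PyPI package name is `taters`; the install guide covers platform-specific details and extras):

```shell
# Create and activate an isolated environment
python -m venv .venv
source .venv/bin/activate

# Install the package; see the install guide for CPU/CUDA builds,
# FFmpeg, and optional diarization dependencies
pip install taters
```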


Pipelines

When you are ready to batch a whole dataset, use the YAML runner to chain steps and control concurrency:

python -m taters.pipelines.run_pipeline \
  --root_dir videos --file_type video \
  --preset conversation_video \
  --workers 8 --var device=cuda
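For orientation, a pipeline file chains named steps with per-step options; the sketch below is purely illustrative (the step names, keys, and `{device}` substitution are assumptions, not the real schema — see the pipelines guide for the actual format):

```
# hypothetical-pipeline.yaml — illustrative only, not a real preset
steps:
  - name: extract_wav
    uses: audio.extract_wavs_from_video
  - name: diarize
    uses: audio.diarize_with_thirdparty
    with:
      device: "{device}"        # filled by --var device=...
  - name: dictionaries
    uses: text.analyze_with_dictionaries
    with:
      dict_paths: [dictionaries/liwc]
```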

Details, presets, and how to write your own: 👉 https://www.taters.wiki/guides/pipelines/


Contributing

Bug reports and pull requests are welcome. If you are using Taters on real projects, feedback on rough edges and missing presets is especially valuable.


License

MIT. See LICENSE for details.
