Human-readable Microsoft Teams meeting transcripts
Teams transcript formatter
The purpose of this package is to make Microsoft Teams meeting transcripts easier to read and analyse using tools such as QualCoder.
It processes .vtt transcripts downloaded from Microsoft Teams/Stream, merges adjacent blocks from the same speaker, and outputs a clean, formatted text file. Speaker names can optionally be renamed and assigned prefixes, and the output format is customisable via a template.
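The merging step can be pictured with a minimal Python sketch. This is not the package's actual code: the `Block` class and `merge_adjacent` function are hypothetical names, and keeping the timestamp of the first block in each run is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Block:
    """Hypothetical container for one transcript block."""
    speaker: str
    speech: str
    timestamp: str  # start time of the block


def merge_adjacent(blocks: list[Block]) -> list[Block]:
    """Merge consecutive blocks that share a speaker.

    Assumption: the merged block keeps the timestamp of its first block.
    """
    merged: list[Block] = []
    for block in blocks:
        if merged and merged[-1].speaker == block.speaker:
            # Same speaker as the previous block: join the speech.
            merged[-1].speech += " " + block.speech
        else:
            # New speaker: start a new block (copied, so the input is untouched).
            merged.append(Block(block.speaker, block.speech, block.timestamp))
    return merged


cues = [
    Block("John Smith", "Hello,", "00:00:10"),
    Block("John Smith", "I am the interviewer.", "00:00:12"),
    Block("Jane Doe", "Nice.", "00:00:13"),
]
result = merge_adjacent(cues)
for block in result:
    print(block.speaker, "|", block.speech, "|", block.timestamp)
```

Two consecutive blocks from John Smith collapse into one, so the three input blocks become two output lines.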
Installation
This package is available on PyPI.
Run with uvx
No installation required — run it once-off with uvx:
uvx teams-transcript-formatter transcript.vtt
Install with pip or uv
Install from PyPI:
pip install teams-transcript-formatter
# or
uv tool install teams-transcript-formatter
After installation, teams-transcript-formatter will be available on your PATH:
teams-transcript-formatter transcript.vtt
From source
If you want to make changes to the source code, clone the repository and install it in editable mode:
git clone https://github.com/jmarshrossney/teams-transcript-formatter
cd teams-transcript-formatter
uv sync
Usage
Command-line tool
The teams-transcript-formatter script takes one or more .vtt files and prints the formatted output to stdout. To save the output to .txt files instead (with the naming convention <original_stem>_formatted.txt), use the -o flag to specify an output directory.
# Basic: keep original speaker names, default formatting
teams-transcript-formatter transcript.vtt
# Rename speakers (e.g. for an interview)
teams-transcript-formatter \
--rename "John Smith=Interviewer" --rename "Jane Doe=Student" \
--prefix "Interviewer=> " --prefix "Student=< " \
transcript.vtt
# Custom output format
teams-transcript-formatter \
--rename "John Smith=JS" --rename "Jane Doe=JD" \
--template "{speaker}: {speech} [{timestamp}]" \
transcript.vtt
Run teams-transcript-formatter -h for full guidance, including shell completion.
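For illustration, the output naming convention and the overwrite check described by the --force flag could look roughly like the following Python sketch. The helper names `output_path` and `write_output` are hypothetical and are not taken from the package.

```python
from pathlib import Path


def output_path(vtt_file: str, output_dir: str) -> Path:
    """Build <output_dir>/<original_stem>_formatted.txt for one input file."""
    return Path(output_dir) / f"{Path(vtt_file).stem}_formatted.txt"


def write_output(text: str, path: Path, force: bool = False) -> None:
    """Write text to path, refusing to overwrite unless force is set."""
    if path.exists() and not force:
        raise FileExistsError(f"{path} already exists; pass force=True to overwrite")
    path.write_text(text, encoding="utf-8")


# Only the stem of the input file is reused; its directory is ignored.
print(output_path("interviews/transcript.vtt", "."))
```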
Flags
| Flag | Description |
|---|---|
| --rename | Map original speaker names to display names: "OriginalName=DisplayName". Repeat for each speaker. |
| --prefix | Assign a prefix to each display name: "DisplayName=>". Repeat for each speaker. |
| --template | Python format string for output. Placeholders: {prefix}, {speaker}, {speech}, {timestamp}. |
| -o, --output | Directory to save .txt files. If not given, prints to stdout. |
| --force | Overwrite existing output files instead of refusing. |
| -q, --quiet | Suppress all non-error output. |
| --version | Show the version and exit. |
| -h, --help | Show the help message and exit. |
Examples
Say we have a Teams transcript file named transcript.vtt:
$ head -11 transcript.vtt
WEBVTT
91b3f3c3-44c6-4a8b-8c0a-add105d816bd/32-0
00:00:10.087 --> 00:00:13.130
<v John Smith>Hello, I am the interviewer.</v>
91b3f3c3-44c6-4a8b-8c0a-add105d816bd/32-1
00:00:13.130 --> 00:00:16.270
<v Jane Doe>Nice. I am the student being interviewed,
and I have many things to say.</v>
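To make the structure of such a file concrete, here is a rough Python sketch that pulls the speaker, speech and start time out of each cue with a regular expression. It is an illustration only, not the parser used by this package: the `CUE` pattern and `parse_cues` function are hypothetical, and truncating the start time to whole seconds is an assumption based on the output shown in the examples.

```python
import re

# One cue: a timing line, then a <v Speaker>...</v> voice span that may
# run over several lines.
CUE = re.compile(
    r"(?P<start>\d{2}:\d{2}:\d{2})\.\d{3} --> \d{2}:\d{2}:\d{2}\.\d{3}[^\n]*\n"
    r"<v (?P<speaker>[^>]+)>(?P<speech>.*?)</v>",
    re.DOTALL,
)


def parse_cues(vtt_text: str) -> list[tuple[str, str, str]]:
    """Return (speaker, speech, start) for each cue that has a voice span."""
    cues = []
    for match in CUE.finditer(vtt_text):
        # Collapse line breaks inside a cue into single spaces.
        speech = " ".join(match["speech"].split())
        cues.append((match["speaker"], speech, match["start"]))
    return cues


sample = """WEBVTT

91b3f3c3-44c6-4a8b-8c0a-add105d816bd/32-0
00:00:10.087 --> 00:00:13.130
<v John Smith>Hello, I am the interviewer.</v>

91b3f3c3-44c6-4a8b-8c0a-add105d816bd/32-1
00:00:13.130 --> 00:00:16.270
<v Jane Doe>Nice. I am the student being interviewed,
and I have many things to say.</v>
"""

cues = parse_cues(sample)
for speaker, speech, start in cues:
    print(f"{speaker} | {speech} | {start}")
```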
Default format
No flags — original speaker names, default template, print to stdout.
$ teams-transcript-formatter transcript.vtt
John Smith | Hello, I am the interviewer. | 00:00:10
Jane Doe | Nice. I am the student being interviewed, and I have many things to say. | 00:00:13
Rename speakers
Map original names to display names with --rename.
$ teams-transcript-formatter \
--rename "John Smith=Interviewer" --rename "Jane Doe=Student" \
-o . transcript.vtt
$ head -3 transcript_formatted.txt
Interviewer | Hello, I am the interviewer. | 00:00:10
Student | Nice. I am the student being interviewed, and I have many things to say. | 00:00:13
Add prefixes
Combine --rename with --prefix to visually distinguish speakers. Prefixes are keyed on the display name (after renaming).
$ teams-transcript-formatter \
--rename "John Smith=Interviewer" --rename "Jane Doe=Student" \
--prefix "Interviewer=> " --prefix "Student=< " \
-o . transcript.vtt
$ head -3 transcript_formatted.txt
> Interviewer | Hello, I am the interviewer. | 00:00:10
< Student | Nice. I am the student being interviewed, and I have many things to say. | 00:00:13
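The lookup order described above (rename first, then look up the prefix by display name) can be sketched in Python as two dictionary lookups. The `renames` and `prefixes` dictionaries and the `display` function are hypothetical, and falling back to the original name and an empty prefix for unlisted speakers is an assumption.

```python
renames = {"John Smith": "Interviewer", "Jane Doe": "Student"}
prefixes = {"Interviewer": "> ", "Student": "< "}


def display(original_name: str) -> tuple[str, str]:
    """Return (prefix, display_name) for a speaker's original name."""
    # Step 1: rename. Assumption: unlisted speakers keep their original name.
    name = renames.get(original_name, original_name)
    # Step 2: the prefix is keyed on the display name, not the original one.
    prefix = prefixes.get(name, "")
    return prefix, name


print(display("John Smith"))
print(display("Someone Else"))
```

Because the prefix is keyed on the display name, `--prefix "John Smith=> "` would not match once John Smith has been renamed to Interviewer.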
Custom output template
Control the output format with --template. Available placeholders: {prefix}, {speaker}, {speech}, {timestamp}.
$ teams-transcript-formatter \
--rename "John Smith=JS" --rename "Jane Doe=JD" \
--template "[{timestamp}] {speaker}: {speech}" \
-o . transcript.vtt
$ head -3 transcript_formatted.txt
[00:00:10] JS: Hello, I am the interviewer.
[00:00:13] JD: Nice. I am the student being interviewed, and I have many things to say.
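Since the template is a Python format string, its behaviour can be tried out directly with `str.format`. The snippet below is a sketch of that idea rather than the package's code, and the `fields` dictionary is a hypothetical stand-in for one transcript block.

```python
template = "[{timestamp}] {speaker}: {speech}"

# Hypothetical values for one block; {prefix} is supplied but unused here.
fields = {
    "prefix": "",
    "speaker": "JS",
    "speech": "Hello, I am the interviewer.",
    "timestamp": "00:00:10",
}

# Placeholders missing from the template are simply ignored by str.format.
line = template.format(**fields)
print(line)

# A placeholder that is not among the supplied fields raises KeyError.
try:
    "{speaker} ({role})".format(**fields)
except KeyError as err:
    print("unknown placeholder:", err)
```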
Full customisation
All three flags together — rename, prefix, and template.
$ teams-transcript-formatter \
--rename "John Smith=Interviewer" --rename "Jane Doe=Student" \
--prefix "Interviewer=> " --prefix "Student=< " \
--template "{prefix}{speaker}: {speech} [{timestamp}]" \
-o . transcript.vtt
$ head -3 transcript_formatted.txt
> Interviewer: Hello, I am the interviewer. [00:00:10]
< Student: Nice. I am the student being interviewed, and I have many things to say. [00:00:13]
Selective prefixes
Pass an empty value to --prefix to suppress the prefix for a given speaker.
$ teams-transcript-formatter \
--rename "John Smith=Interviewer" --rename "Jane Doe=Student" \
--prefix "Interviewer=> " --prefix "Student=" \
-o . transcript.vtt
$ head -3 transcript_formatted.txt
> Interviewer | Hello, I am the interviewer. | 00:00:10
Student | Nice. I am the student being interviewed, and I have many things to say. | 00:00:13
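One way to read such "Key=Value" arguments, including the empty value, is to split on the first equals sign only. This is a sketch under that assumption; `parse_pair` is a hypothetical helper, not the package's own parser.

```python
def parse_pair(arg: str) -> tuple[str, str]:
    """Split "Key=Value" on the first "=", allowing an empty value."""
    key, sep, value = arg.partition("=")
    if not sep or not key:
        raise ValueError(f"expected KEY=VALUE, got {arg!r}")
    return key, value


# Trailing spaces in the value are kept; "Student=" yields an empty prefix.
prefixes = dict(parse_pair(arg) for arg in ["Interviewer=> ", "Student="])
print(prefixes)
```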
Privacy
Speaker names can be replaced using the --rename flag. All other redactions of sensitive and identifiable information must be performed before running this script.
Tip: the auto-generated transcripts can be edited in-situ using the Microsoft Stream app.
Remember to delete the original transcripts after running this script!
Roadmap & contributing
There are some fairly simple additions that would make this more generally useful:
- Handle meetings with >2 participants
- User can configure how names are handled
- Configure the output format, e.g. using a template
- Handle Zoom meetings
- Output to different file formats (realistically, .docx would probably be the most useful to folks).
Suggestions for improvements are welcome. Contributions even more so! Just open an issue or pull request.