Transcribe, chunk and summarize podcasts (FastAPI + Whisper + OpenAI)

Maintainers

poloniki

Quint: transcribe | chunk | summarize

Quint makes podcasts easier to understand and navigate by providing transcripts, highlights, and concise summaries.

Main Functionality
Quickstart
License
Deploy on a GPU cloud

🚀 Main Functionality

The core API endpoints offered by Quint are listed below.

Once the API is running (see Quickstart), interactive docs are available at /docs.

🎥 YouTube Video Transcription

Provide a YouTube video ID. Quint fetches the video, extracts its audio, and returns a transcription.

GET /youtube_transcript?video_id=YOUR_YOUTUBE_VIDEO_ID

{ "transcript": "The transcribed text of the video goes here..." }
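A minimal client sketch for this endpoint, using only the Python standard library. It assumes the API is running locally on http://localhost:8083 as in the Quickstart; the helper names `build_url` and `fetch_transcript` and the timeout value are illustrative and not part of Quint.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:8083"  # assumption: API started locally via the Quickstart


def build_url(endpoint: str, **params: str) -> str:
    """Join the base URL, an endpoint path, and URL-encoded query parameters."""
    query = urllib.parse.urlencode(params)
    return f"{BASE_URL}{endpoint}?{query}" if query else f"{BASE_URL}{endpoint}"


def fetch_transcript(video_id: str, timeout: float = 600.0) -> str:
    """Call GET /youtube_transcript and return the "transcript" field."""
    url = build_url("/youtube_transcript", video_id=video_id)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)["transcript"]


# Usage (needs a running API):
#   text = fetch_transcript("YOUR_YOUTUBE_VIDEO_ID")
```

The same pattern should work for GET /youtube_summarize by changing the endpoint path and reading the "summary" field instead.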

🎙️ Transcription from Audio File

Upload an audio file and receive its transcription in text format.

POST /file_transcript

{ "transcript": "The transcribed text of the audio goes here..." }
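An upload could look roughly like the sketch below, which builds a multipart/form-data request with the standard library. The form field name `file`, the `audio/mpeg` content type, and the helper names are assumptions; check the interactive docs at /docs for the field name the endpoint actually expects.

```python
import json
import os
import uuid
import urllib.request


def encode_multipart(field, filename, payload, content_type="application/octet-stream"):
    """Encode one file as a multipart/form-data body.

    Returns (body_bytes, content_type_header).
    """
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + payload + tail, f"multipart/form-data; boundary={boundary}"


def transcribe_file(path, base_url="http://localhost:8083", field="file"):
    """Upload an audio file to POST /file_transcript and return the transcript.

    The field name "file" is a guess, not confirmed by the README.
    """
    with open(path, "rb") as handle:
        body, header = encode_multipart(
            field, os.path.basename(path), handle.read(), "audio/mpeg"
        )
    request = urllib.request.Request(
        f"{base_url}/file_transcript",
        data=body,
        headers={"Content-Type": header},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=600) as response:
        return json.load(response)["transcript"]


# Usage (needs a running API and a local audio file):
#   text = transcribe_file("episode.mp3")
```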

📜 Text Chunking

Submit a lengthy text and get it divided into semantically meaningful chunks or paragraphs.

POST /chunk
{ "body": "Your lengthy continuous text here..." }

{ "output": ["Chunk 1", "Chunk 2", "..."] }
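As a sketch, the request above can be sent from Python as a JSON POST. The `{"body": ...}` payload and the "output" field come from the example above; the JSON content type, the local base URL, and the helper names `json_post` and `chunk_text` are assumptions.

```python
import json
import urllib.request


def json_post(url, text):
    """Build a POST request whose JSON payload is {"body": <text>}."""
    payload = json.dumps({"body": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chunk_text(text, base_url="http://localhost:8083"):
    """Send text to POST /chunk and return the list under "output"."""
    request = json_post(f"{base_url}/chunk", text)
    with urllib.request.urlopen(request, timeout=120) as response:
        return json.load(response)["output"]


# Usage (needs a running API):
#   for chunk in chunk_text("Your lengthy continuous text here..."):
#       print(chunk)
```

POST /best_sentence takes the same request shape, so `json_post` can be reused with a different path.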

🌟 Highlight the Best Sentence

Submit a text and Quint returns the index of the most descriptive sentence based on the embeddings.

POST /best_sentence
{ "body": "Your raw text here..." }

{ "best_sentence_index": 5 }
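The README does not say how embeddings are turned into a ranking. One common approach, shown here purely as an illustration and not as Quint's implementation, is to pick the sentence whose embedding is closest (by cosine similarity) to the mean of all sentence embeddings. The toy two-dimensional vectors stand in for real embeddings.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def best_sentence_index(embeddings):
    """Return the index of the embedding closest to the mean embedding."""
    dims = len(embeddings[0])
    centroid = [sum(vec[d] for vec in embeddings) / len(embeddings) for d in range(dims)]
    scores = [cosine(vec, centroid) for vec in embeddings]
    return max(range(len(scores)), key=scores.__getitem__)


# Toy embeddings for three sentences; the middle one points along the mean.
toy = [[1.0, 0.0], [0.7, 0.7], [0.0, 1.0]]
print(best_sentence_index(toy))  # prints 1
```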

📝 YouTube Summary

Provide a YouTube video ID to get back a list of chunked summaries of the video.

GET /youtube_summarize?video_id=YOUR_YOUTUBE_VIDEO_ID

{ "summary": ["Summary of part 1", "Summary of part 2", "..."] }

🧑‍💻 Quickstart

Run the API locally. A CPU is fine for chunking and summarization; transcription is far faster on a GPU (see the deployment section).

git clone https://github.com/poloniki/quint.git
cd quint
make install              # pip install -e .
cp env.sample .env        # then set OPENAI_API_KEY
make run_api              # serves on http://localhost:8083

Then open http://localhost:8083/docs for the interactive API docs.

Web UI (optional)

A small Streamlit frontend lives in frontend/. With the API running:

pip install -r frontend/requirements.txt
streamlit run frontend/app.py

Set QUINT_API_URL if the API isn't on http://localhost:8083.
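How the frontend reads this variable is not shown here; a client of your own could resolve it in the same spirit, as in this sketch. The function name `resolve_api_url` and the trailing-slash handling are assumptions.

```python
import os

DEFAULT_API_URL = "http://localhost:8083"  # the Quickstart default


def resolve_api_url(environ=os.environ):
    """Return QUINT_API_URL if set and non-empty, else the local default.

    A trailing slash is stripped so endpoint paths can be appended directly.
    """
    return (environ.get("QUINT_API_URL") or DEFAULT_API_URL).rstrip("/")


print(resolve_api_url({}))  # prints http://localhost:8083
```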

📖 License

This project is licensed under the MIT License - see the LICENSE file for details.

🛜 How to deploy this API on a GPU cloud

Important note: I highly recommend the Whisper JAX implementation, which is much faster than OpenAI's original PyTorch implementation. See the Whisper JAX repo for details. The following table is reproduced from that repo:

Table 1: Average inference time in seconds for audio files of increasing length. GPU device is a single A100 40GB GPU. TPU device is a single TPU v4-8.

Implementation	OpenAI	Transformers	Whisper JAX	Whisper JAX
Framework	PyTorch	PyTorch	JAX	JAX
Backend	GPU	GPU	GPU	TPU
1 min	13.8	4.54	1.72	0.45
10 min	108.3	20.2	9.38	2.01
1 hour	1001.0	126.1	75.3	13.8
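To put the table in relative terms, the short script below computes how many times faster Whisper JAX on GPU is than the OpenAI implementation on the same hardware. The numbers are copied from Table 1; the dictionary layout and the `speedup` helper are just for illustration.

```python
# Average inference times in seconds, copied from Table 1.
TIMES = {
    "OpenAI (PyTorch, GPU)": {"1 min": 13.8, "10 min": 108.3, "1 hour": 1001.0},
    "Transformers (PyTorch, GPU)": {"1 min": 4.54, "10 min": 20.2, "1 hour": 126.1},
    "Whisper JAX (GPU)": {"1 min": 1.72, "10 min": 9.38, "1 hour": 75.3},
    "Whisper JAX (TPU)": {"1 min": 0.45, "10 min": 2.01, "1 hour": 13.8},
}


def speedup(baseline, candidate, length):
    """How many times faster `candidate` is than `baseline` for one audio length."""
    return TIMES[baseline][length] / TIMES[candidate][length]


for length in ("1 min", "10 min", "1 hour"):
    ratio = speedup("OpenAI (PyTorch, GPU)", "Whisper JAX (GPU)", length)
    print(f"{length}: {ratio:.1f}x")
# 1 min: 8.0x
# 10 min: 11.5x
# 1 hour: 13.3x
```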

Choosing a GPU cloud provider

Quint runs on any machine with an NVIDIA GPU, so you are free to use whichever cloud provider (AWS, GCP, Azure, Lambda, Paperspace, RunPod, …) or on-prem hardware you prefer. For the best price/performance on transcription, look for a workstation-class card such as the RTX 6000 Ada (Ada generation) or RTX A6000 (Ampere generation). These are typically far cheaper than A100-class GPUs while offering a comparable CUDA compute capability (8.9 and 8.6 respectively, versus 8.0 for the A100).

Whatever you pick, you only need an instance that provides:

An NVIDIA GPU (Ampere/Ada or newer recommended)
Ubuntu 22.04 (or similar) with CUDA 12 and Docker
SSH access (root or sudo)

The steps below are provider-neutral: provision the instance however your provider requires, then follow along.

1. Configure your environment

cp env.sample .env        # then edit .env
direnv reload             # or: source .env

Set the following in .env:

Variable	Used by	Purpose
`OPENAI_API_KEY`	API (summarization)	Key for the summarization step
`GPU_TYPE`	API (optional)	Set to `A100` to enable bfloat16 on the JAX backend; any other value (or unset) uses float16
`EMAIL`	deploy helper	Labels / generates your SSH key
`HOST`	deploy helper	Public IP or hostname of your GPU instance
`SSH_USER`	deploy helper	SSH login user for your image (often `root`, but `ubuntu` on AWS, your username on GCP, `azureuser` on Azure)
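Before deploying, it can help to check the environment against the table above. The sketch below is an assumption-laden illustration: treating the deploy-helper variables as required, the exact string comparison against `A100`, and the function names are not taken from Quint's code. It returns the dtype as a name rather than a JAX dtype so it runs without JAX installed.

```python
import os

# Variables from the table above. Treating the deploy-helper ones as required
# is an assumption of this sketch; GPU_TYPE is optional.
REQUIRED = ("OPENAI_API_KEY", "EMAIL", "HOST", "SSH_USER")


def missing_variables(environ=os.environ):
    """Return the names in REQUIRED that are unset or empty."""
    return [name for name in REQUIRED if not environ.get(name)]


def select_dtype_name(environ=os.environ):
    """bfloat16 when GPU_TYPE is exactly "A100", float16 for anything else."""
    return "bfloat16" if environ.get("GPU_TYPE") == "A100" else "float16"


if __name__ == "__main__":
    print("missing:", missing_variables())
    print("dtype:", select_dtype_name())
```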

2. Provision and connect to the instance

Create a GPU instance with your provider using an Ubuntu 22.04 + CUDA 12 + Docker image and your SSH public key. Once it is running, note its public IP (set it as HOST in .env) and connect:

ssh $SSH_USER@$HOST -i ~/.ssh/<your_key>

Use the login user your provider specifies for the image. root works on many bare-VM providers, but AWS Ubuntu AMIs use ubuntu, GCP uses your username, Azure uses azureuser, etc. Set it as SSH_USER in .env.

The notebook notebooks/Deploy_gpu_instance.ipynb automates the provider-neutral parts: generating an SSH key, copying the code to the host, and building/running the container.

3. Install NVIDIA drivers (if your image doesn't include them)

If the instance image already ships with working drivers, skip this. Otherwise run the bundled script on the instance and reboot to load them:

bash scripts/install_nvidia_driver.sh
sudo reboot

4. Get the code onto the instance

Clone it directly:

git clone https://github.com/poloniki/quint.git
cd quint

…or copy your local checkout up with scp (the deploy notebook does this for you).

5. Build and run

docker build -t quint --file Dockerfile.jax .
docker run --gpus all -p 80:80 --shm-size=1g --env-file .env quint

The --env-file .env flag passes OPENAI_API_KEY (and optional GPU_TYPE) into the container, so make sure .env is present on the instance. Also ensure your provider's firewall / security group allows inbound TCP on port 80 — most clouds only open SSH (port 22) by default.

Your API is now available on the instance's public IP (port 80).
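A quick way to confirm that the firewall and the container are both letting traffic through is a plain TCP check from your own machine, sketched below with the standard library. The function names and the timeout are illustrative; port 80 and the /docs path come from the steps above.

```python
import socket


def port_open(host, port=80, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def docs_url(host, port=80):
    """URL of the interactive API docs on the deployed instance."""
    return f"http://{host}:{port}/docs"


# Usage, with HOST set to the instance's public IP:
#   if port_open(HOST):
#       print("API reachable, docs at", docs_url(HOST))
#   else:
#       print("Port 80 is closed: check the firewall / security group")
```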

Release history

1.2: Jun 17, 2026 (this version)
1.1: Jun 16, 2026
