TimeCoder
timecoder.py - a pipeline that divides uploaded subtitles into blocks based on a cosine similarity threshold
The script implements two approaches:
- summarize the subtitles first, then calculate cosine similarity
- calculate cosine similarity first, divide the subtitles into blocks, then summarize each block
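The block-division step shared by both approaches can be sketched as follows. This is a minimal illustration, not the code in timecoder.py: the function names `cosine_similarity` and `split_into_blocks`, the toy 2-D vectors, and the rule "start a new block when the similarity between adjacent sentences falls below the threshold" are all assumptions made for the example. In the real pipeline the vectors would come from the sentence similarity model.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def split_into_blocks(
    embeddings: Sequence[Sequence[float]], threshold: float
) -> List[List[int]]:
    """Group consecutive sentence indices into blocks.

    Assumed rule: a new block starts whenever the similarity between a
    sentence and the one before it is below `threshold`.
    """
    if not embeddings:
        return []
    blocks = [[0]]
    for i in range(1, len(embeddings)):
        if cosine_similarity(embeddings[i - 1], embeddings[i]) < threshold:
            blocks.append([i])  # topic shift: open a new block
        else:
            blocks[-1].append(i)  # same topic: extend the current block
    return blocks


# Hypothetical toy vectors standing in for real sentence embeddings.
toy_embeddings = [(1.0, 0.0), (0.9, 0.1), (0.0, 1.0), (0.1, 0.9)]
blocks = split_into_blocks(toy_embeddings, threshold=0.5)
print(blocks)  # [[0, 1], [2, 3]]
```

A higher threshold produces more, shorter blocks; a lower one merges more sentences together. Each resulting block could then be passed to the summarizer, or the same function could be run on already summarized subtitles.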
parse_subs.py - a parser for YouTube subtitles that converts them to a pd.DataFrame
sentence_similarity.py - a script that calculates cosine similarity
gpt_shortening.py - a script for summarization
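As a rough illustration of the parsing step, the sketch below turns subtitle text into one record per cue. It is not the code in parse_subs.py: it assumes the subtitles arrive as SRT-style text (the source does not state the input format), and the names `parse_srt`, `to_seconds`, and the `start`/`end`/`text` columns are hypothetical.

```python
import re
from typing import Dict, List

# Matches an SRT-style timing line, e.g. "00:00:01,000 --> 00:00:03,500".
CUE_TIMING = re.compile(
    r"(\d{2}:\d{2}:\d{2}[,.]\d{3})\s*-->\s*(\d{2}:\d{2}:\d{2}[,.]\d{3})"
)


def to_seconds(stamp: str) -> float:
    """Convert "HH:MM:SS,mmm" (or "HH:MM:SS.mmm") to seconds."""
    hours, minutes, seconds = stamp.replace(",", ".").split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def parse_srt(text: str) -> List[Dict[str, object]]:
    """Return one dict per cue with start, end, and joined caption text."""
    rows = []
    # Cues are separated by blank lines.
    for chunk in re.split(r"\n\s*\n", text.strip()):
        lines = chunk.splitlines()
        for index, line in enumerate(lines):
            match = CUE_TIMING.search(line)
            if match:
                # Everything after the timing line is caption text.
                caption = " ".join(
                    part.strip() for part in lines[index + 1:] if part.strip()
                )
                rows.append(
                    {
                        "start": to_seconds(match.group(1)),
                        "end": to_seconds(match.group(2)),
                        "text": caption,
                    }
                )
                break
    return rows


sample = """1
00:00:01,000 --> 00:00:03,500
Hello and welcome.

2
00:00:03,500 --> 00:00:06,000
Today we talk
about subtitles.
"""
rows = parse_srt(sample)
print(rows)
```

With pandas installed, `pd.DataFrame(rows)` would give a table with one row per cue, which matches the DataFrame output described above.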
Several models for summarization and sentence similarity were compared. For summarization we currently use "IlyaGusev/mbart_ru_sum_gazeta". For sentence similarity we use "symanto/sn-xlm-roberta-base-snli-mnli-anli-xnli".
File details
Details for the file blockdivision-0.1.0.tar.gz.
File metadata
- Download URL: blockdivision-0.1.0.tar.gz
- Upload date:
- Size: 3.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.4.2 CPython/3.8.10 Linux/5.10.102.1-microsoft-standard-WSL2
File hashes
Algorithm | Hash digest
---|---
SHA256 | 035d66c53f645398386bfc71ddeadd4832d32b0e7b9c5dc0ae490daaaa3d6ec3
MD5 | 85d9c2b5866a5edab2749cb1a111b820
BLAKE2b-256 | 7f022fe3b6f1c740650fce83d64dee4bc81bbebaa533d1349b91cf1527d690cb
File details
Details for the file blockdivision-0.1.0-py3-none-any.whl.
File metadata
- Download URL: blockdivision-0.1.0-py3-none-any.whl
- Upload date:
- Size: 5.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.4.2 CPython/3.8.10 Linux/5.10.102.1-microsoft-standard-WSL2
File hashes
Algorithm | Hash digest
---|---
SHA256 | 3e6c468b1bc394659467226c35a09cb75f0c839974d96461086dc563c3ccb31f
MD5 | a82aa72f545e3fc0e5d359afe98e5370
BLAKE2b-256 | 7bc139b1efa46c70539f3e38144bca580096fabca96ea26b4d41934e88d74465