# subtitle_analyzer
## Installation from pip3

```shell
pip3 install --verbose subtitle_analyzer
python -m spacy download en_core_web_trf
python -m spacy download es_dep_news_trf
```
## Usage

Please refer to the API docs.
### Executable usage

- Write an ass file with vocabulary information:

```shell
sta_vocab --srtfile movie.srt --lang en --assfile en_vocab.ass --google False
```

- Write an ass file with phrase information:

```shell
sta_phrase --srtfile movie.srt --lang en --assfile en_phrase.ass --google False
```
### Package usage

```python
from subtitlecore import Subtitle
from subtitle_analyzer import VocabAnalyzer, PhraseAnalyzer
from subtitle_analyzer import VocabASSWriter, PhraseASSWriter
import json

def subtitle_vocab(srtfile, lang, assfile, google):

    phase = {"step": 1, "msg": "Start sentenizing"}
    print(json.dumps(phase), flush=True)

    sf = Subtitle(srtfile, lang)
    sens = sf.sentenize()
    for e in sens:
        print(e)

    phase = {"step": 2, "msg": "Finish sentenizing"}
    print(json.dumps(phase), flush=True)

    analyzer = VocabAnalyzer(lang)
    exs = analyzer.get_line_vocabs(sens, google)
    shown = exs[:20]

    phase = {"step": 3, "msg": "Finish vocabs dictionary lookup", "vocabs": shown}
    print(json.dumps(phase), flush=True)

    if assfile:
        ass_writer = VocabASSWriter(srtfile)
        ass_writer.write(exs, assfile, {"animation": False})

        phase = {"step": 4, "msg": "Finish ass saving"}
        print(json.dumps(phase), flush=True)

def subtitle_phrase(srtfile, lang, assfile, google):

    phase = {"step": 1, "msg": "Start sentenizing"}
    print(json.dumps(phase), flush=True)

    sf = Subtitle(srtfile, lang)
    sens = sf.sentenize()
    for e in sens:
        print(e)

    phase = {"step": 2, "msg": "Finish sentenizing"}
    print(json.dumps(phase), flush=True)

    analyzer = PhraseAnalyzer(lang)
    exs = analyzer.get_line_phrases(sens, google)

    phase = {"step": 3, "msg": "Finish phrases dictionary lookup", "vocabs": exs[:10]}
    print(json.dumps(phase), flush=True)

    if assfile:
        ass_writer = PhraseASSWriter(srtfile)
        ass_writer.write(exs, assfile, {"animation": False})

        phase = {"step": 4, "msg": "Finish ass saving"}
        print(json.dumps(phase), flush=True)
```
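The functions above stream their progress as one-line JSON "phase" objects on stdout, interleaved with the raw sentences they also print. A downstream consumer therefore needs to skip the non-JSON lines. The sketch below shows one way to do that; the sample lines are illustrative, not real `sta_vocab` output.

```python
import json

# Illustrative mixed output: plain sentence lines plus JSON phase messages,
# mimicking the shape printed by subtitle_vocab above.
sample_output = [
    "He was a friend of mine.",
    '{"step": 1, "msg": "Start sentenizing"}',
    '{"step": 2, "msg": "Finish sentenizing"}',
    '{"step": 3, "msg": "Finish vocabs dictionary lookup", "vocabs": [{"word": "friend"}]}',
    '{"step": 4, "msg": "Finish ass saving"}',
]

def read_phases(lines):
    """Yield only the parsable JSON phase objects, skipping sentence lines."""
    for line in lines:
        try:
            phase = json.loads(line)
        except json.JSONDecodeError:
            continue  # plain sentence line, not a phase message
        if isinstance(phase, dict) and "step" in phase:
            yield phase

phases = list(read_phases(sample_output))
vocabs = next(p["vocabs"] for p in phases if p["step"] == 3)
print(len(phases), vocabs)  # → 4 [{'word': 'friend'}]
```

The same reader works for `subtitle_phrase`, since both functions emit the same `step`/`msg` structure.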
## Development

### Clone the project

```shell
git clone https://github.com/qishe-nlp/subtitle-analyzer.git
```
### Install poetry

### Install dependencies

```shell
poetry update
```
### Test

```shell
poetry run pytest -rP
```

which runs the tests under `tests/*`.
### Execute

```shell
poetry run sta_vocab --help
poetry run sta_phrase --help
```
### Create sphinx docs

```shell
poetry shell
cd apidocs
sphinx-apidoc -f -o source ../subtitle_analyzer
make html
python -m http.server -d build/html
```
### Host docs on github pages

```shell
cp -rf apidocs/build/html/* docs/
```
### Build

- Change `version` in `pyproject.toml` and `subtitle_analyzer/__init__.py`
- Build the python package:

```shell
poetry build
```

- Git commit and push
### Publish from local dev env

- Set the pypi test environment variables in poetry (refer to the poetry docs)
- Publish to test pypi:

```shell
poetry publish -r test
```
### Publish through CI

- The github action builds and publishes the package to the test pypi repo on tag push:

```shell
git tag [x.x.x]
git push origin master
```

- Manually publish to the pypi repo through the github action