Python utility for creating video out of their subtitle files
Project description
Videogrep
=========
Videogrep searches through dialog in video files (using .srt subtitle tracks or pocketsphinx transcriptions) and makes supercuts based on what it finds.
##Requirements
Install with pip
```
pip install videogrep
```
Install [ffmpeg](http://ffmpeg.org/) with Ogg/Vorbis support. If you're on a mac with homebrew you can install ffmpeg with:
```
brew install ffmpeg --with-libvpx --with-libvorbis
```
(OPTIONAL) Install pocketsphinx for word-level transcriptions. On a mac:
```
brew tap watsonbox/cmu-sphinx
brew install --HEAD watsonbox/cmu-sphinx/cmu-sphinxbase
brew install --HEAD watsonbox/cmu-sphinx/cmu-sphinxtrain # optional
brew install --HEAD watsonbox/cmu-sphinx/cmu-pocketsphinx
```
##How to use it
The most basic use:
```
videogrep --input path/to/video_or_folder --search 'search phrase'
```
You can put any regular expression in the search phrase.
You can also search for part-of-speech tags using Pattern. See the [Pattern-Search documentation](http://www.clips.ua.ac.be/pages/pattern-search) for some details about how this works, and the [Penn Tree bank tag set](http://www.clips.ua.ac.be/pages/mbsp-tags) for a list of usuable part-of-speech tags. For example the following will search for every line of dialog that contains an adjective (JJ) followed by a singular noun (NN):
```
videogrep --input path/to/video_or_folder --search 'JJ NN' --search-type pos
```
You can also do a [hypernym](https://en.wikipedia.org/wiki/Hypernym) search - which essentially searches for words that fit into a specific category. The following, for example, will search for any line of dialog that references a liquid (like water, coffee, beer, etc.):
```
videogrep --input path/to/video_or_folder --search 'liquid' --search-type hyper
```
**NOTE: videogrep requires the subtitle track and the video file to have the exact same name, up to the extension.** For example, my_movie.mp4 and my_movie.srt will work, my_movie.mp4 and my_movie_subtitle.srt will not work.
###Options
videogrep can take a number of options:
####--input / -i
Video or subtitle file, or folder containing multiple files
####--output / -o
Name of the file to generate. By default this is "supercut.mp4"
####--search / -s
Search term
####--search-type / -st
Type of search you want to perform. There are three options:
* re: [regular expression](http://www.pyregex.com/) (this is the default).
* pos: part of speech search (uses [pattern.search](http://www.clips.ua.ac.be/pages/pattern-search)). For example 'JJ NN' would return all lines of dialog that contain an adjective followed by a noun.
* hyper: hypernym search. For example 'body parts' grabs all lines of dialog that reference a body part
* word: extract individual words - for multiple words use the '|' symbol (requires pocketsphinx).
* franken: create a "frankenstein" sentence (requires pocketsphinx)
* fragment: multiple words with allowed wildcards like 'blue \*' (requires pocketsphinx)
####--max-clips / -m
Maximum number of clips to use for the supercut
####--demo / -d
Show the search results without making the supercut
####--randomize / -r
Randomize the order of the clips
####--padding / -p
Padding in milliseconds to add to the start and end of each clip
####--transcribe / -tr
Transcribe the video using audiogrep/pocketsphinx. You must install pocketsphinx first!
####--use-transcript / -t
Use the pocketsphinx transcript rather than a subtitle file for searching. If this is enabled you can do
word-level searches.
##Samples
* [All the instances of the phrase "time" in the movie "In Time"](https://www.youtube.com/watch?v=PQMzOUeprlk)
* [All the one to two second silences in "Total Recall"](https://www.youtube.com/watch?v=qEtEbXVbYJQ)
* [The President's former press secretary telling us what he can tell us](https://www.youtube.com/watch?v=D7pymdCU5NQ)
###You can now import videogrep into your own modules and search for terms programmatically!
```
from videogrep import videogrep
videogrep.videogrep('path/to/your/files','output_file_name.mp4', 'search_term', 'search_type')
```
The videogrep module accepts the same parameters as the cli! To see the usage just read the source!
=========
Videogrep searches through dialog in video files (using .srt subtitle tracks or pocketsphinx transcriptions) and makes supercuts based on what it finds.
##Requirements
Install with pip
```
pip install videogrep
```
Install [ffmpeg](http://ffmpeg.org/) with Ogg/Vorbis support. If you're on a mac with homebrew you can install ffmpeg with:
```
brew install ffmpeg --with-libvpx --with-libvorbis
```
(OPTIONAL) Install pocketsphinx for word-level transcriptions. On a mac:
```
brew tap watsonbox/cmu-sphinx
brew install --HEAD watsonbox/cmu-sphinx/cmu-sphinxbase
brew install --HEAD watsonbox/cmu-sphinx/cmu-sphinxtrain # optional
brew install --HEAD watsonbox/cmu-sphinx/cmu-pocketsphinx
```
##How to use it
The most basic use:
```
videogrep --input path/to/video_or_folder --search 'search phrase'
```
You can put any regular expression in the search phrase.
You can also search for part-of-speech tags using Pattern. See the [Pattern-Search documentation](http://www.clips.ua.ac.be/pages/pattern-search) for some details about how this works, and the [Penn Tree bank tag set](http://www.clips.ua.ac.be/pages/mbsp-tags) for a list of usuable part-of-speech tags. For example the following will search for every line of dialog that contains an adjective (JJ) followed by a singular noun (NN):
```
videogrep --input path/to/video_or_folder --search 'JJ NN' --search-type pos
```
You can also do a [hypernym](https://en.wikipedia.org/wiki/Hypernym) search - which essentially searches for words that fit into a specific category. The following, for example, will search for any line of dialog that references a liquid (like water, coffee, beer, etc.):
```
videogrep --input path/to/video_or_folder --search 'liquid' --search-type hyper
```
**NOTE: videogrep requires the subtitle track and the video file to have the exact same name, up to the extension.** For example, my_movie.mp4 and my_movie.srt will work, my_movie.mp4 and my_movie_subtitle.srt will not work.
###Options
videogrep can take a number of options:
####--input / -i
Video or subtitle file, or folder containing multiple files
####--output / -o
Name of the file to generate. By default this is "supercut.mp4"
####--search / -s
Search term
####--search-type / -st
Type of search you want to perform. There are three options:
* re: [regular expression](http://www.pyregex.com/) (this is the default).
* pos: part of speech search (uses [pattern.search](http://www.clips.ua.ac.be/pages/pattern-search)). For example 'JJ NN' would return all lines of dialog that contain an adjective followed by a noun.
* hyper: hypernym search. For example 'body parts' grabs all lines of dialog that reference a body part
* word: extract individual words - for multiple words use the '|' symbol (requires pocketsphinx).
* franken: create a "frankenstein" sentence (requires pocketsphinx)
* fragment: multiple words with allowed wildcards like 'blue \*' (requires pocketsphinx)
####--max-clips / -m
Maximum number of clips to use for the supercut
####--demo / -d
Show the search results without making the supercut
####--randomize / -r
Randomize the order of the clips
####--padding / -p
Padding in milliseconds to add to the start and end of each clip
####--transcribe / -tr
Transcribe the video using audiogrep/pocketsphinx. You must install pocketsphinx first!
####--use-transcript / -t
Use the pocketsphinx transcript rather than a subtitle file for searching. If this is enabled you can do
word-level searches.
##Samples
* [All the instances of the phrase "time" in the movie "In Time"](https://www.youtube.com/watch?v=PQMzOUeprlk)
* [All the one to two second silences in "Total Recall"](https://www.youtube.com/watch?v=qEtEbXVbYJQ)
* [The President's former press secretary telling us what he can tell us](https://www.youtube.com/watch?v=D7pymdCU5NQ)
###You can now import videogrep into your own modules and search for terms programmatically!
```
from videogrep import videogrep
videogrep.videogrep('path/to/your/files','output_file_name.mp4', 'search_term', 'search_type')
```
The videogrep module accepts the same parameters as the cli! To see the usage just read the source!
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
videogrep-0.4.1.tar.gz
(14.4 kB
view hashes)
Built Distribution
videogrep-0.4.1-py2-none-any.whl
(17.8 kB
view hashes)
Close
Hashes for videogrep-0.4.1-py2-none-any.whl
Algorithm | Hash digest | |
---|---|---|
SHA256 | 4f3be25679f703f2e962c069ce67584baa24fd29cdb47dc452beefec8e2c5c0d |
|
MD5 | 1a7d3c1b610f496a53e99ceb39f3a98f |
|
BLAKE2b-256 | 82875498a18fc25a32593ffe10f1bda1d8ae7529697aab5f2fcf20c693260c76 |