## Chunkup

Given an audio file and a tab-delimited file, split the audio file into smaller chunks.

## Requirements & Caveats

## Installation

pip install chunkup

After installing chunkup, the chunkup command-line script should be on your path and available from anywhere. Test it by opening a new terminal and running:

chunkup -h

## Usage

This usage example downloads Episode #20 of Reply All, "I Want to Break Free". The mp3 is 41.5 MB. You can get replyallcreak.txt from https://github.com/JoFrhwld/chunkup/blob/master/examples/replyallcreak.txt

curl -L -o examples/reply_all_podcast.mp3 http://bit.ly/chunkup
mkdir chunks

chunkup examples/reply_all_podcast.mp3 examples/replyallcreak.txt chunks/
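To make the role of the tab-delimited file concrete, here is a minimal Python sketch of reading such a file into start and end times. It is an illustration only, not chunkup's own code: the function name `read_chunks` and the sample rows are hypothetical, and the defaults of columns 3 and 4 are taken from the sample config for this data shown under Configuration.

```python
import csv
import io


def read_chunks(lines, start_col=3, end_col=4, header=False):
    """Yield (start_seconds, end_seconds, row) for each row of a tab-delimited chunk file.

    start_col and end_col are 1-based column indexes, mirroring --start and --end.
    """
    reader = csv.reader(lines, delimiter="\t")
    if header:
        next(reader, None)  # skip the header row
    for row in reader:
        if not row:
            continue  # ignore blank lines
        start = float(row[start_col - 1])
        end = float(row[end_col - 1])
        yield start, end, row


# Made-up rows for illustration; not taken from replyallcreak.txt.
sample = io.StringIO("PJ\tspeech\t12.50\t15.75\nAlex\tspeech\t16.00\t18.25\n")
chunks = list(read_chunks(sample))
for start, end, row in chunks:
    print(f"{row[0]}: {start:.2f}s to {end:.2f}s ({end - start:.2f}s long)")
```

Each tuple gives the span of audio that one chunk would cover; the actual cutting of the audio is left out of the sketch.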

### Chunk naming

chunks/ now contains 104 chunks of speech I annotated, largely to separate out the three speakers in the first segment. The default naming convention for chunks is:

[n]-[basename]-[col1]-[col3].wav

Where [n] is the numeric index of the chunk, [basename] is the base name of the original soundfile, [col1] is the value from column 1 in replyallcreak.txt (in this case, speaker ID), and [col3] is the value from column 3 in replyallcreak.txt (in this case, the onset time of the chunk in ss.ms). Available variables for naming chunks are:

| naming variable | description |
| --- | --- |
| [n] | chunk number |
| [basename] | base name of original audio file |
| [col0-9] | value from the given column in the chunk file |
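As a rough illustration of how a naming template could be filled in, the Python sketch below substitutes the variables listed above. It is not chunkup's implementation: `expand_name` is a hypothetical helper, the row is made up, and the sketch assumes that [basename] drops the file extension, that [n] is not zero-padded, and that columns are 1-based as in the [col1] example.

```python
import os
import re


def expand_name(template, n, audio_path, row):
    """Fill [n], [basename], and [colK] placeholders in a naming template."""
    # Assumption: the base name excludes the directory and the extension.
    basename = os.path.splitext(os.path.basename(audio_path))[0]

    def substitute(match):
        key = match.group(1)
        if key == "n":
            return str(n)
        if key == "basename":
            return basename
        # Remaining keys look like col1, col3, ...; treated as 1-based here.
        index = int(key[3:])
        if index < 1:
            raise ValueError("this sketch only handles columns numbered from 1")
        return row[index - 1]

    return re.sub(r"\[(n|basename|col\d)\]", substitute, template)


row = ["PJ", "speech", "12.50", "15.75"]  # made-up row, not from replyallcreak.txt
name = expand_name(
    "[n]-[basename]-[col1]-[col3].wav", 1, "examples/reply_all_podcast.mp3", row
)
print(name)  # 1-reply_all_podcast-PJ-12.50.wav
```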

### Configuration

You can change the chunk naming convention at the command line using the -n or --naming flag.

$ chunkup -n [col1]-[n].wav examples/reply_all_podcast.mp3 examples/replyallcreak.txt chunks/

Other config options include

| option | description |
| --- | --- |
| -s, --start | Column index (starting with 1) for the start time, in ss.ms format |
| -e, --end | Column index (starting with 1) for the end time, in ss.ms format |
| --header | Include if chunk file has a header |

All config options can be defined in a config file and passed to chunkup.py with the prefix +. A sample config file for this data would be:

--naming [n]-[basename]-[col1]-[col3].wav --start 3 --end 4

If saved to config.txt, it could be passed to chunkup.py like so:

$ chunkup +config.txt examples/reply_all_podcast.mp3 examples/replyallcreak.txt chunks/
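A prefixed config file like this resembles the `fromfile_prefix_chars` feature of Python's standard argparse module. The sketch below shows how such a mechanism can work; it assumes, without confirmation from the chunkup source, that the options are parsed in this way. The class name `WhitespaceSplittingParser` and the positional argument names are hypothetical.

```python
import argparse
import os
import tempfile


class WhitespaceSplittingParser(argparse.ArgumentParser):
    def convert_arg_line_to_args(self, arg_line):
        # argparse hands over one line of the file at a time; splitting on
        # whitespace lets an option and its value share a line.
        return arg_line.split()


parser = WhitespaceSplittingParser(fromfile_prefix_chars="+")
parser.add_argument("-n", "--naming", default="[n]-[basename]-[col1]-[col3].wav")
parser.add_argument("-s", "--start", type=int)
parser.add_argument("-e", "--end", type=int)
parser.add_argument("--header", action="store_true")
parser.add_argument("audio")      # hypothetical positional names
parser.add_argument("chunkfile")
parser.add_argument("outdir")

with tempfile.TemporaryDirectory() as tmp:
    config_path = os.path.join(tmp, "config.txt")
    with open(config_path, "w") as f:
        f.write("--naming [n]-[basename]-[col1]-[col3].wav\n--start 3\n--end 4\n")
    # The leading "+" tells argparse to read arguments from the named file.
    args = parser.parse_args(["+" + config_path, "audio.mp3", "chunks.txt", "chunks/"])

print(args.naming, args.start, args.end, args.header)
```

Because the sketch splits each line on whitespace, the same file would parse identically with all options on a single line. Without the `convert_arg_line_to_args` override, argparse treats every line of the file as exactly one argument.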


