
Screen play / drama text to multi-voice audio play converter

Project description

About

dramaTTS parses scripts (plain text files) for theatre/screen plays and converts them into multi-voice audio plays (wave-files).

While the script parsing functionality is provided by the dramaTTS program itself, it relies on external tools for the audio processing:

SoX and Festival, as well as voices and lexicons for Festival, have to be installed in order to create audio output with dramaTTS (see Prerequisites).

Licenses

dramaTTS is free software released under the GPLv3 license (see LICENSE [3] file), Copyright (c) 2020 Thies Hecker

The GUI of dramaTTS is made with PyQt [4] and setuptools_scm [5] is used to align version numbering with git.

While dramaTTS is a standalone application, it is of limited use without Festival and SoX, which provide the audio rendering. Without them, only script parsing (including syntax highlighting, etc.) is available.

While Festival itself and SoX are released under free software licenses as well (see details below), specific components that are commonly bundled with Festival (i.e. certain lexicons and voices) may be released under non-free licenses.

For instance, the festlex-OALD lexicon, which can be found among other files (incl. the source code of the latest Festival release) on the Festvox 2.5 release page [7], is restricted to non-commercial use only.

The Installing Festival without non-free components section provides an example of a Festival distribution based on free components only.

Please see COPYING [6] for details on licenses and copyright disclaimers of the individual components.

Features

dramaTTS provides two main components: a script parser and an audio-renderer.

The script parser features:

  • input files with minimum formatting (see Basic script format)

  • syntax highlighting (identifies different content like new scenes, dialogue lines, narrative descriptions,…)

  • text string substitutions supporting regular expressions

  • some utility functions like sorting speakers according to their number of text lines
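The substitution feature can be pictured with Python's standard re module. This is only a minimal sketch and not dramaTTS's actual implementation; the function name apply_substitutions and the two example rules are assumptions made for illustration.

```python
import re

def apply_substitutions(text, rules):
    """Apply (pattern, replacement) pairs in order; patterns are regular expressions."""
    for pattern, replacement in rules:
        text = re.sub(pattern, replacement, text)
    return text

# Hypothetical rules: expand an abbreviation and drop bracketed stage cues.
rules = [
    (r"\bDr\.", "Doctor"),
    (r"\s*\[[^\]]*\]", ""),
]

print(apply_substitutions("Dr. Smith enters [thunder].", rules))
# prints: Doctor Smith enters.
```

Rules are applied one after another, so a later rule sees the result of the earlier ones.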

The audio-renderer is basically a front-end to Festival and SoX. Each line of script text is synthesized by Festival and saved to a wave-file, which is then post-processed by SoX. It features:

  • altering of Festival voices (pitch, tempo and volume)

  • support for multiple CPU cores to accelerate audio rendering (dispatches parallel processes for individual lines)

  • support for rendering via a Festival server

  • some post-processing: normalize all voices, combine audio files (lines -> scenes -> single project file)

  • (re-)rendering of individual scenes or speakers
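The idea of rendering individual lines in parallel can be sketched with Python's concurrent.futures module. This is an illustration under assumptions, not the code used by dramaTTS: render_line is a hypothetical placeholder that only returns the name of the wave-file it would write, and the job tuples and file naming scheme are made up.

```python
from concurrent.futures import ThreadPoolExecutor
import os

def render_line(job):
    """Placeholder for the real work (calling Festival, then SoX) on one line.

    Hypothetical: it only returns the name of the wave-file it would write.
    """
    index, speaker, text = job
    return f"line_{index:04d}_{speaker}.wav"

def render_all(jobs, max_workers=None):
    """Render jobs concurrently; results come back in script order."""
    workers = max_workers or os.cpu_count() or 1
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(render_line, jobs))

jobs = [(0, "Narrator", "Bob"), (1, "BOB", "Hi, I am Bob.")]
print(render_all(jobs))
# prints: ['line_0000_Narrator.wav', 'line_0001_BOB.wav']
```

A thread pool is used in this sketch because a real render_line would mostly wait for external Festival and SoX processes, which run on separate cores anyway. Since pool.map keeps the input order, the resulting files can be combined into scenes afterwards without re-sorting.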

Prerequisites

python

You will need a python3 distribution installed, and for convenience you should have either the pip or the conda package manager installed.

On linux you will most likely have python and pip already installed. If not, you should be able to install them with your distribution's package manager.

E.g. for debian based systems like ubuntu just run:

sudo apt-get install python3-pip

or on arch based systems:

sudo pacman -S python-pip

For Windows users I recommend installing Anaconda [9] or miniconda [10], which provide the conda package manager (make sure to get the python3 version, not the python2 version!).

To install dramaTTS with pip:

pip install dramatts

Note that on some distributions python2 and python3 may be installed in parallel. In such cases make sure that you are not using the pip of your python2 environment to install dramaTTS. You may need to use pip3 as the command instead. You can check whether you are using the correct pip by calling:

pip --version

To install dramaTTS with conda:

conda install -c thecker dramatts

In both cases pip or conda should download all required dependencies, and you should then be able to launch the program. To do that just type:

python -m dramatts.dramatts_gui

The GUI should pop up and you can import text files, define roles etc., but you will not be able to render audio unless you have installed Festival (and its components) and SoX.

Installing Festival without non-free components

While many linux distributions include pre-built packages for Festival, they often include non-free components like festlex-OALD. Therefore the safest way to create a free Festival distribution is to compile from source. To form a free distribution the following components could be used:

  • Festival 2.5 (main application)

  • Edinburgh Speech Tools (EST) - required to compile Festival

  • festlex_CMU (lexicon)

  • festlex_POSLEX (lexicon)

  • festvox_cmu_us_slt_cg (female voice)

  • festvox_cmu_us_rms_cg (male voice)

All components can be downloaded at CMU’s (Carnegie Mellon University) Festvox 2.5 release page [7]. The source code of Festival and EST can also be cloned from the Festvox github page [8].

To compile the code follow the instructions in the INSTALL file included in Festival.

Note that more voices can be found at the Festvox page (although some might require e.g. additional lexicons and thus won’t work with the components selected above). Additionally, voices may be altered in tempo and pitch in dramaTTS (by post-processing with SoX) to create more than one speaker per voice.
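To give an idea of how one Festival voice can be turned into several speakers, the following sketch builds a SoX command line using SoX's pitch, tempo and vol effects. The helper name sox_voice_command, its parameters and the file names are assumptions for illustration; the commands dramaTTS issues internally may look different.

```python
def sox_voice_command(src, dst, pitch_cents=0, tempo=1.0, volume=1.0):
    """Build a SoX argument list that alters pitch, tempo and volume of a wave-file."""
    cmd = ["sox", src, dst]
    if pitch_cents:
        cmd += ["pitch", str(pitch_cents)]  # shift in cents (100 cents = 1 semitone)
    if tempo != 1.0:
        cmd += ["tempo", str(tempo)]        # speed factor, pitch is preserved
    if volume != 1.0:
        cmd += ["vol", str(volume)]         # gain factor
    return cmd

# Hypothetical file names: a lower and slightly faster variant of the same voice.
print(sox_voice_command("bob_raw.wav", "bob.wav", pitch_cents=-200, tempo=1.1))
# prints: ['sox', 'bob_raw.wav', 'bob.wav', 'pitch', '-200', 'tempo', '1.1']
```

With SoX installed, such a list could be passed to subprocess.run(cmd, check=True) to write the altered file.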

Building Festival from source is based on the autotools-toolchain - so it shouldn’t be a problem on GNU/linux, but may be complicated on MS Windows.

Fortunately the eGuideDog team has created compile-instructions for Windows and even provides a Festival 2.5 version including precompiled binaries for Windows [11] (which does not include the problematic festlex-OALD lexicon).

In order to use Festival under Windows with dramaTTS you will need to copy the text2wave.bat (see the /utils folder [12]) to your Festival installation.

Make sure to adjust the paths in text2wave.bat, if you did not install Festival in C:\Festival.

Installing SoX

Under linux your distribution will most likely provide a pre-built package for SoX, so building from source is probably not required.

Binaries for Windows can be found on the SoX sourceforge page [13].

Specifying location of external tools

dramaTTS will try to determine the install locations of Festival and SoX automatically. This should most likely work under linux, if you installed the tools from the official packages (or put the location of the binaries in your PATH).
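If you want to check in advance whether the tools can be found via your PATH, a lookup along the following lines may help. It is a sketch using Python's shutil.which and not necessarily the detection logic dramaTTS uses; the function name locate_tools and the list of executable names are assumptions.

```python
import shutil

def locate_tools(names=("festival", "text2wave", "sox")):
    """Return {tool name: absolute path or None} based on the current PATH."""
    return {name: shutil.which(name) for name in names}

for name, path in locate_tools().items():
    print(f"{name}: {path or 'not found, set the location manually'}")
```

A tool reported as not found is not necessarily missing; it may just be installed in a directory that is not part of your PATH.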

Under Windows you will most likely have to define the tool locations manually.

To do that, just go to the preferences tab in the dramaTTS GUI and specify the file locations.

If you used the Festival version provided by the eGuideDog team, the pre-compiled binaries are located in:

..Festival\src\main

After you have specified a new tool location, save the preferences and restart dramaTTS for the changes to take effect.

Basic script format

dramaTTS’s script parser works with simple text files with minimum formatting.

General

  • Empty lines are ignored

  • in case of doubt a line will be assigned as narrative description

You can check how lines have been parsed by switching to the “Parsed lines” text rendering mode in the “script” tab.

New scene

A new scene is indicated by a line starting with a number followed by a dot.

23. A new scene

The narrator will read the scene number and scene title.
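The scene rule can be expressed as a regular expression. The pattern below is an assumption derived from the description above (digits, a dot, then the title) and not taken from the dramaTTS source; parse_scene is a hypothetical helper name.

```python
import re

# Assumed pattern: leading digits, a dot, then the scene title.
SCENE_RE = re.compile(r"^\s*(\d+)\.\s*(.*)$")

def parse_scene(line):
    """Return (number, title) for a scene line, or None for any other line."""
    match = SCENE_RE.match(line)
    if match is None:
        return None
    return int(match.group(1)), match.group(2).strip()

print(parse_scene("23. A new scene"))  # prints: (23, 'A new scene')
print(parse_scene("BOB"))              # prints: None
```

Note that such a simple pattern would also treat a narrative line like "3.5 liters of water" as a scene, so the real parser may apply stricter rules.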

Dialogue

A dialogue is indicated by a line giving only the speaker name in UPPER CASE letters - e.g.

BOB

Hi, I am Bob and this line is my dialogue.

Bob's dialogue was quite short.

The next (non-empty) line after the dialogue indicator (BOB) will be interpreted as Bob’s dialogue text. A line break will end Bob’s dialogue and the following line is interpreted as narrative description.

The narrator will say the speaker’s name and take over again after the speaker’s dialogue line is finished. You can easily check who speaks which line by switching to the “Parsed lines” text rendering mode in the “script” tab. The example above would be shown as:

Narrator: Bob
BOB: Hi, I am Bob and this line is my dialogue.
Narrator: Bob's dialogue was quite short.

It is also possible to add narrative comments using parentheses within a dialogue line:

BOB

I told you not to pull this lever! (shakes his head) Let's get the hell out of here!

In the example above, “shakes his head” will be spoken by the narrator, i.e. it would be rendered in “Parsed lines” mode as:

Narrator: Bob
BOB: I told you not to pull this lever!
Narrator: (shakes his head)
BOB: Let's get the hell out of here!
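The dialogue rules described in this section can be summarized in a few lines of Python. This is a simplified sketch written from the description above, not the parser shipped with dramaTTS: the function name parse_script is hypothetical, scene lines are not handled, and any line consisting of upper case letters only is assumed to be a speaker name.

```python
import re

PAREN_RE = re.compile(r"(\([^)]*\))")  # keeps "(...)" groups when splitting

def parse_script(text):
    """Return a list of (speaker, text) pairs for a script in the basic format."""
    parsed = []
    pending_speaker = None  # set after a speaker line, consumed by the next line
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # empty lines are ignored
        if pending_speaker is None and line.isupper():
            # The narrator announces the speaker; the next line is the dialogue.
            parsed.append(("Narrator", line.title()))
            pending_speaker = line
            continue
        if pending_speaker is not None:
            # Split the dialogue line at parenthesised narrative comments.
            for part in PAREN_RE.split(line):
                part = part.strip()
                if part:
                    who = "Narrator" if part.startswith("(") else pending_speaker
                    parsed.append((who, part))
            pending_speaker = None
        else:
            parsed.append(("Narrator", line))  # in case of doubt: narrative
    return parsed

script = """BOB

I told you not to pull this lever! (shakes his head) Let's get the hell out of here!

Bob's dialogue was quite short.
"""
for speaker, text in parse_script(script):
    print(f"{speaker}: {text}")
```

For the script above the sketch prints the four lines shown in the “Parsed lines” example, followed by the narrator reading the closing narrative line.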

Download files

Source Distribution

dramatts-0.1.2.tar.gz (61.6 kB)

Built Distribution

dramatts-0.1.2-py3-none-any.whl (53.0 kB)
