Festival

From ArchWiki

Festival is a general multi-lingual speech synthesis system developed at CSTR (Centre for Speech Technology Research).

Festival offers a general framework for building speech synthesis systems and includes examples of various modules. As a whole it offers full text to speech through a number of APIs: from shell level, through a Scheme command interpreter, as a C++ library, from Java, and through an Emacs interface. Festival is multi-lingual (currently British English, American English, Italian, Czech and Spanish, with other languages available in prototype).

Installation

Install festival from the official repositories. You also need a voice package such as festival-english or festival-us. Further languages are available in the official repositories and in the AUR.

Test festival:

$ echo "This is an example. Arch is the best." | festival --tts

If you hear all of the example text, you have successfully installed a TTS system.

If you do not hear anything, see the Troubleshooting section. If you have a desktop system you will almost certainly get a message about /dev/dsp and need to follow those instructions.

Using German IMS festival extension with mbrola

You can use the festival-ims package from the AUR, which includes the IMS Stuttgart patches.

The factual accuracy of this article or section is disputed.

Reason: Much of the following should be done by the AUR package above, but mbrola and related packages should still need to be installed separately. (Discuss in Talk:Festival#)

The IMS at the University of Stuttgart developed an extension to festival specifically for the German language. It uses German voices through mbrola (available in the AUR). To install it, download the extension from the university's servers (follow the instructions here) and modify the PKGBUILD. Use abs to get the PKGBUILD and the patches necessary to install festival under Arch Linux.

Add these two files downloaded from the IMS (do NOT use the third file, ims_german_1.3-os.fix.tgz)

 ims_german_1.3-os.tgz
 bomp_full.corr.tgz

with their md5sums to the source variable and place them in the same folder as the PKGBUILD. In the prepare() section, include the following lines (at the end of the section):

   # add ims config
   sed -i 's/ALSO_INCLUDE +=$/# IMS module for German\nALSO_INCLUDE += ims_german_text/' "$srcdir/festival/config/config.in"
   cat<<EOF >> "$srcdir/festival/lib/sitevars.scm"
 (set! mbrola-path "/usr/share/mbrola/")
 (set! mbrola_progname "/usr/bin/mbrola -e")
 EOF
   echo "(require 'ims_german_opensource)" >> "$srcdir/festival/lib/siteinit.scm"

This should install support for the German voices de1 through de4. Install at least one of these voices, e.g. mbrola-voices-de2 from the AUR, and then select it in festival via

 (voice_german_de2_os)
 (SayText "Hallo Welt.")

from the prompt or use it in text2wave via

 text2wave -o spoken.wav -eval '(voice_german_de2_os)' inputfile.txt

Configuration

There is no global configuration file under /etc, but you can configure festival through your ~/.festivalrc file or by editing /usr/share/festival/festival.scm directly. Both are Scheme files and are evaluated every time festival starts.

Usage with a Sound Server

For PulseAudio, add these lines to the end of your ~/.festivalrc file, or to /usr/share/festival/festival.scm:

(Parameter.set 'Audio_Required_Format 'aiff)
(Parameter.set 'Audio_Method 'Audio_Command)
(Parameter.set 'Audio_Command "paplay $FILE --client-name=Festival --stream-name=Speech")

For ALSA, use these lines instead (source):

(Parameter.set 'Audio_Method 'Audio_Command)
(Parameter.set 'Audio_Command "aplay -q -c 1 -t raw -f s16 -r $SR $FILE")
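
If you set up several machines, the PulseAudio settings from this section can be appended from a script. The following is only a sketch: the function name add_pulse_config and the marker comment are illustrative assumptions, not part of festival. It appends the three PulseAudio lines to an init file (by default ~/.festivalrc) and does nothing if it has already done so.

```shell
# Append the PulseAudio settings to a festival init file, at most once.
# add_pulse_config and the marker comment are hypothetical names.
add_pulse_config() {
    local rc marker
    rc="${1:-$HOME/.festivalrc}"
    marker=";; festival-pulseaudio (added by add_pulse_config)"

    # Skip if a previous run already added the block.
    if [ -f "$rc" ] && grep -qF -- "$marker" "$rc"; then
        return 0
    fi

    # The quoted 'EOF' stops the shell from expanding $FILE,
    # which festival itself must substitute at playback time.
    cat >> "$rc" <<'EOF'
;; festival-pulseaudio (added by add_pulse_config)
(Parameter.set 'Audio_Required_Format 'aiff)
(Parameter.set 'Audio_Method 'Audio_Command)
(Parameter.set 'Audio_Command "paplay $FILE --client-name=Festival --stream-name=Speech")
EOF
}
```

Call it as add_pulse_config to edit ~/.festivalrc, or pass another path as the first argument.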

Voices

Arch splits the set of official voices into festival-english and festival-us. The AUR has some others in various states of maintenance, so they may or may not currently work.

To see which voices you currently have installed and what your default is, go into the Festival shell (which is a Scheme REPL):

   $ festival
   
   Festival Speech Synthesis System 2.1:release November 2010
   Copyright (C) University of Edinburgh, 1996-2010. All rights reserved.
   
   clunits: Copyright (C) University of Edinburgh and CMU 1997-2010
   clustergen_engine: Copyright (C) CMU 2005-2010
   hts_engine: 
   The HMM-based speech synthesis system (HTS)
   hts_engine API version 1.04 (http://hts-engine.sourceforge.net/)
   Copyright (C) 2001-2010  Nagoya Institute of Technology
                 2001-2008  Tokyo Institute of Technology
   All rights reserved.
   For details type `(festival_warranty)'
   festival> voice_default 
   voice_cmu_us_slt_arctic_hts          ;;<-- THIS IS THE VOICE FESTIVAL SPEAKS WITH
   festival> default-voice-priority-list 
   (kal_diphone                         ;;<-- THIS IS THE HARD-CODED LIST OF VOICES FESTIVAL CAME PRE-AWARE OF
    cmu_us_bdl_arctic_hts
    cmu_us_jmk_arctic_hts
    cmu_us_slt_arctic_hts
    cmu_us_awb_arctic_hts
    ked_diphone
    don_diphone
    rab_diphone
    en1_mbrola
    us1_mbrola
    us2_mbrola
    us3_mbrola
    gsw_diphone
    el_diphone)
   
   festival> (voice_                    ;;<-- PRESS TAB HERE TO SEE WHAT VOICES FESTIVAL HAS AVAILABLE
   voice_cmu_us_slt_arctic_hts     voice_kal_diphone               voice_nitech_us_slt_arctic_hts  voice_reset
   voice_default                   voice_nitech_us_clb_arctic_hts  voice_rab_diphone
   festival> (voice_cmu_us_slt_arctic_hts) 
   cmu_us_slt_arctic_hts
   festival> (SayText "Arch makes me happy")
   #<Utterance 0x7fb5b8c423b0>
   festival> 

To permanently change the default voice you can add a line like this to the end of ~/.festivalrc:

(set! voice_default voice_cmu_us_slt_arctic_hts)

You cannot set the voice in festival.scm; to set voices globally, change the order in which voices are searched in /usr/share/festival/voices.scm.

HTS compatibility patches

The factual accuracy of this article or section is disputed.

Reason: festival-us comes with cmu_us_slt_arctic_hts (Discuss in Talk:Festival#)

Some say that HTS voices for Festival are the best ones freely available. Sadly they are not compatible with Festival >2.1 without patching it (and the new voice versions are not made available for downloading).

You can install the patched version from the AUR: festival-patched-hts and festival-hts-voices-patched.

Manual Voice Installs

You can also get voices straight from festvox.org. In their downloads, each file named "festvox_*.tgz" contains a different voice, as built by the Festival team. They do work, but you need to extract the archive manually and move the folder containing the voice to the appropriate place. On a recent Arch installation that place is /usr/share/festival/voices/english/, and the voice folder is the one that has a 'festvox/' subfolder inside it.

You can then test that your new voices are found by loading up the festival prompt again.
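
The manual steps above can be sketched as a small shell function. This is a hedged example, not an official installer: install_voice and its arguments are hypothetical, and it relies only on the rule stated above that the voice folder is the one containing a 'festvox/' subfolder. It extracts the archive into a temporary directory, locates that folder and copies it to the destination.

```shell
# Extract a festvox voice archive and copy the voice folder into place.
# install_voice is a hypothetical helper, not shipped with festival.
install_voice() {
    local archive dest tmp marker voice_dir status
    archive="$1"
    dest="${2:-/usr/share/festival/voices/english}"

    tmp=$(mktemp -d) || return 1
    if ! tar -xzf "$archive" -C "$tmp"; then
        rm -rf "$tmp"
        return 1
    fi

    # The voice folder is the parent of the first 'festvox' directory found.
    marker=$(find "$tmp" -type d -name festvox | head -n 1)
    if [ -z "$marker" ]; then
        echo "no festvox/ folder found in $archive" >&2
        rm -rf "$tmp"
        return 1
    fi
    voice_dir=$(dirname "$marker")

    mkdir -p "$dest" && cp -r "$voice_dir" "$dest/"
    status=$?
    rm -rf "$tmp"
    return $status
}
```

Writing to the default destination under /usr/share requires root privileges, so run the function from a root shell or pass a different destination as the second argument while testing.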

Usage

Read a text file:

$ festival --tts /path/to/letter.txt

Be obnoxious while demonstrating piping:

$ (echo "Get ready for some pain"; sudo cat /var/log/messages.log) | festival --tts

Convert a text file to mp3:

$ cat letter.txt | text2wave | lame - file.mp3 && mplayer file.mp3
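
To convert several files at once, the same text2wave and lame pipeline can be wrapped in a loop. The function below is a sketch under the assumption that both tools are installed; the name txt_to_mp3 is hypothetical. Each output file is written next to its input, so letter.txt becomes letter.mp3.

```shell
# Convert each given text file to an MP3 next to it (letter.txt -> letter.mp3).
# txt_to_mp3 is a hypothetical helper; text2wave and lame must be installed.
txt_to_mp3() {
    local txt mp3
    for txt in "$@"; do
        mp3="${txt%.txt}.mp3"
        # text2wave writes the waveform to stdout; "lame -" reads it from stdin.
        text2wave "$txt" | lame - "$mp3" || return 1
    done
}
```

For example, txt_to_mp3 chapter1.txt chapter2.txt produces chapter1.mp3 and chapter2.mp3, and the function stops at the first file for which lame reports a failure.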

Interactive mode (testing voices etc.)

festival has an interactive prompt you can use for testing. Some examples (with sample output):

$ festival 
[...]
festival> 

List available voices:

festival> (voice.list)
(cstr_us_awb_arctic_multisyn kal_diphone don_diphone)

Set voice:

festival> (voice_cstr_us_awb_arctic_multisyn)
#<voice 0x1545b90>

Speak:

festival> (SayText '"test this is a test oh no a test bla test")
inserting pause after: t.
Inserting pause
[...]
id _63 ; name t ; 
id _65 ; name # ; 
#<Utterance 0x7f7c0c144810>

More:

festival> help 
"The Festival Speech Synthesizer System: Help

Quit: ctrl+d or

festival> (quit)

Example script

A classic application for this is ping. The following script pings a host continuously and says "Ping" on success and "Fail" on failure:

#!/bin/bash
# Ping the host given as $1 forever and speak each result.
while :; do
    if ping -c 1 "$1"; then msg="Ping"; else msg="Fail"; fi
    echo "$msg" | festival --tts
done

Note that this does not really work with multisyn voices, as they take a while to prepare before playing.

Troubleshooting

Can't open /dev/dsp

If festival returns the following error message:

Linux: can't open /dev/dsp

See #Usage with a Sound Server above.

ALSA playing at wrong speed

If the solution above gives you a squeaky voice, you might want to try changing your aplay options:

(Parameter.set 'Audio_Method 'Audio_Command)
(Parameter.set 'Audio_Command "aplay -Dplug:default -f S16_LE -r $SR $FILE")

aplay Command not found

Install alsa-utils.

See also