5.2 Future
Festival is still very much in development. Hopefully this state will
continue for a long time. It is never possible to complete software;
there are always new things that can make it better. However, as time
goes on, Festival’s core architecture will stabilise and few or no
changes will be made to it. Other aspects of the system will then gain
greater attention, such as waveform synthesis modules, intonation
techniques, text-type-dependent analysers, etc.
Festival will improve, so don’t expect it to be the same six months
from now.
A number of new modules and enhancements are already under consideration
at various stages of implementation. The following is a non-exhaustive
list of what we may (or may not) add to Festival over the
next six months or so.
- Selection-based synthesis:
Moving away from diphone technology to more generalized selection
of units from a speech database.
- New structure for linguistic content of utterances:
Using techniques from Metrical Phonology, we are building more
structured representations of utterances that better reflect their
linguistic significance. This will allow improvements in prosody and
unit selection.
- Non-prosodic prosodic control:
For language generation systems and custom tasks where the speech
to be synthesized is being generated by some program, more information
about text structure will probably exist, such as phrasing, contrast,
key items, etc. We are investigating the relationship of high-level
tags to prosodic information through the Sole project
(http://www.cstr.ed.ac.uk/projects/sole.html).
- Dialect independent lexicons:
Currently each new dialect requires a new lexicon. We are
investigating a dialect-independent form of lexical specification
that allows a core form to be mapped to different dialects, as in the
sketch below. This will make the generation of voices in different
dialects much easier.
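As a purely illustrative sketch of the idea (none of the names below
are part of any existing Festival API), a dialect-independent entry
could store an abstract postvocalic /R/ segment which a small
per-dialect rule then realises or deletes:

   ;; Hypothetical core entry using an abstract postvocalic /R/ segment
   (set! core-entry '("car" n ((( k aa R ) 1))))

   ;; Hypothetical realisation rule: rhotic dialects keep /R/ as [r],
   ;; non-rhotic dialects simply drop it.
   (define (realise-segment dialect seg)
     (cond
      ((not (eq? seg 'R)) (list seg))    ;; ordinary phones pass through
      ((eq? dialect 'GenAm) (list 'r))   ;; rhotic: /R/ -> [r]
      (t nil)))                          ;; non-rhotic (e.g. RP): /R/ deleted

The point is only that a single core pronunciation plus small
per-dialect mapping rules could replace whole duplicated lexicons.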