Speech synthesis research at Ramon Llull University has focused on improving quality when the input text does not belong to a limited domain. To this end, a new English speech database was used: the one used in the international Blizzard Challenge. It consists of 10 hours of speech recorded for speech synthesis purposes. The voice is being used together with the voice conversion tools in a variety of experimental productions: MyTinyPlanets, i-VJ and Alan01. The new voice outperforms the previously created voices in quality.
The new version of the TTS system incorporates some improvements related to the research topics we are currently working on:
- Analysis and synthesis of Voice Quality parameters
- Prosody estimation of expressive speech
- Unit Selection Algorithms.
Click here for an online demo!
In addition, we developed a web interface for mobile phones such as the iPhone, so the TTS is accessible from anywhere with just your phone.
Mobile version of the TTS web service
In order to continuously improve our system, we have developed an evaluation platform that can be used to evaluate any multimedia system. It is based on perceptual evaluations of selected stimuli.
TRUE (Testing platfoRm for mUltimedia Evaluation) is an online platform developed to create and perform subjective tests for evaluating stimuli of different kinds, such as audio, video, graphics and text (Figure 16). Thanks to the high flexibility the platform offers researchers, different kinds of tests can be carried out, such as emotion identification or quality assessment of synthesis systems, among others. The results can be used for different purposes depending on the research goals, e.g. to validate the emotional content of the multimedia data in a corpus.
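As an illustration of how the responses from an emotion identification test might be summarized, the following minimal Python sketch computes the per-emotion identification rate from listener answers. The function name, the data layout and the sample responses are assumptions made for illustration; they do not describe how TRUE stores or processes its results.

```python
from collections import Counter


def identification_rates(responses):
    """Return the fraction of correct identifications per intended emotion.

    `responses` is an iterable of (intended_emotion, chosen_emotion) pairs,
    one pair per listener judgement (hypothetical layout, not TRUE's format).
    """
    totals = Counter()
    correct = Counter()
    for intended, chosen in responses:
        totals[intended] += 1
        if intended == chosen:
            correct[intended] += 1
    return {emotion: correct[emotion] / totals[emotion] for emotion in totals}


# Made-up sample judgements, for illustration only.
responses = [
    ("happy", "happy"),
    ("happy", "neutral"),
    ("sad", "sad"),
    ("sad", "sad"),
    ("angry", "happy"),
]

rates = identification_rates(responses)
for emotion, rate in rates.items():
    print(f"{emotion}: {rate:.2f}")
```

A low rate for a given emotion would suggest that the corresponding stimuli do not convey that emotion clearly to listeners, which is the kind of evidence one might use when validating the emotional content of a corpus.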
At the last international conference of the International Speech Communication Association, we presented work on the automatic recognition of affective information. We participated in the “Emotion Challenge”, which consisted of detecting the emotions expressed by a group of children while they were playing with the Aibo robot (Sony). We processed the children’s speech recorded while they were interacting with the robot and expressing a set of different emotions. The greatest challenge was that the speech was spontaneous and recorded in a real environment. The results were promising: we placed first and second in the two modalities of the challenge.
TRUE allows evaluation by radio buttons and by means of a variety of plugins