A number of research positions are available from 1st September 2010 at the University of the Basque Country (Spain) to work on the LISTA (The Listening Talker) project, funded by the European Union Framework 7 FET-Open Programme until 30th April 2013.


About the LISTA project

Speech is efficient and robust, and remains the method of choice for human communication. Consequently, speech output is used increasingly to deliver information in automated systems such as talking GPS and live-but-remote forms such as public address systems. However, these systems are essentially one-way, output-oriented technologies that lack an essential ingredient of human interaction: communication. When people speak, they also listen. When machines speak, they do not listen. As a result, there is no guarantee that the intended message is intelligible, appropriate or well-timed. The current generation of speech output technology is deaf, incapable of adapting to the listener’s context, inefficient in use and lacking the naturalness that comes from rapid appreciation of the speaker-listener environment. Crucially, when speech output is employed in safety-critical environments such as vehicles and factories, inappropriate interventions can increase the chance of accidents through divided attention, while similar problems can result from the fatiguing effect of unnatural speech. In less critical environments, crude solutions involve setting the gain level of the output signal to a level that is unpleasant, repetitive and at times distorted. All of these applications of speech output will, in the future, be subject to more sophisticated treatments based on understanding how humans communicate.

The LISTA project will investigate how listeners modify their production patterns in realistic environments that are characterised by noise and natural, rapid interactions. By listening while talking, speakers can reduce the impact of noise and reverberation at the ears of their interlocutor. And by talking while listening, speakers can indicate understanding, agreement and a range of other signals that make natural dialogues fluid, rather than the sequence of monologues that characterises current human-computer interaction. Both noise and natural interactions demand rapid adjustments, including shifts in spectral balance, pauses, expansion of the vowel space, and changes in speech rate.

While our knowledge of the way speakers adapt to their context is currently far from complete, some data exists, as well as speculation about perceptual consequences. Speakers respond to the presence of noise by reallocating information-bearing elements in the spectrum and across time, in such a way as to increase intelligibility for listeners. When asked to speak clearly, talkers respond with articulatory gestures that result in greater cross-class separation and smaller within-class variation. Faced with competing talkers, speakers make on-the-spot prosodic alterations to reduce foreground-background overlap. Listeners provide speakers with immediate feedback using “back-channel” utterances (e.g., “yeah”, “sure”), as well as facial gestures, which speakers monitor and exploit by speeding up, slowing down or changing style. In short, talkers have at their disposal a host of strategies deployed at a range of temporal scales to promote successful speech communication. The goal of LISTA is to obtain a deeper understanding of human context-sensitivity in speech production and to translate some of these findings to synthetic, live and recorded speech.

This multidisciplinary project lasts for 3 years and is coordinated by the Language and Speech Laboratory at the University of the Basque Country, Spain. Other partners are the Centre for Speech Technology Research at the University of Edinburgh (UK), the Sound and Image Processing Laboratory at KTH (Stockholm, Sweden) and the Networks and Telecommunications Laboratory at the Institute for Computer Science (Crete, Greece).

Positions. Two post-doctoral research positions are available to start on 1st September 2010. A PhD in speech communication in humans or machines is essential. Experience in the design, execution and analysis of speech perception experiments is desirable. Salary is in the range €30-35k.

Application procedure. Please send your CV to the project coordinator, Prof. Martin Cooke, by Friday 16th July 2010.