A corpus of noise-induced word misperceptions for English


Abstract  Words spoken against a noise background often form an ambiguous percept. In certain conditions, however, a listener will mishear a noisy word and report the same incorrect word as other listeners. These consistent hearing errors are valuable as tests of detailed models of speech perception. This paper describes the collection of a corpus of consistent speech misperceptions for English. The mishearings were elicited in a large-scale listening study involving 212 participants and over 300,000 token presentations. The study identified 3207 consistent misperceptions. For each of these, the corpus records the speech and masker waveforms that generated the error, the set of responses made by the listeners, and phonemic transcriptions of the target word and the response. The corpus is freely available online.

