
The Department of Speech & Hearing Science
College of Applied Health Sciences

Ongoing Projects

Speech Perception Projects

The long-term goal of this translational project is to develop a set of multimodal tests of sentence recognition for adults and children. Speech is a multimodal signal: it has an auditory component (the acoustic waveform) and a visual component (the visible articulatory gestures produced during speech, such as lip rounding). Under natural conditions, listeners use both auditory and visual speech cues to extract meaning from speech signals.

The acoustic signal contains many sources of variability introduced by different talkers, dialects, speaking rates, and background noise. Adding visual cues to the acoustic signal yields substantial gains in spoken word recognition (SWR), especially in adverse listening conditions [1][2][3][4]. Visual cues are particularly important because they help specify place of articulation, a speech feature that is acoustically fragile and often inaccessible to individuals with significant hearing loss [1][5].
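As an illustration only (not a component of the proposed tests), the size of this audiovisual benefit is often summarized with a normalized visual-enhancement score, which expresses the auditory-visual gain relative to the room for improvement left by the auditory-only score. The short Python sketch below uses hypothetical percent-correct scores to show the computation.

```python
# Illustrative sketch with hypothetical scores: normalized visual enhancement (VE)
# expresses the audiovisual (AV) gain relative to the improvement still possible
# after the auditory-only (A) condition.

def visual_enhancement(a_only: float, av: float) -> float:
    """VE = (AV - A) / (100 - A), with scores in percent correct."""
    if a_only >= 100.0:
        return 0.0  # no room left for improvement
    return (av - a_only) / (100.0 - a_only)

# Hypothetical listener: 40% correct auditory-only, 70% correct audiovisually
# in the same noise condition.
print(visual_enhancement(40.0, 70.0))  # 0.5, i.e., half of the possible gain realized
```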

In most clinical settings, however, spoken word recognition performance is routinely assessed using auditory-only presentation of monosyllabic word lists produced by a single talker using carefully articulated speech [6]. Such measures currently serve as the gold standard for determining candidacy for, and/or benefit from, sensory aids such as cochlear implants.

There are a number of shortcomings with current spoken word recognition tests. Firstly, auditory-only spoken word recognition tests may not adequately characterize the performance of listeners with hearing loss. For example, although some adults and children with sensory aids demonstrate substantial auditory-only word recognition, others obtain high levels of speech understanding only when auditory and visual cues are available [7][8][9][10]. Furthermore, the ability to combine and integrate auditory and visual speech information has been found to be an important predictor of speech perception benefit with a sensory aid [9][10][11][12] and thus has important implications for understanding the underlying representation and processing of speech in listeners who use these devices.

Secondly, tests that utilize speech materials from a single talker, in which variability is highly constrained, may not accurately reflect spoken word recognition abilities under more natural listening situations. Increasing stimulus variability by introducing multiple talkers or varying speaking rates reduces spoken word recognition performance in listeners with normal hearing [13][14][15][16] and in listeners with hearing loss [7][17][18].

Thirdly, although most current clinical tests yield descriptive information about spoken word recognition, they reveal little about the underlying perceptual and cognitive processes employed by listeners with hearing loss. Our previous research suggests that the use of lexically-controlled stimulus items provides important new information about the way in which children and adults with hearing loss encode, store, organize, and access spoken words from their mental lexicon [17][19][20].
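To make the idea of lexical control concrete (a toy sketch under our own assumptions, not the project's actual stimulus-selection procedure), stimulus words in this literature are often characterized by word frequency and phonological neighborhood density, i.e., the number of words that differ from a target by a single phoneme. The Python example below computes neighborhood density over a small, hypothetical, phonemically transcribed lexicon.

```python
# Toy illustration with a hypothetical mini-lexicon: phonological neighborhood
# density counts the words differing from a target by one phoneme substitution,
# deletion, or addition. "Lexically easy" items are typically high-frequency words
# from sparse neighborhoods; "lexically hard" items are low-frequency words from
# dense neighborhoods.

def one_phoneme_apart(w1: tuple, w2: tuple) -> bool:
    """True if w1 and w2 differ by a single phoneme substitution, deletion, or addition."""
    if w1 == w2:
        return False
    if len(w1) == len(w2):                          # substitution
        return sum(a != b for a, b in zip(w1, w2)) == 1
    if abs(len(w1) - len(w2)) == 1:                 # deletion / addition
        longer, shorter = (w1, w2) if len(w1) > len(w2) else (w2, w1)
        return any(longer[:i] + longer[i + 1:] == shorter for i in range(len(longer)))
    return False

def neighborhood_density(target: tuple, lexicon: list) -> int:
    """Number of lexicon entries that are phonological neighbors of the target."""
    return sum(one_phoneme_apart(target, word) for word in lexicon)

# Hypothetical phonemic transcriptions (each word is a tuple of phonemes).
lexicon = [("k", "ae", "t"), ("b", "ae", "t"), ("k", "ae", "p"),
           ("k", "ae", "t", "s"), ("d", "aa", "g")]
print(neighborhood_density(("k", "ae", "t"), lexicon))  # 3 neighbors of "cat"
```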

Finally, and perhaps most importantly for the present project, normative data are lacking concerning multimodal SWR performance as a function of degree of hearing loss. Such standardization data are needed to guide decisions about sensory aid use and intervention strategies. Building upon a body of basic and clinical research concerning spoken word recognition by listeners with normal hearing or hearing loss, we propose to develop two new theoretically motivated, multimodal, multi-talker sentence tests of spoken word recognition for adults and children, and to collect normative data from listeners with normal hearing and those with hearing loss. We believe these new norm-referenced tests will yield better assessment paradigms for selecting sensory aids and predicting patient response.

Literature Cited

  1. Erber NP: Auditory, visual, and auditory-visual recognition of consonants by children with normal and impaired hearing. Journal of Speech and Hearing Disorders 1972;15:413-422.
  2. Massaro DW, Cohen MM: Perceiving talking faces. Current Directions in Psychological Science 1995;4:104-109.
  3. Sumby WH, Pollack I: Visual contribution to speech intelligibility in noise. Journal of the Acoustical Society of America 1954;26:212-215.
  4. Summerfield Q: Some preliminaries to a comprehensive account of audio-visual speech perception. in Dodd B, Campbell R (eds): Hearing by Eye: The Psychology of Lipreading. Hillsdale, NJ: Lawrence Erlbaum Associates, 1987, 3-51.
  5. Walden BE, Grant KW, Cord MT: Effects of amplification and speech reading on consonant recognition by persons with impaired hearing. Ear and Hearing 2001;22:333-341.
  6. Martin FN, Champlin CA, Chambers JA: Seventh survey of audiometric practices in the United States. Journal of the American Academy of Audiology 1998;9:95-104.
  7. Kaiser AR, Kirk KI, Lachs L, Pisoni DB: Talker and lexical effects on audiovisual word recognition by adults with cochlear implants. Journal of Speech, Language, and Hearing Research 2003;46:390-404.
  8. Hay-McCutcheon M, Pisoni DB, Kirk KI: Audiovisual speech perception in elderly cochlear implant recipients. Laryngoscope 2005;115:1887-1894.
  9. Bergeson TR, Pisoni DB, Davis RAO: A longitudinal study of audiovisual speech perception by children with hearing loss who have cochlear implants. The Volta Review 2003;103:347-370.
  10. Bergeson TR, Pisoni DB, Davis RAO: Development of audiovisual comprehension skills in prelingually deaf children with cochlear implants. Ear and Hearing 2005;26:149-164.
  11. Lachs L, Pisoni DB, Kirk KI: Use of audiovisual information in speech perception by prelingually deaf children with cochlear implants: A first report. Ear and Hearing 2001;22:236-251.
  12. Bergeson TR, Pisoni DB: Audiovisual speech perception in deaf adults and children following cochlear implantation. in Calvert GA, Spence C, Stein BE (eds): The Handbook of Multisensory Perception. Cambridge, MA: MIT Press, 2004, 749-771.
  13. Mullennix JW, Pisoni DB, Martin CS: Some effects of talker variability on spoken word recognition. Journal of the Acoustical Society of America 1989;85:365-378.
  14. Mullennix JW, Pisoni DB: Stimulus variability and processing dependencies in speech perception. Perception & Psychophysics 1990;47:379-390.
  15. Nygaard LC, Sommers MS, Pisoni DB: Effects of speaking rate and talker variability on the representation of spoken words in memory. International Conference on Spoken Language Processing 1992.
  16. Sommers M, Nygaard L, Pisoni D: Stimulus variability and spoken word recognition I: Effects of variability in speaking rate and overall amplitude. The Journal of the Acoustical Society of America 1994;96:1314-1324.
  17. Kirk KI, Pisoni DB, Miyamoto RC: Effects of stimulus variability on speech perception in listeners with hearing impairment. Journal of Speech, Language, and Hearing Research 1997;40:1395-1405.
  18. Kirk KI: Cochlear implants: New developments and results. Current Opinion in Otolaryngology & Head and Neck Surgery 2000;8:415-420.
  19. Kirk KI, Pisoni DB, Osberger MJ: Lexical effects on spoken word recognition by pediatric cochlear implant users. Ear and Hearing 1995;16:470-481.
  20. Kirk KI: Lexical discrimination and perceptual normalization by listeners with cochlear implants. Conference on Implantable Auditory Prostheses 1997:14.

Speech Training for Adults with Normal Hearing

The speech signal carries two types of information: linguistic information (the message content) and indexical information (acoustic cues about the talker). In the traditional view of speech perception, the acoustic differences among talkers were considered "noise." According to this view, the listener's task was to strip away unwanted variability to uncover the idealized phonetic representation of the spoken message.

A more recent view suggests that both talker and linguistic information are stored in memory. Rather than being unwanted "noise," talker information aids in speech recognition, especially under difficult listening conditions. For example, it has been shown that listeners with normal hearing who completed voice-recognition training were subsequently better at recognizing speech from familiar versus unfamiliar voices [1][2][3].

For individuals with hearing loss, access to both types of information may be compromised. Some studies have shown that cochlear implant (CI) recipients are relatively poor at using indexical speech information because low-frequency speech cues are poorly conveyed by standard CIs [4]. However, some CI users with preserved residual hearing can now combine acoustic amplification of low-frequency information (via a hearing aid) with electrical stimulation in the high frequencies (via the CI) [5][6]. When a listener uses a CI in one ear and a hearing aid in the opposite ear, this arrangement is referred to as bimodal hearing. A second way of combining electrical and acoustic stimulation is through a new CI system, the hybrid CI. This device combines electrical stimulation with acoustic hearing in the same ear by employing a shortened electrode array that is intended to preserve residual low-frequency hearing in the apical portion of the cochlea. It may be that hybrid CI users can learn to use voice information to enhance speech understanding.

This study will assess voice learning and its relationship to talker discrimination, music perception, and spoken word recognition in simulations of hybrid CI or bimodal hearing.
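As background on how such simulations are commonly constructed (a minimal sketch under our own assumptions; the filter cutoffs, channel count, and processing below are illustrative and are not the study's actual parameters), residual acoustic hearing is often approximated by low-pass filtering the speech, while the electric portion is approximated by a noise-band vocoder applied to the higher-frequency bands. The Python sketch below, using NumPy and SciPy, combines the two in a hybrid-style, same-ear configuration.

```python
# Minimal sketch (illustrative parameters only): approximate combined
# electric + acoustic hearing by summing a low-pass-filtered "acoustic" band
# with a noise-band-vocoded representation of the higher-frequency bands.
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

def bandpass(x, lo, hi, fs, order=4):
    b, a = butter(order, [lo / (fs / 2), hi / (fs / 2)], btype="band")
    return filtfilt(b, a, x)

def lowpass(x, cutoff, fs, order=4):
    b, a = butter(order, cutoff / (fs / 2), btype="low")
    return filtfilt(b, a, x)

def hybrid_simulation(speech, fs, acoustic_cutoff=500.0, vocoder_lo=500.0,
                      vocoder_hi=7000.0, n_channels=6):
    """Low-pass 'acoustic' portion plus an n-channel noise vocoder for the 'electric' portion."""
    # Residual acoustic hearing: keep only the low-frequency band.
    acoustic = lowpass(speech, acoustic_cutoff, fs)

    # Electric portion: noise-band vocoder over logarithmically spaced channels.
    edges = np.geomspace(vocoder_lo, vocoder_hi, n_channels + 1)
    rng = np.random.default_rng(0)
    electric = np.zeros(len(speech))
    for lo, hi in zip(edges[:-1], edges[1:]):
        band = bandpass(speech, lo, hi, fs)
        envelope = lowpass(np.abs(hilbert(band)), 50.0, fs)   # slow amplitude envelope
        carrier = bandpass(rng.standard_normal(len(speech)), lo, hi, fs)
        electric += envelope * carrier
    return acoustic + electric

# Hypothetical usage: processed = hybrid_simulation(waveform, fs=16000)
```

In a bimodal (opposite-ear) simulation, the low-pass acoustic signal and the vocoded signal would instead be routed to separate ears rather than summed into a single channel.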

Literature Cited

  1. Nygaard, L. C., Sommers, M. S., & Pisoni, D. B. (1994). Speech perception as a talker-contingent process. Psychological Science, 5(1), 42-46.
  2. Nygaard, L. C., & Pisoni, D. B. (1998). Talker-specific learning in speech perception. Perception & Psychophysics, 60(3), 355-376.
  3. Krull, V., Luo, X., & Kirk, K. I. (2012). Talker-identification training using simulations of binaurally combined electric and acoustic hearing: Generalization to speech and emotion recognition. The Journal of the Acoustical Society of America, 131(4), 3069-3078.
  4. Dorman, M. F., Gifford, R., Spahr, A. J., & McKarns, S. (2008). The benefits of combining acoustic and electric stimulation for the recognition of speech, voice and melodies. Audiology & Neuro-Otology, 13, 105-112. http://dx.doi.org/10.1159/000111782
  5. Turner, C. W., Gantz, B. J., Karsten, S., Fowler, J., & Reiss, L. A. (2010). The impact of hair cell preservation in cochlear implantation: Combined electric and acoustic hearing. Otology & Neurotology, 31(8), 1227.
  6. Holt, R. F., Kirk, K. I., Eisenberg, L. S., Martinez, A. S., & Campbell, W. (2005). Spoken word recognition development in children with residual hearing using cochlear implants and hearing aids in opposite ears. Ear and Hearing, 26(4), 82S-91S.