On the Representation of Speaker Information in Human Voices : An Adaptation Approach

Zäske, Romi

Apart from being carriers of speech, human voices contain a wealth of social signals, for instance about a speaker’s gender, identity, or age, to name but a few. The present thesis is concerned with the way adaptation modifies the perception of gender and identity information from voices. Adaptation is a mechanism by which neural responses decrease after continuous or repetitive stimulation. It is revealed by transient perceptual aftereffects indicating contrastive coding of simple and complex stimulus properties. The three studies reported here investigate unimodal and crossmodal auditory voice aftereffects of adaptation to unfamiliar and personally familiar speakers. Specifically, study I (Exp. 1) shows that adaptation to unfamiliar voices of female or male speakers biases the perception of voice gender away from the adapting gender: test voices, created by auditory morphing between male and female voices, are perceived as more male following adaptation to female voices and vice versa. The voice gender aftereffect (VGAE) persisted for at least a few minutes and suggests the existence of voice detectors tuned to female and male voice quality. The absence of voice aftereffects following adaptation to names (Exp. 2), faces (Exp. 3), or sinusoidal tones matched to the F0 of adaptor voices (Exp. 4) further suggests that the VGAE is due to habituation of high-level auditory representations. Study II replicates the behavioural findings of study I (Exp. 1) and further supports the notion of processing selectivity for female and male voices by providing electrophysiological evidence. Systematic adaptation-induced amplitude reductions in AEPs (N1, P2, and P3) were observed in response to otherwise identical test voices when test voices and adaptors had the same gender as opposed to different genders. This suggests that contrastive coding of voice gender is implemented by auditory cortex neurons and takes place within the first few hundred milliseconds from voice onset.
Similar to the VGAE, auditory aftereffects of adaptation to voices or faces of personally familiar speakers caused contrastive aftereffects in listeners’ perception of voice identity (study III). Unimodal voice-to-voice aftereffects (Exp. 1) were more pronounced and more persistent than crossmodal face-to-voice aftereffects (Exp. 2), pointing to at least two perceptual mechanisms of voice identity adaptation: one related to auditory coding of voice characteristics and one related to multimodal coding of speaker identity. These results complement findings in face perception (e.g., Leopold et al., 2001; Webster et al., 2004) and suggest that adaptation is a ubiquitous mechanism that routinely influences the perception of non-linguistic social information from both faces and voices.


Citation style:

Zäske, Romi: On the Representation of Speaker Information in Human Voices. An Adaptation Approach. 2011.
