

Numerous species possess cortical regions that are most sensitive to vocalizations produced by their own kind (conspecifics). In humans, the superior temporal sulci (STSs) putatively represent homologous voice-sensitive areas of cortex. However, STS regions have recently been reported to represent auditory experience or “expertise” in general rather than showing exclusive sensitivity to human vocalizations per se. Using functional magnetic resonance imaging and a unique non-stereotypical category of complex human non-verbal vocalizations (human-mimicked versions of animal vocalizations), we found a cortical hierarchy in humans optimized for processing meaningful conspecific utterances. This left-lateralized hierarchy originated near primary auditory cortices and progressed into traditional speech-sensitive areas. Our results suggest that the cortical regions supporting vocalization perception are initially organized by sensitivity to the human vocal tract in stages before the STS. Additionally, these findings have implications for the developmental time course of conspecific vocalization processing in humans as well as its evolutionary origins.

In early childhood, numerous communication disorders develop or manifest as inadequate processing of vocalization sounds in the CNS (Abrams et al., 2009). Cortical regions that are most sensitive to vocalizations produced by conspecifics have been identified in several animals, including some bird species, marmosets, cats, macaques, chimpanzees, and humans (Belin et al., 2000; Tian et al., 2001; Wang and Kadia, 2001; Hauber et al., 2007; Petkov et al., 2008; Taglialatela et al., 2009). Voice-sensitive regions in humans have been traditionally identified bilaterally within the superior temporal sulci (STSs) (Belin et al., 2000, 2002; Lewis et al., 2009). However, by showing preferential STS activity to artificial non-vocal sounds after perceptual training, recent studies consider these regions to be “higher-order” auditory cortices that function as substrates for more general auditory experience, contrary to the view that these areas behave in a domain-specific manner solely for vocalization processing (Leech et al., 2009; Liebenthal et al., 2010). Thus, we questioned whether preferential cortical sensitivity to intrinsic human vocal tract sounds, those uniquely produced by human source-and-filter articulatory structures (Fitch et al., 2002), could be revealed in earlier “low-level” acoustic signal processing stages closer to frequency-sensitive primary auditory cortices (PACs).
