Abstract: In their recent article, Sweeny, Guzman-Martinez, Ortega, Grabowecky, and Suzuki (2012) demonstrate that heard speech sounds modulate the perceived shape of briefly presented visual stimuli. Ovals, whose aspect ratio (relating width to height) varied on a trial-by-trial basis, were rated as looking wider when a /woo/ sound was presented, and as looking taller when a /wee/ sound was presented instead. On the one hand, these findings add to a growing body of evidence demonstrating that audiovisual correspondences can have perceptual (as well as decisional) effects. On the other hand, they prompt a question concerning the origin of such correspondences. Although the currently popular view is that crossmodal correspondences are based on the internalization of the natural multisensory statistics of the environment (see Spence, 2011), these new results suggest instead that certain correspondences may actually be based on the sensorimotor responses associated with human vocalizations. As such, the findings of Sweeny et al. help to breathe new life into Sapir's (1929) once-popular “embodied” explanation of sound symbolism. Furthermore, they pose a challenge for those psychologists wanting to determine which among a number of plausible accounts best explains the available data on crossmodal correspondences.