Have you ever listened to a song, trying to understand the words, only to find yourself confused, or even shocked, by what the lyrics seem to say? Anecdotally, the most well-known misheard lyric is probably the one from Jimi Hendrix’s “Purple Haze”, where “’scuse me while I kiss the sky” is often mistaken for “’scuse me while I kiss this guy”[1]. This misheard lyric is so famous that a website called KissThisGuy.com is dedicated to archiving popular misheard lyrics[2].
There are many reasons why lyrics get misheard. Perhaps the singer didn’t pronounce the words clearly enough. Perhaps the misheard words are homophones of the original ones. Or perhaps some musical components of the song, like rhythm, intonation, and stress patterns, cause confusion because they are inconsistent with what we usually hear when those words are spoken. If the last case is true, it implies that we rely, to some extent, on the “musical” elements of sung or spoken sounds to figure out where to draw the boundaries between words.
Indeed, these “musical” elements, scientifically called “prosody”, are among the essential cues we use to segment spoken streams into words[3]. For example, most English content words start with a stressed syllable, like “HONey”, so we tend to treat strongly stressed syllables as the beginnings of words. These prosodic cues are especially important for babies when they are learning a new language[4,5]. Think about it: when a baby doesn’t know many words yet, the words in a sentence probably sound as if they merge together into one big blob. So what might make a sentence like “kitty drinks milk” sound like three separate things, instead of one big chunk of “kittydrinksmilk”? It has a lot to do with the prominent prosodic cues in the sentence: in “KItty DRINKS MILK”, the capitalized syllables usually sound louder, longer, and higher in pitch.
As we can see, parsing continuous spoken streams into individual words is quite an important ability for making sense of the world. Without it, we can’t possibly figure out who is doing what to whom in a sentence. Research has shown that babies who are better at extracting words from continuous speech also learn language better in general, for instance by knowing more words[6]. If word segmentation is such an important ability, what happens to kids who are poor at it? Chances are, kids who get lost in speech streams feel something like what we experience when we hear confusing song lyrics. For example, researchers have found that children with Williams Syndrome (a genetic disorder) are substantially delayed in building up their vocabulary, and this has to do with their difficulty in using stress patterns to parse speech[7]. Moreover, deficits in spoken word segmentation may further result in serious issues like reading difficulties[8,9], which makes these kids’ lives more challenging in school and beyond.
So, what can we do about it? Since prosody, the “music” in speech, plays an important role in word segmentation, it seems intuitive to suspect that improving music-related abilities could help. This idea is supported by research as well: kids who had received music training did better at segmenting words of a new language than kids who hadn’t[10]. Research like this could give us hints about how to design interventions for children who struggle with language. Perhaps immersing and engaging babies in music at a young age can boost their long-term language outcomes and, in turn, their quality of life. That said, more research is needed to understand the relationships among prosody, speech, and music. And we, as researchers, need our community’s participation to help move the research forward.
(This article was written as an informational outreach targeted at the general public.)
1. 25 Ridiculous Misheard Lyrics. at <http://www.clashmusic.com/feature/25-ridiculous-misheard-lyrics>
2. Did Jimi Hendrix really say, “’Scuse me, while I kiss this guy”? at <http://www.kissthisguy.com/jimi.php>
3. Mattys, S. L., White, L. & Melhorn, J. F. Integration of Multiple Speech Segmentation Cues: A Hierarchical Framework. J. Exp. Psychol. Gen. 134, 477–500 (2005).
4. Johnson, E. K. & Jusczyk, P. W. Word Segmentation by 8-Month-Olds: When Speech Cues Count More Than Statistics. J. Mem. Lang. 44, 548–567 (2001).
5. Thiessen, E. D. & Saffran, J. R. When cues collide: use of stress and statistical cues to word boundaries by 7- to 9-month-old infants. Dev. Psychol. 39, 706–716 (2003).
6. Newman, R., Ratner, N. B., Jusczyk, A. M., Jusczyk, P. W. & Dow, K. A. Infants’ early ability to segment the conversational speech signal predicts later language development: a retrospective analysis. Dev. Psychol. 42, 643–655 (2006).
7. Nazzi, T., Paterson, S. & Karmiloff-Smith, A. Early Word Segmentation by Infants and Toddlers With Williams Syndrome. Infancy 4, 251–271 (2003).
8. Whalley, K. & Hansen, J. The role of prosodic sensitivity in children’s reading development. J. Res. Read. 29, 288–303 (2006).
9. Goswami, U., Gerson, D. & Astruc, L. Amplitude envelope perception, phonology and prosodic sensitivity in children with developmental dyslexia. Read. Writ. 23, 995–1019 (2010).
10. François, C., Chobert, J., Besson, M. & Schön, D. Music training for the development of speech segmentation. Cereb. Cortex 23, 2038–2043 (2013).