I believe that music sounds like people, moving. Yes, the idea may sound a bit crazy, but it’s an old idea, much discussed in the 20th century, and going all the way back to the Greeks. There are lots of things going for the theory, including that it helps us explain (1) why our brains are so good at absorbing music (…because we evolved to possess human-movement-detecting auditory mechanisms), (2) why music emotionally moves us (…because human movement is often expressive of the mover’s mood or state), and (3) why music gets us moving (…because we’re a social species prone to social contagion). And as I describe in detail in my upcoming book – "Harnessed: How Language and Music Mimicked Nature and Transformed Ape To Man" – music has the signature auditory patterns of human movement (something I hint at here http://www.science20.com/mark_changizi/music_sounds_moving_people ).

Here I’d like to describe a novel way of thinking about what the meaning of music might be. Rather than dwelling on the sound of music, I’d like to focus on the look of music. In particular, what does our brain think music looks like?

It is natural to assume that the visual information streaming into our eyes determines the visual perceptions we end up with, and that the auditory information entering our ears determines the events we hear. But the brain is more complicated than this. Visual and auditory information interact in the brain, and the brain uses both to guess the single scene it should render as a perception. For example, research by Ladan Shams, Yukiyasu Kamitani and Shinsuke Shimojo at Caltech has shown that we perceive a single flash as a double flash if it is paired with a double beep. And Robert Sekuler and others from Brandeis University have shown that if a sound occurs at the time when two balls pass through each other on screen, the balls are instead perceived to have collided and reversed direction. These and other results of this kind demonstrate the interconnectedness of visual and auditory information in our brain. Visual ambiguity can be reduced with auditory information, and vice versa. And, generally, both are brought to bear in the brain’s attempt to infer the best guess about what’s out there.

Your brain does not, then, consist of independent visual and auditory systems, with separate troves of visual and auditory “knowledge” about the world. Instead, vision and audition talk to one another, and there are regions of cortex responsible for making vision and audition fit one another. These regions know about the sounds of looks and the looks of sounds. Because of this, when your brain hears something but cannot see it, your brain does not just sit by and refrain from guessing what it might have looked like. When your auditory system makes sense of something, it will have a tendency to activate visual areas, eliciting imagery of its best guess as to the appearance of the stuff making the sound. For example, the sound of your neighbor’s rustling tree may bring to mind an image of its swaying lanky branches. The whine of your cat heard far away may evoke an image of it stuck up high in that tree. And the pumping of your neighbor’s kid’s BB gun can bring forth an image of the gun being pointed at Foofy way up there.

Your visual system has, then, strong opinions about the proper look of the things it hears. And, bringing ourselves back to music, we can use the visual system’s strong opinions as a means for gauging music’s meaning. In particular, we can ask your visual system what it thinks the appropriate visual is for music. If, for example, the visual system responds to music with images of beating hearts, then it would suggest, to my disbelief, that music mimics the sounds of heartbeats. If, instead, the visual system responds with images of pornography, then it would suggest that music sounds like sex. You get the idea.

But in order to get the visual system to act like an oracle, we need to get it to speak. How are we to know what the visual system thinks music looks like? One approach is simply to ask which visuals are, in fact, associated with music. For example, when people create imagery of musical notes, what does it look like? One cheap way to look into this is to do a Google (or any search engine) image search on the term “musical notes.” You might think such a search would merely return images of simple notes on the page. However, that is not what one finds. To my surprise, most of the images are like the one in the nearby figure, with notes drawn in such a way that they appear to be moving through space. Notes in musical notation never actually look anything like this, and real musical notes have no look at all (because they are sounds). And yet we humans seem to be prone to visually depicting notes as moving all about.



Could these images of notes in motion be due to a more mundane association? Music is played by people, and people have to move in order to play their instrument. Could this be the source of the movement-music association? I don’t think so, because the movement suggested in these images of notes doesn’t look like an instrument being played. In fact, it is common to show images of an instrument with the notes beginning their movement through space from the instrument: these notes are on their way somewhere, not an indication of the musician’s key-pressing or back-and-forth movements.

Could it be that the musical notes are depicted as moving through space because sound waves move through space? The difficulty with this hypothesis is that all sound moves through space. All sound would, if this were the case, be visually rendered as moving through space, but that’s not the case. For example, speech is not usually visually rendered as moving through space. Another difficulty is that the musical notes are usually meandering in these images, but sound waves are not meandering – sound waves go straight. A third problem with sound waves underlying the visual metaphor is that we never see sound waves in the first place.

Another possible counter-hypothesis is that the depiction of visual movement in the images of musical notes is because all auditory stimuli are caused by underlying events with movement of some kind. The first difficulty, as was the case for sound waves, is that it is not the case that all sound is visually rendered in motion. The second difficulty is that, while it is true that sounds typically require movement of some kind, it need not be movement of the entire object through space. Moving parts within the object may make the noise, without the object going anywhere. In fact, the three examples I gave earlier – leaves rustling, Foofy whining, and the BB gun pumping – are noises without any bulk movement of the object (the tree, Foofy, and the BB gun, respectively).  The musical notes in imagery, on the other hand, really do seem to be moving, in bulk, across space.

Music is like tree-rustling, Foofy, BB guns and human speech in that it is not made via bulk movement through space. And yet music appears to be unique in its tendency to be visually depicted as moving through space. In addition, musical notes are not only rendered as in motion; they tend to be depicted as meandering.

When visually rendered, music looks alive and in motion (often along the ground), just what one might expect if music’s secret is that it sounds like people moving.

A Google Image search on “musical notes” is one means by which one may attempt to discern what the visual system thinks music looks like, but another is to simply ask ourselves what is the most common visual display shown during music. That is, if people were to put videos to music, what would the videos tend to look like?

Lucky for us, people do put videos to music! They’re called music videos, of course. And what do they look like? The answer is so obvious that it hardly seems worth noting: music videos tend to show people moving about, usually in a time-locked fashion to the music, very often dancing.

As obvious as it is that music videos typically show people moving, we must remember to ask ourselves why music isn’t typically visually associated with something very different. Why aren’t music videos mostly of rivers, avalanches, car races, wind-blown grass, lion hunts, fire, or bouncing balls? It is because, I am suggesting, our brain thinks that humans moving about is what music should look like…because it thinks that humans moving about is what music sounds like.

Musical notes are rendered as meandering through space. Music videos are built largely from people moving, and in a time-locked manner to the music. That’s beginning to suggest that the visual system is under the impression that music sounds like human movement. But if that’s really what the visual system thinks, then it should have more opinions than simply that music sounds like movement. It should have opinions about what, more exactly, the movement should look like. Do our visual systems have opinions this precise? Are we picky about the mover that’s put to music?

You bet we are! That’s choreography. It’s not enough to play a video of the Nutcracker ballet during Beatles music, nor will it suffice to play a video of the Nutcracker to the music of the Nutcracker, but with a small time lag between them. The video of human movement has to have all the right moves at the right time to be the right fit for the music.


These strong opinions about what music looks like make perfect sense if music mimics human movement sounds. In real life, when people carry out complex behaviors, their visual movements are tightly choreographed with the sounds – because the sight and sound are due to the same event. When you hear movement, you expect to see that same movement. Music sounds to your brain like human movement, which is why when your brain hears music, it expects that any visual of it should be consistent with it.


This was adapted from Harnessed: How Language and Music Mimicked Nature and Transformed Ape to Man (BenBella Books, 2011).