An interesting question; I fear I have more questions than answers, but you have piqued my interest!
Research aside for a moment: as an avid music lover, a player of multiple instruments, and someone with an academic background in cognitive psychology, my intuition and experience lead me to think that syntactic processing and most perceptual processing of music do indeed proceed separately, in parallel. (A hunch: temporal aspects of the music play a role in its syntax.) At encoding, they represent different levels of processing. I also hypothesize that they draw upon different memory "systems" as well.
As Beatrice notes, musical structure is processed similarly to language. According to Brown et al. (2006), PET scans show similar patterns of brain activity during the production of sentences and of melodies, with some differences in lateralization. If the analogy to language holds, one would expect the timbre of the instrument to have little effect on the parsing of the melody, just as different voices have little impact on sentence processing; e.g., see Pisoni et al., 1987.
As you allude to, one could imagine using a divided attention paradigm to investigate your query experimentally (experience with music as a covariate would be nice). Think-aloud protocols and retrospective verbal reports from subjects might also prove insightful.
It also might be interesting to track people who are learning an instrument, or to expose people to an entirely new kind of music, and see whether there are differences in the processing of musical syntax and sound as people move from "novice" to "expert".
Brown, S., Martinez, M. J., & Parsons, L. M. (2006). Music and language side by side in the brain: A PET study of the generation of melodies and sentences. European Journal of Neuroscience, 23(10), 2791–2803.
Thank you both for your answers! I am familiar with the paper cited above, but hadn't thought to link it to answering this question - an interesting idea, perhaps, to see timbre as the 'prosody' or 'intonation' of music...
What a great answer, Oliver! You have given me lots to think about. I would agree that perhaps the two streams are processed separately but in parallel, and a divided attention task could definitely be a great way to probe that here.
You raise an interesting point that they may use different memory systems. Perhaps a syntactic working memory system for the structure, and a different memory system for the timbre - I wonder how that would fit in. I have heard of a tonal loop but that would presumably serve the pitch element (which of course involves structure)... I will definitely look into this further! Presumably there would be a very strong connection between the two in any case.
I wonder if there are any cases of individuals who are able to process the pitch/structure, but are unable to differentiate between different instruments or sources of information? This would be interesting too.
Allow me to inject some educated speculation, from several perspectives: some general cognitive system observations, personal experience as a choral singer, and from some biologically inspired cognitive engineering work.
First, from Stanislas Dehaene's work (particularly his book Reading in the Brain), I find compelling his argument that reading reuses processing pathways originally evolved for other purposes. Indeed, written language necessarily developed to use existing visual (and phonetic) processing capabilities to distinguish letters, symbols, and groups of them; a written language that did not effectively use these built-in mechanisms could never have succeeded and spread. No doubt there are multiple mechanisms favored by evolution that may be engaged in parallel or in combination across different modes of communication: symbolic/written, verbal language, and music.
Second, from my experience as a choral singer, it is quite clear that there are distinct "channels" to learning new music. Rhythm and cadence and phrasing are distinct from pitch and harmony and distinct from pronunciation and articulation and several other finer dimensions. Not totally independent, but still quite separable. Effective learning tends to separate these channels, learning each and then combining them. Concentrating on learning one channel (e.g., just repeating nonsense syllables to the rhythm with no pitch) emphasizes that aspect but does not seem to diminish the other aspects - it does not feel like any sort of attentional competition.
Third, my own research work is on composition of reusable neural-inspired components to build cognitive systems. I find it frequently convenient to process multiple aspects of sensory or other inputs in parallel leading eventually to fusion/recombination. One almost trivial example is a "name that tune" cognitive circuit with three distinct paths. One path takes typed characters as a melody name. Another path computes an abstract invariant sequence code signature for a melody. And a third path records actual timings, pitches and volumes. All three representations are effectively synonyms or aspects of the one melody concept, serving different but related purposes. The abstract invariant sequence code purposely loses information so that similar melodies perhaps with mistakes and in different keys/styles/tempos map to the same code for easy recognition. The high fidelity recording is for reproducing the melody as well as for more detailed comparisons e.g., as in learning to sing/play the melody or recognize a particular nuanced performance. And the name is for reporting and referencing. This general architecture does indeed suggest that there are distinct "memory systems" involved. There is ample evidence in biological brains of distinct areas dedicated to such different aspects for the different senses, including from selectively brain-damaged patients (Sorry, I don't have references, just from general reading.)
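To make the three-path idea concrete, here is a minimal sketch in Python. It is purely illustrative: a Parsons-style up/down/repeat contour stands in for the abstract invariant sequence code, and all names and data structures are my own assumptions, not the actual circuit described above.

```python
# Hypothetical sketch of the three-path "name that tune" circuit: a typed
# name, a lossy invariant contour code, and a high-fidelity recording, all
# as aspects of one melody concept. Names and fields are illustrative.
from dataclasses import dataclass, field

@dataclass
class Note:
    pitch: float    # fundamental frequency in Hz
    onset: float    # seconds from start
    volume: float   # 0.0 - 1.0

@dataclass
class MelodyRecord:
    name: str                                   # path 1: typed melody name
    notes: list = field(default_factory=list)   # path 3: high-fidelity recording

    def contour_code(self) -> str:
        """Path 2: abstract invariant signature (Parsons-style contour).
        Deliberately discards key, tempo, and exact intervals, so similar
        melodies in different keys/tempos map to the same code."""
        code = "*"  # conventional start symbol
        for prev, cur in zip(self.notes, self.notes[1:]):
            if cur.pitch > prev.pitch:
                code += "u"   # up
            elif cur.pitch < prev.pitch:
                code += "d"   # down
            else:
                code += "r"   # repeat
        return code

# Recognition compares only the lossy code, so a transposed or re-tempoed
# performance of the same tune still matches.
twinkle = MelodyRecord("Twinkle Twinkle", [
    Note(261.6, 0.0, 0.8), Note(261.6, 0.5, 0.8), Note(392.0, 1.0, 0.8),
    Note(392.0, 1.5, 0.8), Note(440.0, 2.0, 0.8), Note(440.0, 2.5, 0.8),
    Note(392.0, 3.0, 0.8),
])
print(twinkle.contour_code())  # *rururd
```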
I expect a similar "architecture" with more paths/dimensions is at work in music processing, as well as in verbal language and other sensory domains. The specific timbre of an instrument is a combination of acoustic energy at different harmonics, as represented by the frequency-tuned filtering of our cochlear hair cells. There is certainly some early auditory processing that allows us to hear the very different timbres of very different instruments as all playing the same pitch, irrespective of volume and the finer points of expression (e.g., attack, decay, sustain, release, etc.). I suspect there are multiple such channels, effectively synchronized by time, with recombinations of these aspects further along in our auditory systems.
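As a toy illustration of that separability (emphatically not a model of the cochlea), here is a short sketch showing that pitch and a rough "timbre profile" can be read off the same signal as independent features. The function name and the crude argmax pitch estimate are assumptions for demonstration only.

```python
# Illustrative sketch: pitch ~ fundamental frequency; timbre ~ relative
# energy at the harmonics. Two synthetic "instruments" share a pitch but
# differ in harmonic mix. NumPy only.
import numpy as np

def pitch_and_timbre(signal, sample_rate, n_harmonics=5):
    spectrum = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    f0 = freqs[np.argmax(spectrum)]  # crude estimate: assumes f0 dominates
    # Energy at integer multiples of f0, normalized: a rough "timbre profile".
    bins = [np.argmin(np.abs(freqs - k * f0)) for k in range(1, n_harmonics + 1)]
    profile = spectrum[bins] / spectrum[bins].sum()
    return f0, profile

sr = 44100
t = np.linspace(0, 1, sr, endpoint=False)
flute_like = np.sin(2 * np.pi * 220 * t) + 0.1 * np.sin(2 * np.pi * 440 * t)
reed_like  = np.sin(2 * np.pi * 220 * t) + 0.8 * np.sin(2 * np.pi * 660 * t)
for name, wave in [("flute-like", flute_like), ("reed-like", reed_like)]:
    f0, prof = pitch_and_timbre(wave, sr)
    print(name, round(f0, 1), np.round(prof, 2))  # same pitch, different profiles
```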
From your original question, the issue of attention may be somewhat orthogonal and perhaps mixed. On one hand, I expect that there are many largely bottom-up, feed-forward, "always on" perceptual paths that operate in parallel with no attentional/focal arbitration. (See Kahneman's "Thinking, Fast and Slow".) On the other hand, there are "active sensing" mechanisms where feedback paths direct the focus of senses (e.g., focusing our eyes on objects of interest, turning our heads toward sounds). Some amount of attentional focus seems to operate at higher cognitive levels: our senses are operating in parallel all the time, and we are adjusting the signal gains at various points in these processing paths to "pay attention" to some things more than others. Finally, there are various biochemical adjustments to signal gains, even at very low perceptual levels, such as fight-or-flight hormones, sleep, anesthesia, even meditative discipline.
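A toy sketch of that gain-adjustment picture, under the assumption that attention merely rescales the outputs of channels that are always running; the channel names and "feature extractors" are placeholders, not claims about real auditory features.

```python
# Several "always on" parallel channels process every input; top-down
# attention only rescales each channel's output gain, never switches a
# channel off. Purely illustrative.
import numpy as np

rng = np.random.default_rng(0)
channels = {"pitch": lambda x: x.max(),           # crude stand-in features,
            "loudness": lambda x: np.abs(x).mean(),  # always computed
            "timbre": lambda x: x.std()}

def perceive(signal, gains):
    raw = {name: f(signal) for name, f in channels.items()}  # all run in parallel
    return {name: gains[name] * v for name, v in raw.items()}  # attention = gain

signal = rng.normal(size=1000)
gains = {"pitch": 1.0, "loudness": 1.0, "timbre": 1.0}  # baseline: equal gain
print(perceive(signal, gains))
gains["timbre"] = 3.0   # "paying attention" to timbre boosts its gain
gains["pitch"] = 0.3    # other channels are attenuated, not shut down
print(perceive(signal, gains))
```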
Sorry this has been a bit of a ramble. Hope it helps, at least for some perspective.
If you are interested in quantitative content, this is a good primer on fMRI, with a bibliography, from the NIH. It is fairly focused on the N400 event-related potential (ERP), though.
Kutas, M., & Federmeier, K. D. (2011). Thirty years and counting: Finding meaning in the N400 component of the event-related brain potential (ERP). Annual Review of Psychology, 62, 621–647.
There is also some discussion of an N500 ERP:
Featherstone, C. R., Morrison, C. M., Waterman, M. G., & MacGregor, L. J. (2013). Semantics, syntax or neither? A case for resolution in the interpretation of N500 and P600 responses to harmonic incongruities. PLoS ONE, 8(11), e76600.
Interesting question. Is there research that suggests that musical syntax is processed separately from aspects of music like timbre?
I have not read work that explicitly relates to that, but I can comment from my own research area, which explores cognition in music performance. Like Lee's post above, my research explores aspects of music making along various dimensions. I base these explorations in the "fundamental mechanisms of joint activities" that underlie language use (Herbert H. Clark, 1996). My work is also grounded in Western Art Music (WAM) performance. This means that I have gone to great pains to make sure that the ontology I use in scientific research on music making is consistent with the ontology I use to understand music in practice and performance (as a bassoonist).
I guess that is really my main point. When we are looking to understand music making, we have to ask: what is music? What concepts are we using to understand our data? If we assume that music is knowable through its syntax, then it makes sense to look for research that explores the idea of syntax in language. That's where I can't help you. But if we view music making as a joint activity, then we can ask interesting questions about how people coordinate their attention around aspects of music performance.
As a performer, you will know that you can play one phrase of music an infinite number of ways. In fact, with the same set of notes, in the same rhythm and in the same order, you can create an entire composition - a performance with shaping, with movement, with expectation, with cadence. That "meaningful evocation" is based in our ability to manipulate timbre, articulation, dynamics, and so on.
I would be tempted to turn your question on its head and ask: is there any research indicating that the fundamental mechanisms that support musical activity also underlie other joint activities like language? And if so, what ideas do we have about the origins of those fundamental mechanisms? I might be tempted to speculate that these have evolved from social activities around the efforts of survival - we work together in the hunt, in the preparation and enjoyment of food, in the washing of clothes. These coordinations lead to meaningful communication... Anyway, best of luck with your research.
Thank you all again for your answers! I find it really interesting the many links being drawn with language here, even though the question was not about language specifically. I am currently researching commonalities in the processing of music and language syntax, so this is really interesting and great to see the number of links being made!
I see a lot of evidence suggesting strong parallels and similarities between the processing of music and language, and perhaps this is yet another: that we process perceptual aspects of music/language separately, in parallel, through different streams from the syntactic aspects.
To my mind, the most convincing evidence that musical syntax and the perceptual, acoustic properties of sounds are processed at different levels of neural organization is the impaired fluency and accuracy of recall that the great majority of musicians experience when attempting to perform atonal music without reference to the score, compared with tonal compositions of a similar or greater order of motoric difficulty. I can cite the following two research papers investigating this impaired recall in the cases of two autistic savant pianists, both possessing exceptional replication of extremely dissonant isolated chords and "phonographic" memory for tonal compositions. These impairments correspond to those revealed in studies of master chess players (Chase & Simon, 1973; Gobet & Simon, 1996a), who, when shown examples of board positions from actual games, were able to recall them accurately in full, but were incapable of such recall when presented with boards on which pieces had been randomly positioned.
Sloboda, J. A., Hermelin, B., & O'Connor, N. (1985). An exceptional musical memory. Music Perception. JSTOR.
Ockelford, A. (2011). Another exceptional musical memory. In I. Deliège & J. Davidson (Eds.), Music and the Mind. Oxford University Press.
Thanks for these references - very interesting indeed. I wonder how much the structure is tied into the tonality of the music itself - for example in studies using 'out-of-key' chords to disrupt syntactic structure (and finding interactions with language)?
E.g., Slevc, L. R., Rosenberg, J. C., & Patel, A. D. (2009). Making psycholinguistics musical: Self-paced reading time evidence for shared processing of linguistic and musical syntax. Psychonomic Bulletin & Review, 16(2), 374–381.
The autistic savant pianists appear more able to reproduce 'standard' structural pieces, suggesting that tonality is an important aspect of syntax (as opposed to, say, rhythmic syntax...). I wonder if this also differentiates standard phonological working memory (NP in the first reference had low digit recall) from a form of 'syntactic' working memory...
Interesting comparison of music to chess. I'm not sure that the kind of cognitive processing studied by Chase & Simon (recalling the positions of chess pieces on a board, in real-game and random configurations) will say much about musical activity. And I'm not sure that the choice of "savant" pianists (a problematic label) will generalize to musical activity more broadly. Your idea seems to be that "recall" is a measure of coherence, and that coherence is stronger in tonal music when played by savant pianists. Therefore, musical 'syntax' must be based in tonality, and any other aspects of music making must be activating regions not related to the processing of tonality. A bit of a stretch, but one that is often enough made.
BTW: A great variety of musicians perform a great variety of musics in a great variety of situations and activities. We are not all experiencing problems with non-tonal musics, and we are well equipped to perform with and without musical scores. I really care about the kind of data people use to make these claims about music and musicians, and the warrant on which these kinds of claims are based.
Why must the processing of timbre interfere with the processing of pitch classes?
When we speak to each other, we use inflection, timbre, emphasis, etc. as a way of negotiating meaning. The same word spoken differently will have a different meaning in an exchange. Context determines how we shape our spoken utterances (HH Clark, 1996), and I suggest the same is true with musical utterances.
Meaning, in language and music, is multimodal (*by multimodal I mean, 'drawing on more than one sensory modality in communication', not 'using multiple musical modes'). Non-tonal music encourages us to connect materials meaningfully using modalities other than 'syntax' (I prefer to use the term tonality). For example, a piece can be "about" dynamic shaping, articulation, gesture, timbre, pacing, inflection, vibrato; a performance can be "about" the way someone else performed it yesterday (see Ingrid Monson's work, and also Kaastra's work on layered meaning in music performance). From the lessons of modernity (in Western Art Music) we turn back to tonal music with a greater awareness and mastery of those tacit aspects of musical activity. The same awareness and integration of those aspects occurs in tonal and non-tonal music making.
Timbre is an aspect of music that we have some physical control over - in the case of wind instruments, by changing the air speed/embouchure system. Timbre is an aspect of performance that can be attended to focally or that can sit in our subsidiary awareness as we attend to other aspects (Kaastra). A musician can attend focally to timbre, or to syntax, or to something else. A musician can attend focally to some extra-musical idea that brings aspects in our subsidiary awareness (e.g., timbre and syntax) together into a meaningful evocation. We need more complex models to understand how all that looks at the neuronal level. But more importantly, we need to ask the kinds of questions that are grounded in the realities of music making, not based on misleading assumptions about what we think musicians ought to be thinking about.
Thanks for your reply - and for the interesting citation you included. Having tracked it down, I went on to do a quick search of NCBI on musical syntax, and am wondering if any of the returned results would be helpful to you. Here's the link:
I've been set thinking about your reference to "rhythmic syntax", a term that isn't familiar to me. What interests me about it is whether rhythm, though an obvious aspect of musical structuring that contributes to coherence, legitimately qualifies for the status of "syntax". One of the key implications of a syntax, to my understanding, is that it automatically induces chunking of syntactically connected elements common to a particular level of cognitive representation. By contrast, the kind of temporally paced chunking that permits the emergence of rhythmic structuredness requires special deliberation - the purposeful entraining of anticipated motor acts (and the various modalities of perceptual imagery informing them) to an independently imposed pace, typically in the form of some counting regime. In other words, rhythmic organization seems to depend not on acquired syntactic rules or regularities, but on control by a superordinate (executive) level and mechanism of structuring.
Of course, that's just my "take" on the question. I'd very much welcome reading your own, and naturally those of anyone else following this thread.
A very interesting question indeed - is there such a thing as rhythmic syntax?
Here's an interesting article I found about the topic: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3813894/
It makes the argument that 'harmonic' syntax is receiving a lot of attention, while other types of musical syntax (especially, in this case, rhythmic syntax) are not. Perhaps because they don't warrant the term 'syntax', or perhaps because they're harder to control experimentally...
You can certainly have violations of rhythmic structure, though whether it's 'hierarchically organised' in the way that language is remains another question. There are certainly smaller groupings within larger groupings, including phrase boundaries, and I would believe it to be infinitely generative... I haven't looked into it much myself either, but I think it's definitely an interesting area to follow up on.