When animals (including humans) engage in volitional acts such as walking, running, and swimming (and, in the case of humans, speaking, reading, and general focused thinking/learning), signals transmitted through the auditory cortex are suppressed (Tehovnik 2017; this behavioral state has been referred to as Type I behavior by Vanderwolf 1969). M2 of the rodent (i.e., mouse) motor cortex projects to both the auditory cortex and the brain stem for the mediation of ultrasonic vocalizations (Nelson, Mooney et al. 2013). Once these M2 neurons are activated during vocalization, neurons in the auditory cortex are inhibited by local GABAergic interneurons. These interneurons are activated by M2 afferents that preferentially innervate lamina I (which receives transcortical top-down input) and laminae V and VI, which contain the cell bodies of descending cortical efferents (see Footnote 1). Accordingly, it is presumed that an efference copy of the vocalization command is sent from M2 to the auditory cortex to prevent an animal from confusing its own vocalization with a vocalization coming from conspecifics.
Detailed unit recordings (using tetrodes) from rodent (i.e., rat) M2 and M1 revealed that many neurons in these regions discharge during a 400-ms period of vocalization composed of an ultrasonic syllable of 40 to 70 kHz (Sharif et al. 2024). Furthermore, regions of the periaqueductal grey that are part of the vocalization network receive projections from many neocortical areas, including M2, M1, S1, A1, and the anterior cingulate cortex, in both rodents and primates (Dujardin and Jürgens 2005; Sharif et al. 2024; also see Gan-Or and London 2023). In addition, M2 simultaneously innervates the auditory cortex and periaqueductal grey via shared collaterals (Nelson, Mooney et al. 2013). It is likely that these regions mediate vocalization, but their respective roles are still being investigated. At the very least, once the motor neurons issue a command to vocalize, M2, M1, S1, and A1 need to have their communications synchronized to coordinate the feedback from the vocal proprioceptors of the laryngeal muscles and from the cochlea, which transmits the uttered sound. The web of interconnections between these regions supports this view (Nelson, Mooney et al. 2013; Sharif et al. 2024). Ultimately, association areas such as M2 store learned information that can be combined with information entering the brain via the senses to produce an optimal behavioral response (Tehovnik, Hasanbegović, Chen 2024).
Based on the foregoing and on our understanding of cerebellar function (Tehovnik, Hasanbegović, Chen 2024), it follows that the cerebellum is programmed to potentiate auditory signals originating from self-vocalizations and to suppress vocalizations originating from other sources. The specifics of how this is done await confirmation using advanced neuroscientific methods that can interrupt and potentiate specific neocortical-cerebellar loops in rodents (Hasanbegović 2024) as they pertain to vocalization across species (Fig. 2). Such work can guide our understanding of the more advanced communication systems that exist in songbirds and humans (Kimura 1993; Kubikova et al. 2014; Malik-Moraleda, Fedorenko et al. 2022; Ojemann 1991; Penfield and Roberts 1966).
In line with Chomsky’s idea that there is a universal grammar for humans, evidence from both fMRI and information theory now makes it clear that all Homo sapiens utilize a common neural network of information flow for spoken language (Coupé et al. 2019; Malik-Moraleda, Fedorenko et al. 2022, 2023). Whether language is an invention based on human thinking or part of human evolution is still being debated (Malik-Moraleda, Fedorenko et al. 2023). We believe that, contrary to the views of Chomsky (2012), all animals have the capacity to think (Tehovnik, Hasanbegović, Chen 2024), which is consistent with the views of Pavlov (Michaud 2019). Furthermore, in the case of mammals there is an inherent structure to the neocortex such that lateral neocortical zones store object information and medial zones store information pertaining to the tracking of change, e.g., object motion (Tehovnik, Patel, Tolias et al. 2021). How this structure may have shaped thinking and human language (as well as mathematics) awaits clarification.
Footnote 1: When the M2 axon terminals in the auditory cortex were prepared for optical activation in brain slices (Nelson, Mooney et al. 2013), such activation (measured intracellularly) generated the characteristic firing profile of neocortical neurons (Logothetis et al. 2010; Tehovnik and Slocum 2013) in the auditory cortex: excitation was followed by a long bout of inhibition once the light pulse was terminated. The inhibition was mediated by GABAergic currents, and M2 afferents were found to synapse onto the GABAergic neurons (PV interneurons, see Fig. 1) as well as onto the pyramidal neurons. When an auditory stimulus was paired with the optical stimulation of M2 axon terminals in an intact animal, the stimulation inhibited the auditory pyramidal neurons, but only when the sound followed the onset of optical stimulation.
Figure 1: Local GABAergic circuits within the neocortex for controlling the excitation of a pyramidal neuron (Pyramidal). The following GABAergic interneurons are indicated: parvalbumin-containing interneuron (PV), somatostatin-containing interneuron (SOM), and vasoactive intestinal polypeptide interneuron (VIP). The figure is based on V1 processing (Froudarakis et al. 2019).
Figure 2: The innervation pattern of the various communication systems of birds, lower mammals, and primates. A key feature of a sophisticated communication system is that neocortical fibres directly (and robustly) innervate the nuclei controlling vocalization in the brain stem (Arriaga et al. 2012; Vanderwolf 2007), which is the case for songbirds and humans. These animals, unlike chickens, mice, and monkeys, have the ability to learn associations between sounds and details pertaining to the external world; in the case of mice, the connection between the neocortex and the vocal apparatus is very weak (Arriaga et al. 2012). Also, lesions of the motor cortex (M1 and Broca’s area/LMC in the figure) in humans or of area RA in songbirds (an M1 homologue) abolish the ability to generate learned vocalizations. Moreover, hummingbirds, parrots, bats, seals, sea lions, elephants, dolphins, and whales readily associate sounds with the details of the environment (Arriaga et al. 2012). From figure 1 of Arriaga et al. (2012).