

Brain & Language


Reconciling time, space and function: A new dorsal-ventral stream model of sentence comprehension

Ina Bornkessel-Schlesewsky a,*, Matthias Schlesewsky b

a Department of Germanic Linguistics, University of Marburg, Marburg, Germany

b Department of English and Linguistics, Johannes Gutenberg-University, Mainz, Germany



Article history:

Accepted 15 January 2013

Available online 26 February 2013


Keywords: Language comprehension; Dorsal stream; Ventral stream; Hierarchical organisation; Cognitive control; Syntax; Semantics; Inferior frontal gyrus; Anterior temporal lobe; Posterior temporal lobe

We present a new dorsal-ventral stream framework for language comprehension which unifies basic neurobiological assumptions (Rauschecker & Scott, 2009) with a cross-linguistic neurocognitive sentence comprehension model (eADM; Bornkessel & Schlesewsky, 2006). The dissociation between (time-dependent) syntactic structure-building and (time-independent) sentence interpretation assumed within the eADM provides a basis for the division of labour between the dorsal and ventral streams in comprehension. We posit that the ventral stream performs time-independent unifications of conceptual schemata, serving to create auditory objects of increasing complexity. The dorsal stream engages in the time-dependent combination of elements, subserving both syntactic structuring and a linkage to action. Furthermore, frontal regions accomplish general aspects of cognitive control in the service of action planning and execution rather than linguistic processing. This architecture is supported by a range of existing empirical findings and helps to resolve a number of theoretical and empirical puzzles within the existing dorsal-ventral streams literature.

© 2013 Elsevier Inc. All rights reserved.

1. Introduction

The literature on the neuroscience of language has recently seen an increasing interest in the dorsal and ventral streams as possible, neurobiologically plausible streams of speech and language processing.1 Evidence for this perspective has been gleaned from a number of different domains, ranging from speech perception and production (Hickok & Poeppel, 2004, 2007; Rauschecker & Scott, 2009) through word-level production and comprehension (Ueno, Saito, Rogers, & Lambon Ralph, 2011) to syntactic processing (Friederici, 2009). An inherent appeal of the dual streams perspective is that it may help to provide a neurobiological grounding for functionally motivated models of the language architecture. In particular, as dual streams of processing are well established within the literature on the auditory system of non-human primates, they open up the possibility for highly appealing cross-species comparisons between human speech and language and more general properties of auditory processing (Rauschecker & Scott, 2009).

* Corresponding author. Address: Department of Germanic Linguistics, University of Marburg, Deutschhausstrasse 3, 35032 Marburg, Germany. Fax: +49 (0)6421 2824558.

E-mail address: (I. Bornkessel-Schlesewsky).

1 Here and in the following, we use the term "stream" to denote a functional route of information processing and the term "pathway" to denote neuroanatomical connectivity. As will become clear throughout the remainder of the paper, we view the correspondence between the two as correlative in nature. Crucially, this implies that there need not be a 1:1 correspondence between streams and pathways (including, for example, a many-to-one mapping of pathways to streams).

However, in spite of the relatively unified neuroanatomical perspective underlying these current dual streams approaches to language (but see below for differing assumptions regarding possible neuroanatomical sub-pathways and the characterisation of posterior temporal regions), their interpretations of dorsal and ventral stream functions in language processing are quite different from one another. For example, based on studies of pseudoword production versus sentence comprehension, Saur et al. (2008, p. 18035) proposed that the dorsal stream mediates the "sensory-motor mapping of sound to articulation", while the ventral stream is involved in the "linguistic processing of sound to meaning". By contrast, Friederici (2009, 2012) draws upon results from sentence comprehension to posit that part of the dorsal stream (specifically, one dorsal sub-pathway) is crucial for the processing of "hierarchical" or "complex" syntax, whereas part of the ventral stream (one ventral sub-pathway) is assumed to be involved in the processing of "local" or "simple" syntax. Clearly, these alternative functional proposals have very different implications for the interpretation of the dorsal and ventral streams during language processing and, thereby, for models of the neurobiology of language. However, beyond these specific interpretations, are there possible unifying (and meaningful) functional generalisations that dissociate one stream from the other, irrespective of the possible existence of neuroanatomical sub-pathways?

Here, we approach this question from a novel perspective. Specifically, we attempt to bring together some basic neurobiological design principles regarding information processing within the two streams (Rauschecker & Scott, 2009) with insights on the functional architecture of sentence comprehension. We will argue that the assumption of hierarchical processing - the sensitivity for increasingly complex sets of feature combinations within neurons or neuronal assemblies - as a basic principle of brain function within the auditory system (as suggested by Rauschecker, 1998)2 can be fruitfully combined with well-established assumptions regarding the timing of language comprehension. This assumed correspondence between a neuroanatomical hierarchy and a temporal hierarchy in information processing will be used as a basis for a new spatio-temporal model of language processing within a dorsal and ventral streams perspective.

The remainder of the paper is organised as follows. Section 2 begins by introducing some background assumptions from the neurobiological domain and sentence comprehension in time and space. Section 3 subsequently goes on to describe some puzzles that arise if these background assumptions are adopted. Section 4 offers a possible solution to the puzzles described in Section 3 in the form of a novel proposal regarding the neuroanatomical locus of syntactic structure building and the form-to-meaning mapping at the sentence level. Finally, Section 5 offers some conclusions.

2. Background assumptions

In this section, we will describe the assumptions on which our line of argumentation will be based in the following sections. While each of these assumptions can, presumably, be contested at some level, it seems to us that they are all established sufficiently to warrant their use as premises of the account to be developed here.

2.1. Hierarchical organisation as a basic property of functional neuroanatomy

On the basis of research on the visual (e.g. Felleman & Van Essen, 1991) and, more recently, auditory systems (Rauschecker, 1998), we follow Rauschecker and Scott (2009) in assuming that the functional neuroanatomy of information processing in the brain is hierarchically organised:

Hierarchical organization in the cerebral cortex combines elements of serial as well as parallel processing: 'lower' cortical areas with simpler receptive-field organization, such as sensory core areas, project to 'higher' areas with increasingly complex response properties, such as belt, parabelt and PFC regions. These complex properties are generated by convergence and summation [...]. Parallel processing principles in hierarchical organization are evident in that specialized cortical areas ('maps') with related functions (corresponding to sub-modalities or modules) are bundled into parallel processing 'streams'. (Rauschecker & Scott, 2009, p. 719)

Evidence for hierarchical organisation within the auditory system stems from a variety of different sources. Using single cell recordings in non-human primates (rhesus monkeys), Rauschecker and colleagues found increasing sensitivity to more complex "auditory objects" - from neurons responding mainly to specific frequency bandwidths in lateral auditory belt areas to neurons responding increasingly to species-specific vocalisations in more anterior portions of the superior temporal gyrus (Rauschecker, Tian, & Hauser, 1995). From these "increasing proportions of call-selective neurons [...] from A1 to lateral belt to more anterior superior temporal areas" (Rauschecker, 1998, p. 518), Rauschecker proposed a hierarchical organisation of auditory processing that is compatible with what is known about hierarchical processing within the visual system. This perspective was recently corroborated by a meta-analysis of neuroimaging studies on language processing, in which DeWitt and Rauschecker (2012) found evidence for an anterior-directed processing gradient within temporal cortex. Across 115 studies, phoneme versus word processing engendered increasingly anterior activation within the superior temporal gyrus (STG), and phrase-level processing correlated with activation in the anterior superior temporal sulcus (STS). From these findings, DeWitt and Rauschecker (2012) argue for a concordance between the results on human language processing and the literature on primate auditory processing, with both providing evidence for hierarchical processing of auditory objects within a ventral processing stream in superior temporal cortex.

2 Note that hierarchical processing in this sense is not to be confused with hierarchical syntax in the sense of Friederici (2009). Friederici uses the term "hierarchical" to refer to particular types of syntactic structures, in contrast to the neurobiological sense that is central here. For more detailed discussions of the two senses of the term "hierarchical", see Sections 2 and 3 for the neurobiological and Friederician sense, respectively.

Applying these basic assumptions to language processing - and, for present purposes, sentence processing in particular - we arrive at the following hypothesis: the functional neuroanatomy of the form-to-meaning mapping should be characterised by neuroanatomical gradients originating in primary auditory areas, which correlate with the processing of successively more complex linguistic units.

2.2. Time-space correspondence

If we accept the premise that language processing is supported by a hierarchically organised auditory system (see Section 2.1), this also has implications for the temporal organisation of the form-to-meaning mapping in sentence processing. Of course, connectivity within the brain is inherently bidirectional. Nevertheless, the assumption of hierarchical organisation implies that there is a certain asymmetry in the ''flow'' of information, since ''lower'' areas with simple feature sensitivity project to ''higher'' areas with a sensitivity to more complex stimuli, resulting from the convergence and summation of properties from a number of ''lower'' areas (see the quote by Rauschecker & Scott, 2009, in Section 2.1 above). DeWitt and Rauschecker (2012, p. E509), too, refer to ''a processing cascade emanating from core areas, progressing both laterally, away from core itself, and anteriorly, away from A1'' in describing their ventral stream of linguistic pattern recognition (i.e. language comprehension from the phonemic to the phrasal level). We thus propose that insights on the organisation of the neuroanatomical processing hierarchy should be compatible with findings on the temporal organisation of sentence processing and vice versa.

This hypothesis can be exemplified using the gradient of phonemic processing to word processing that was observed by DeWitt and Rauschecker (2012): electrophysiological (i.e. scalp EEG) studies of auditory word recognition in sentence context have provided evidence for two mismatch-related negativities that occur when the current input is incongruent with the prior sentence or discourse context, an N200 and a following N400 (e.g. Connolly & Phillips, 1994; van den Brink, Brown, & Hagoort, 2001). Based on these results, even proponents of a highly interactive "one-step" model of sentence-level interpretation (Hagoort, 2005; Hagoort & van Berkum, 2007) have argued for a cascade of information processing during sentence comprehension (Hagoort, 2008; van den Brink, Brown, & Hagoort, 2006; van den Brink et al., 2001). According to this view, word recognition comprises the activation of a cohort of word candidates (Marslen-Wilson, 1987; Marslen-Wilson & Welsh, 1978) in a strictly bottom-up manner, with N200 effects emerging whenever a form-based lexical candidate is not supported by the current context (in contrast to when it is). The N400, by contrast, is thought to index a mismatch at the content level only, which can take place when the meaning of a word is to be integrated with the prior context (Hagoort, 2008). In addition to showing a clear temporal parallel to the neuroanatomical gradient discussed above, these findings also make clear that the assumption of a processing hierarchy or cascade does not contradict the central notion of bidirectionality. For both the N200 and the N400, mismatches with top-down information (i.e. information provided via feedback from higher-level contextual representations) are crucial. Nevertheless, the hierarchical organisation from smaller (less complex) to larger (more complex) units is reflected in the relative timing of the electrophysiological signals.3

In this paper, we examine the consequences of applying this assumed "time-space correspondence" based on the notion of hierarchical processing to more general aspects of the form-to-meaning mapping at the sentence level. To this end, we will draw upon insights gleaned from our existing neurocognitive model of language comprehension, the extended Argument Dependency Model (eADM: Bornkessel & Schlesewsky, 2006; Bornkessel-Schlesewsky & Schlesewsky, 2008a, 2009b), which assumes a cascaded organisation of the linguistic form-to-meaning mapping. The eADM appears well suited to this purpose for at least two reasons: (a) the assumption of an incremental, cascaded architecture which serves to combine smaller units into larger ones draws upon aspects of both serial and parallel processing and is thereby conceptually very similar to the neuroanatomical notion of hierarchical processing discussed in Section 2.1; (b) the model is shaped by cross-linguistic considerations (i.e. the question of which properties of the language processing architecture generalise across the >6000 languages currently spoken and which are specific to individual languages) and thereby closely tied to neurobiology, since cross-linguistically recurring properties are likely grounded in some way in the structure and function of the human brain (Bornkessel-Schlesewsky & Schlesewsky, in press-a, in press-b).

2.3. The functional neuroanatomy of (auditory) language processing4

In accordance with the assumption that the dual-streams perspective provides a neurobiologically plausible basis for models of speech and language processing, the following discussion will presuppose that the distinction between a dorsal and a ventral processing stream provides us with a feasible functional model of information transfer in the brain during the form-to-meaning mapping at the sentence level. Furthermore, we posit that the dorsal versus ventral distinction is meaningful at a functional level, i.e. that there is a common denominator in terms of function for the dorsal versus ventral stream irrespective of whether or not there are additional anatomical sub-pathways within each stream (e.g. Catani, Jones, & ffytche, 2005; Friederici, 2012; Glasser & Rilling, 2008). This means that there should be differences in the type(s) of information transferred along each stream and/or in the mechanisms of information processing. Of course, this is only a hypothesis (for an opposing view, which posits heterogeneous functions within each stream, see Friederici, 2011, 2012), but - assuming that the terms "dorsal" and "ventral" are to remain meaningful at a functional as well as a neuroanatomical level - it appears worth pursuing for reasons of parsimony.

3 Note that this notion of a time-space correspondence cannot be examined directly using current experimental techniques (given the temporal insensitivity of BOLD-fMRI and the inverse problem in EEG/MEG). It can, however, potentially be achieved at the model level such that stages of information processing posited in one of the two domains should also be applicable in the other. Importantly, this does not imply that the absolute peak latencies of typical language-related ERP components (e.g. the N400 or the N200) or the relative temporal distance between the latencies of two components (e.g. N400 vs. P600) should be viewed as timing estimates for processing times in particular brain regions/networks or the transfer of information between them. Evidence for the assumption that ERP component latencies do not reflect absolute processing times stems from several domains. On the one hand, eye movement research has shown that a particular linguistic phenomenon (e.g. word frequency) can be reflected in the eye movement record during reading at an earlier point in time than in the ERP record (Rayner & Clifton, 2009; Sereno & Rayner, 2003). On the other hand, research within the mismatch negativity paradigm also indicates that many linguistic information sources already appear to be available at a considerably earlier point in time than is suggested by typical language-related ERP components such as the N400 (Pulvermüller, 2010). Nevertheless, the assumption that the relative timing of different ERP components may provide evidence for the hierarchical organisation of processing has not been contradicted to date (for discussion, see Bornkessel-Schlesewsky & Schlesewsky, 2009a). Thus, even though ERP effects constitute rather macroscopic brain responses and thereby provide evidence on a rather different level to other data types used to inform the neurobiological dorsal and ventral streams literature (e.g. single cell recordings in non-human primates), we posit that they are nevertheless informative with regard to the functional (hierarchical) architecture of processing.

4 We confine our assumptions to auditory sentence processing in the present paper for the sake of simplicity, since visual processing involves processing pathways from other primary sensory regions. We would, however, assume that the basic mechanisms described here carry over to processing in the visual domain, though some neuroanatomical modifications will clearly be required.

With regard to our neuroanatomical assumptions, two further clarifications are in order. Firstly, for the anatomical grounding of the two streams, we draw upon Rauschecker and Scott's (2009) notion of an ''antero-ventral'' and a ''postero-dorsal'' pathway, respectively. Crucially, this implies that the anatomical pathways are not confined to long-distance projections from temporal to frontal regions (via fibre bundles such as the arcuate fascicle or the extreme capsule), but already originate within the temporal lobe and - as demonstrated in monkeys - even within auditory cortex (Rauschecker & Scott, 2009; Tian, Reser, Durham, Kustov, & Rauschecker, 2001). Thus, in accordance with hierarchical processing, we hypothesise that neuroanatomical processing gradients should be observable within temporal cortex, ''emanating'' from primary auditory regions (for evidence regarding similar connectivity between primary auditory areas (Heschl's gyrus) and anterior and posterior temporal regions, respectively, in humans, see Upadhyay et al., 2008). These should show an anterior directionality within the antero-ventral pathway on the one hand and a posterior directionality within the postero-dorsal pathway on the other.

Secondly, in accordance with the tenets of hierarchical processing, there is an asymmetry in the directionality of information transfer in spite of the obvious presence of bidirectional connections. From this perspective, it appears legitimate to refer to streams/pathways ''emanating'' from auditory cortex and projecting to frontal regions. Throughout the remainder of the paper, we will therefore sometimes use terms such as ''upstream'' and ''downstream'' to refer to regions closer to and further away from primary auditory areas, respectively, within the processing streams/pathways (for a similar terminology, see DeWitt & Rauschecker, 2012; Mesulam, 1998).

2.4. The temporo-spatial hypothesis of the dorsal and ventral streams

Taking the basic assumptions outlined in Sections 2.1-2.3 together, we arrive at the following basic research hypothesis:

(1) The temporo-spatial hypothesis (TSH) of the dorsal and ventral streams

Language processing involves the spread of activation along the dorsal and ventral streams. The two streams are instantiated anatomically by an antero-ventral and a postero-dorsal pathway, respectively, both of which emanate from primary auditory regions and project - via anterior temporal and posterior temporal/parietal regions, respectively - to frontal cortex. The streams serve separable functions of information processing and a unifying function can be defined for each stream, irrespective of the possible presence of multiple anatomical subpathways. Information sources that are taken into account earlier in time during processing are processed further upstream in neuroanatomical terms than information sources that are taken into account later.

The TSH posits that the assumption of hierarchical processing, which has been demonstrated for the antero-ventral pathway in both the primate auditory system and in human speech and language processing (DeWitt & Rauschecker, 2012; Rauschecker, 1998; Rauschecker & Scott, 2009; Rauschecker et al., 1995), also applies to the postero-dorsal pathway during language comprehension. In other words: the dorsal stream, too, has the function of combining elements/features to form successively more complex representations, but the nature of these representations differs fundamentally from those constructed within the ventral stream. This is, to the best of our knowledge, a novel claim, since existing approaches stress the role of the dorsal stream in auditory-motor mappings and, thereby, particularly speech production (Hickok & Poeppel, 2004, 2007; Saur et al., 2008; Ueno et al., 2011). Rauschecker and Scott (2009) do envisage the dorsal pathway as supporting speech processing in general (i.e. involving both production and perception, switching between forward and inverse models in auditory-motor linkage), but do not provide a detailed specification of comprehension mechanisms. In Friederici's account, which assumes that one dorsal sub-pathway (and, in her view, functional sub-stream) is involved in the comprehension of syntactically complex sentences (Friederici, 2009, 2011, 2012), this pathway is viewed essentially as a top-down connection: ''the dorsal back-projection from BA 44 to posterior STG/STS [...] subserve[s] top-down processes relevant for the assignment of grammatical relations'' (Friederici, 2012, p. 263).

In the remainder of the paper, we will provide a twofold motivation for the assumption of hierarchical processing within the dorsal stream in sentence comprehension - which will likely prove to be one of the more controversial claims advanced here. Firstly, we will discuss a number of existing puzzles in the dorsal-ventral streams literature which, in our view, challenge at least certain aspects of current accounts (Section 3). Secondly, we will argue on the basis of existing findings on sentence comprehension across languages that, in addition to the hierarchically organised representations processed within the ventral stream, the form-to-meaning mapping requires a second type of representation which crucially involves the ordering of elements in time, and that the dorsal stream provides a neurobiologically plausible architectural locus for the processing of this second information type (Section 4).

3. Puzzles

3.1. The locus of syntax within a dorsal-ventral streams architecture

All current psycholinguistic and neurolinguistic models of sentence processing essentially agree upon two things: (a) syntactic rules/representations are involved in sentence processing; and (b) syntactic information is taken into account early during the comprehension process (e.g. Bornkessel & Schlesewsky, 2006; Frazier & Clifton, 1996; Friederici, 2002; Hagoort, 2005; MacDonald, Pearlmutter, & Seidenberg, 1994; Pulvermüller, 2010; Vosse & Kempen, 2000). Note that this applies not only to so-called "syntax-first" models, but also to interactive or constraint-based models (e.g. Hagoort, 2005; MacDonald et al., 1994; Vosse & Kempen, 2000): while the latter posit that non-syntactic information sources are taken into consideration at the same time as syntactic information, they do not assume that syntactic processing is delayed vis-à-vis other information types.5 At the same time, and in spite of differing fundamentally with regard to the details, most current neurocognitive models of language processing at the sentence level assume that Broca's region (i.e. the pars opercularis and triangularis of the left inferior frontal gyrus, lIFG) is somehow involved in syntactic processing (Baggio & Hagoort, 2011; Friederici, 2002, 2009; Hagoort, 2003, 2005; Ullman, 2001, 2004). Friederici (2012, p. 265) even refers to BA 44 as the "core syntax region".

In Section 2.2, we proposed that the dorsal-ventral streams perspective in conjunction with the well-established assumption of hierarchical processing leads to a notion of "time-space correspondence". If true, and assuming that the psycholinguistic results regarding the status of syntax as a relatively basic and early information source hold, the perspective that inferior frontal cortex is crucially involved in syntactic structure-building appears somewhat surprising. Frontal cortex (including inferior frontal regions, but also other frontal areas such as premotor cortex) constitutes the point of convergence between the two streams and is thereby essentially the furthest possible point downstream from primary auditory cortex (PAC) within each stream (and even if one does not accept our proposal regarding hierarchical processing during sentence comprehension within the dorsal stream, this still holds true for the ventral stream).6 By contrast, time-space correspondence would lead one to predict that syntax is processed in networks that are still relatively far upstream within the processing streams and thereby quite close to primary sensory cortices.

The problems arising from an association between syntax and inferior frontal cortex have, of course, already been noted. Thus, a number of scholars have argued that Broca's region is engaged in processes of cognitive control during language processing, rather than in linguistic computation per se (e.g. Stowe, Haverkort, & Zwarts, 2005; Stowe et al., 1998; Thompson-Schill, Bedny, & Goldberg, 2005; Thompson-Schill, D'Esposito, Aguirre, & Farah, 1997). These arguments have been extended specifically to syntax in studies demonstrating an increase of lIFG activation for sentences involving ambiguity and, hence, an increased need to select among competing alternatives (Novick, True-swell, & Thompson-Schill, 2005). It has further been demonstrated that this activation overlaps with activation elicited in a classic cognitive control paradigm (the Stroop task; January, Trueswell, & Thompson-Schill, 2009). Based on differing inferior frontal activation patterns for distinct types of word order variations, we have also recently argued (Bornkessel-Schlesewsky, Grewe, & Schlesewsky,

5 The early timing of syntactic information is apparent, for example, from ERP studies on sentence comprehension. These have revealed that, when a syntactic problem is encountered at an earlier point in time in the speech stream (e.g. induced via a prefix) than a following semantic error (e.g. induced via the word stem), the ERP response to the semantic error is modulated in comparison to when it occurs on its own (Hahne & Friederici, 2002). By contrast, when the temporal availability of the two information sources is reversed (e.g. when the semantic problem is induced via the word stem and the syntactic problem via a suffix), the ERP response to the syntactic error is comparable to that elicited when the syntactic error occurs on its own (van den Brink & Hagoort, 2004). This asymmetry provides compelling evidence in favour of a cascaded information processing architecture, with syntactic (i.e. category sequence) information available early within the cascade (for a more detailed exposition of this argument, see Bornkessel-Schlesewsky & Schlesewsky, 2009a). Note that we motivate the cascade in terms of the modulating influence of one information type on another rather than in terms of absolute ERP component latencies, since typical language-related ERP components such as the N400 likely do not provide accurate absolute timing estimates (see Footnote 3). Importantly, work within the mismatch negativity (MMN) paradigm also supports the assumption of cascaded processing (Pulvermüller, Shtyrov, & Hauk, 2009), though at considerably shorter latencies (with phonological, lexical and syntactic, as well as semantic information - in that cascaded order - modulating ERP responses within the first 200 ms post word onset). 
Nevertheless, as MMN studies require a rather artifical language processing paradigm with many repetitions of minimally varying standard and deviant stimuli, we would also be cautious in adopting these latency values as estimates of absolute timing in natural language processing.

6 Note again that this assumption does not contradict the bidirectionality of the streams (see Section 2.2 for detailed discussion).

Fig. 1. Schematic depiction of the model proposed here. Panel A provides a basic overview of the neuroanatomical assumptions: both the ventral (dashed line) and dorsal (solid line) streams are assumed to emanate from primary auditory cortex (PAC) and to perform information processing in a hierarchically organised manner. Thus, though the streams are bidirectional, there is an inherent asymmetry in the directionality of information flow on account of the hierarchical organisation. Note that the figure abstracts away from possible neuroanatomical sub-pathways within the two streams. Panel B shows the assumed structure of hierarchical processing within the two streams. For further details on actor-event (AE) schema activation/identification, see Fig. 2. For further details on AE-schema unification, see Fig. 3. For further details on syntactic structure building, see Fig. 4. Note 1: Control functions of inferior frontal cortex in language comprehension are assumed to be organised in a functional-neuroanatomical gradient; for a more detailed discussion of the inner structure of the processing architecture in inferior frontal cortex, see Bornkessel-Schlesewsky et al. (2012) and Bornkessel-Schlesewsky and Schlesewsky (2012).

2012; Bornkessel-Schlesewsky & Schlesewsky, 2010, 2012; Schlesewsky & Bornkessel-Schlesewsky, 2013) that the phenomena summarised by Friederici (2009) under the label ''hierarchical syntax'' are more parsimoniously explained in terms of a gradient of cognitive control in prefrontal cortex in the sense of Koechlin and colleagues (Koechlin, Ody, & Kouneiher, 2003; Koechlin & Summerfield, 2007). This control-based perspective on lIFG function during language processing is perfectly compatible with (a) hierarchical processing, since control provides a natural overarching functional interface between linguistic representations and behaviour; and (b) the assumed time-space-correspondence, since it implies that ''true'' syntactic computation must take place further upstream than in inferior frontal cortex.

If we are to look for syntax further upstream, the crucial question is, of course, where. In this regard, the anterior temporal lobe (aTL) appears to be the best candidate given the current state of the art in the field. However, since the aTL forms part of the ventral stream, we shall discuss this issue in more detail as part of Puzzle number 2 in the following subsection.

3.2. What is the role of the ventral stream: syntax, concepts or both?

In the existing literature, there appear to be two main views on the functional significance of the ventral stream during language processing. Motivated by observations at the sentence and word level, respectively, some researchers have emphasised the importance of basic combinatorics, while others have focused on the extraction of meaning from a linguistic form. Since both of these views crucially hinge upon the role of the aTL, we will also focus primarily upon this region in the following.

The aTL has long been under discussion as a possible locus of structuring operations at the sentence-level, since it shows increased activation for sentences versus word lists (Bottini et al., 1994; Friederici, Meyer, & von Cramon, 2000; Mazoyer et al., 1993; Stowe et al., 1998; Xu, Kemeny, Park, Frattali, & Braun, 2005). More recent findings indeed appear to support a more specific role of the aTL in sentence-level combinatorics as demonstrated, for example, by contrasting minimal linguistic phrases with word pairs not allowing for syntactic/semantic composition (Bemis & Pylkkänen, 2011) or by assessing parametric changes in brain activity associated with measures of syntactic complexity (Brennan et al., 2012). Accordingly, Hickok and Poeppel (2007) assume that the anterior middle temporal gyrus (aMTG) and anterior inferior temporal sulcus (aITS) as subregions of the aTL constitute a ''combinatory network'' that forms part of the ventral stream. In Friederici's (2009) model, one ventral stream/pathway, including the aTL, is thought to be involved in ''local'' syntactic combinatorics, amounting essentially to the building of phrases (e.g. a noun phrase such as ''the hamster'' from a determiner and a noun).

From a different perspective to the one just discussed, the ventral stream has been proposed as a stream for extracting semantics in language understanding (e.g. Saur et al., 2008; Scott, Blank, Rosen, & Wise, 2000; Ueno et al., 2011; and, to a certain degree, Hickok & Poeppel, 2007). From the perspective of a neurocomputational dorsal-ventral streams model, for example, Ueno et al. (2011) state:

Given its proximity to the semantic-based representations of the vATL, the functioning of the ventral pathway becomes dominated by the input → semantic → output mappings which are doubly computationally challenging in that the mappings are both arbitrary in form and require transforming between time-varying (acoustic-phonology-motor) and time-invariant (semantic) representations [...]. (Ueno et al., 2011, p. 392)

Evidence for this concept- or comprehension-based view of the ventral stream stems (a) from robust findings of semantic dementia associated with lesions to or atrophy of the aTL (see Ueno et al., 2011); and (b) from the observation of increased ventral stream involvement in language comprehension as opposed to production (Saur et al., 2008).

How, then, do these different - and seemingly incompatible - perspectives on the ventral stream fit together? At least two types of suggestions have been put forward in this regard:

(a) The ''parallel solution''. In Friederici's (2009, 2011, 2012) view, which equates functional streams with anatomical (sub-)pathways, the apparent incompatibility is resolved by the assumption that there is no one ''ventral stream'' with a unified function in speech and language processing. Rather, Friederici (2009, 2011, 2012) suggests that there are two ventral streams/pathways, one of which (''ventral pathway I'') engages in semantic processing (connecting the mid MTG/STG to anterior IFG via the extreme capsule; Saur et al., 2008), while the other (''ventral pathway II'') supports local syntactic structure building (connecting the aTL to the FOP via the uncinate fascicle). In this approach, the two ventral streams/pathways thus support the form-to-meaning mapping by performing semantic and syntactic computations, respectively.7

7 It is, however, not clear how Friederici's anatomical assumptions regarding parallel ventral pathways for syntactic and semantic processing map onto the temporal assumptions of her processing model. With regard to the time course of processing, local syntactic structure building (''simple syntax'') is assumed to constitute a functional prerequisite for semantic processing (Friederici, 1999, 2002, 2011). Thus, information processing in ventral pathway I (semantics) must build upon the computations accomplished by ventral pathway II (local syntax). However, Friederici (2012) does not appear to envisage ventral pathway II as bidirectional (see her Fig. 1, in which ventral pathway II, the connection between aSTG and FOP, is one of the very few connections that is depicted as unidirectional). This conflict between the temporal and neuroanatomical dimensions could be resolved by assuming that feedback from frontal cortex to the aTL via ventral pathway II is a prerequisite for semantic processing in ventral pathway I. However, this would require giving up the assumption of a hierarchical organisation in functional-neuroanatomical terms.

(b) The ''sequential solution''. Hickok and Poeppel (2007) suggest that the ventral stream consists of a ''lexical interface'' in the posterior temporal lobe (pMTG/pITS) followed by ''combinatorial processing'' in the aTL (aMTG/aITS). They thus posit that both lexical/conceptual and combinatory processing take place within the ventral stream, but that they do so in successive processing steps. Note, however, that Hickok and Poeppel's anatomical definition of the ventral pathway is somewhat different to that in other proposals and to the view assumed here in that they assume an initial projection from primary auditory areas to posterior temporal regions within this stream prior to information transfer to the aTL.

However, neither of these potential solutions succeeds in finding a possible common denominator between combinatorics and conceptual processing and, thus, in defining a unifying functional interpretation of the ventral stream. As already noted in Section 2.3, it is of course an empirical question whether there is indeed such a common denominator (and Friederici's approach clearly negates this assumption). Nevertheless, as outlined above, we consider it a worthy enterprise to examine whether it might be possible to formulate such a common functional denominator. One could, of course, attempt to bridge the different domains by positing a very broad unifying function such as ''language comprehension'' (as opposed to production via the dorsal stream). Yet, an explanation along these lines in turn raises some potentially concerning questions about the role of the dorsal stream in language comprehension. These will be the subject of the next puzzle.

3.3. (Why) do we need a dorsal stream for language comprehension?

Returning to the sequential and parallel solutions for the ventral stream that were discussed in the last subsection, both approaches have in common that they situate both combinatory and semantic processing within the ventral pathway. Thus, in spite of their differences, the two accounts face a similar problem: (Why) do we need a dorsal pathway for language comprehension? If both combinatory and conceptual aspects of the comprehension process are handled within the ventral stream, then this should be sufficient for comprehension in general. Additional processes appear superfluous.

Prima facie, Friederici's parallel account of the ventral stream offers a potential solution to this problem: The dorsal stream (more precisely, dorsal pathway II8) only comes into play during the comprehension of syntactically complex sentences (i.e. sentences involving either (an) embedded clause(s) or deviations from the basic word order). However, this approach is subject to both conceptual and empirical problems. Conceptually, how does the processing system determine whether a sentence is ''simple'' or ''complex'' syntactically? Is there a threshold which controls whether a particular sentence or part of a sentence should be processed via the ventral stream only or whether dorsal stream involvement is required and, if so, how and where is this decision made? In the latest version of Friederici's account (Friederici, 2012), the burden of these ''decisions'' appears to be placed on BA 44, which is described as playing ''a particular role in creating argument hierarchies as a sentence is computed'' and as the ''core syntax region'' (Friederici, 2012, p. 265). From this perspective, BA 44 would seem to provide the crucial interface between simple and complex syntax on the one hand and the ventral and dorsal pathways in syntactic processing on the other. However, Friederici also emphasises the central role of the connectivity provided by her dorsal pathway II in allowing for the processing of syntactically complex sentences on the basis of developmental studies (Brauer, Anwander, & Friederici, 2011). The crucial argument here is that, in contrast to dorsal pathway I, dorsal pathway II matures relatively slowly and that this correlates with problems in the processing of word order variations in children.

8 As for the ventral stream, Friederici assumes two functionally and anatomically separable dorsal sub-pathways. Dorsal pathway I, which connects the posterior temporal lobe to the premotor cortex via the arcuate fascicle (AF)/superior longitudinal fascicle (SLF) corresponds to the dorsal stream assumed to perform auditory-motor mappings in other accounts (e.g. Hickok & Poeppel, 2007; Rauschecker & Scott, 2009; Saur et al., 2008; Ueno et al., 2011). Dorsal pathway II, which is crucial here, provides a connection between the posterior temporal lobe and BA 44 (also via the AF/SLF) and, in Friederici's view, is crucial for the processing of complex syntax.

Assuming that the primary role of Friederici's dorsal pathway II lies in allowing for the processing of syntactically complex sentences (as claimed in Friederici, 2009, 2011), this approach is challenged empirically by the observation of activation changes within the dorsal stream even in simple sentences, i.e. sentences without embeddings and adhering to basic word order. These have been observed, for example, in the posterior superior temporal sulcus (pSTS) in response to increased role assignment demands (''who is acting on whom?'') (Bornkessel, Zysset, Friederici, von Cramon, & Schlesewsky, 2005; Grewe et al., 2007) and the posterior superior temporal gyrus (pSTG) for aspects of verb-argument structure processing (Shetreet, Palti, Friedman, & Hadar, 2007). In the latest version of her approach, Friederici (2012) addresses this problem by positing that dorsal pathway II may also be involved in ''deliver[ing] predictions to the posterior temporal cortex in a top-down manner'' (Friederici, 2012, p. 265) (e.g. regarding verb valency based on the number of arguments already encountered). It also provides the necessary prerequisite for syntax-semantics integration, which she envisages as taking place in posterior temporal cortex, by allowing for an information flow of syntactic information from BA 44 to the temporal lobe. While these assumptions provide a potential explanation for why dorsal stream activation should also be observed in the comprehension of syntactically simple sentences, they also raise more questions. In particular, how do young children accomplish syntax-semantics integration and the processing of valency information if their dorsal pathway II has not yet matured sufficiently for the processing of complex syntax? Also, if dorsal pathway II is also crucially involved in the processing of syntactically simple sentences, does this not undermine the assumption of a functional separation between a ventral and a dorsal pathway for processing simple versus complex syntax, respectively? In Section 4.2 below, we outline an alternative view of syntactic processing within a dorsal-ventral streams perspective that circumvents some of these problems.

The observation of comprehension-related activation changes within the dorsal stream also challenges the sequential account of the ventral stream (Hickok & Poeppel, 2007; see Section 3.2) as well as all proposals of a principled dichotomy between production and comprehension in the dorsal versus ventral streams (Saur et al., 2008; Ueno et al., 2011). Hickok and Poeppel (2007) address this issue by assuming that posterior temporal regions in fact form part of the ventral stream (as noted above); thus, comprehension-related activations in these posterior temporal regions are to be expected in their account. Their proposal does not, however, explain comprehension-related activations near the temporo-parietal junction (TPJ) or in parietal cortex.

In summary, existing accounts do not provide a satisfactory explanation for why we seem to require a dorsal stream for the comprehension of even simple sentences.

4. A new model

In the following, we will outline a proposal that, in our view, could potentially provide a unified interpretation of the results at the word and sentence levels and, thereby, address the puzzles outlined in the previous section. Empirically, the main puzzle appears to be why we see dorsal stream activation even in the comprehension of simple sentences. We will claim that this observation can be explained if we assume that both the ventral and dorsal streams generally engage in sentence comprehension, but with differing functions. How, then, might one characterise the differing representations or mechanisms involved? A closer look at the computations thought to be performed by the dorsal and ventral streams reveals the following. Time-dependent processing is typically associated with the dorsal stream, e.g. in terms of ''encoding and storing sound sequences'' (Scott & Wise, 2004, p. 27) or in supporting working memory (Saur et al., 2008). The ventral stream, by contrast, is viewed as more time-independent in most approaches (cf. Rauschecker's auditory objects perspective that was discussed in detail above, or the reference to time-invariant semantic representations in Section 3.2; Ueno et al., 2011). Thus, the basic subdivision between time-independent versus time-dependent processing in the ventral and dorsal streams, respectively, is one key insight underlying our approach. The second is that this subdivision can be couched naturally within an architecture that brings together insights gleaned from time-sensitive, cross-linguistic investigations of sentence processing (Bornkessel-Schlesewsky & Schlesewsky, 2009b) with Rauschecker's central neurobiological assumption of hierarchical processing within the auditory system (Rauschecker, 1998; Rauschecker & Scott, 2009). Note that, although our arguments will be based entirely upon speech and language processing, we do not mean to suggest that the proposed mechanisms (hierarchical processing and the crucial distinction between time-dependent and time-independent information processing) are exclusive to language. Rather, we would assume (see also Footnote 12) that these mechanisms are compatible with domain-general functions of information processing within the two streams.

The principal claims of our proposal are as follows (for further elaboration of each claim, see the following subsections):

(a) Hierarchical processing in the ventral and dorsal streams. We assume that information processing within both the ventral and the dorsal streams is organised hierarchically (emanating from PAC and projecting antero-ventrally and postero-dorsally), but that it subserves differing functions in the form-to-meaning mapping.

(b) Time-independent (ventral) versus time-dependent (dorsal) computations. While sentences may be considered (relatively large) auditory objects and their recognition as such is thus plausibly mediated by the ventral stream, they are also sequences of categories encountered in time. We propose that this time-dependent aspect of sentence processing is accomplished by the dorsal stream. This principled separation is in accordance with the cross-linguistically motivated proposal within the eADM that syntax (category sequences) and sentential semantics are logically independent of one another (Bornkessel & Schlesewsky, 2006; Bornkessel-Schlesewsky & Schlesewsky, 2008a, 2009b).9

9 This assumption, while not shared by mainstream Chomskyan generative grammar, is widely held otherwise in grammatical theories, e.g. in Lexical Functional Grammar (Bresnan, 2001), Role and Reference Grammar (Van Valin, 2005), Jackendoff's tripartite approach (Jackendoff, 2002) and the simpler syntax framework (Culicover & Jackendoff, 2005). This separation has: (a) been argued to be more adequate for the analysis of different languages, including languages with vastly different characteristics from English (Van Valin & LaPolla, 1997); and (b) been shown to be computationally tractable in computational linguistics applications (for a recent overview of computational applications of Lexical Functional Grammar, see Forst, 2011). In the domain of language processing, and specifically of neurocognitive models of language comprehension, it has been advocated most strongly within the extended Argument Dependency model (eADM), a model which aspires to account for cross-linguistic similarities and differences in the neural bases for language processing at the sentence level (Bornkessel & Schlesewsky, 2006; Bornkessel-Schlesewsky & Schlesewsky, 2008a, 2009b). It is also inherent, to some degree, to the unification-based philosophy of Hagoort's Memory, Unification and Control (MUC) framework (Hagoort, 2003, 2005), though the syntactic representations assumed by Hagoort (based on Vosse & Kempen, 2000) do include functional nodes such as ''subject'' and ''object''.

(c) The function of the ventral stream lies in the time-independent identification and unification of conceptual schemata, serving to represent conceptual chunks of increasing size. The property of time-independence refers to the fact that schema combination (unification) is not dependent upon the sequence in which the schemata are activated.

(d) The function of the dorsal stream lies in the identification and combination of successively larger linguistic chunks in time. This comprises the prosodic segmentation of the input and the subsequent combination of elements into category sequences. In addition, it involves the computation of all time-dependent sentence-internal relations (e.g. computing which participant is the actor, i.e. the participant primarily responsible for the state of affairs described).

(e) Frontal cortex does not subserve linguistic processing functions. In accordance with the notion of hierarchical processing, we propose that linguistic processing per se only takes place in temporal and parietal regions, but not in frontal cortex. Frontal cortex subserves control functions only and serves to link linguistic processing to behaviour. Moreover, it serves to integrate information from the ventral and dorsal streams and to provide top-down feedback information to each stream.

The overall architecture resulting from these assumptions is shown in Fig. 1.

As is apparent from Fig. 1, the ventral stream in our proposal is responsible for the identification of ''actor-event schemata'' and their unification. As we will describe in more detail below, these schemata essentially correspond to category-neutral semantic representations at the word level. In our approach, they are thereby the key auditory objects that are identified and combined hierarchically to form further auditory objects of increasing internal complexity. Thus, the ventral stream is responsible for building up a sentence-level semantic representation. In accordance with the basic assumptions of the eADM (and of several theories of grammar, see Footnote 9), we posit that this proceeds independently of and in parallel to the establishment of a syntactic (constituent) structure. Our assumptions can therefore be formulated at two distinct levels. In general terms, we claim that the ventral stream is responsible for the identification of conceptually meaningful auditory objects of increasing complexity (operationalised here via the unification of conceptual schemata). In more specific terms, we posit that the schemata involved in this process have particular properties (see below and, in particular, Appendix A).

The dorsal stream, by contrast, is responsible for the time-dependent combination of elements, including segmentation into prosodic units (prosodic words), their combination into a syntactic representation and understanding the action described. Finally, the two streams are integrated in frontal cortex (premotor cortex and inferior frontal gyrus, IFG); we further assume that these frontal regions are responsible for resolving conflicts between streams. In the following subsections, we will describe the different components of the model in more detail as well as empirical evidence supporting them.

4.1. Identifying and combining auditory objects in the ventral stream: Actor-event schemata

If the typical view that the ventral stream processes time-independent (or time-invariant) representations is correct, this stream - by definition - cannot compute syntactic representations. Accordingly, we claim that there is no need to assume syntactic processing within the ventral stream. How, then, might we account for combinatory effects within the ventral stream (see Section 3.2)? We posit that these follow straightforwardly from the assumption that sentences are merely complex auditory objects that are constructed via the combination of less complex auditory objects (cf. DeWitt & Rauschecker, 2012). This can be accomplished by means of relatively simple schemata, which we term actor-event (AE) schemata. These schemata, which are illustrated in Fig. 2, are suited for the purposes of auditory object identification at the word level and above, because they (a) provide word-level semantic information as well as idiosyncratic lexical restrictions; and (b) allow for schema combination via unification, i.e. in a time-independent manner in accordance with the functional designation of the ventral stream.

AE-schemata have several crucial characteristics: they are category neutral (i.e. not designated as nouns or verbs, for example); they are also actor-centred, as the name suggests (i.e. focus more strongly on the person or thing responsible for the event than the other event participants). Both assumptions are based on theoretical and empirical motivations and are formulated to ensure applicability across typologically different languages. However, since the more general claim - namely that the time-independent identification and combination of conceptual schemata follows naturally from the specification of the ventral stream assumed here - is most central for present purposes, these more specific assumptions will be described and motivated in more detail in Appendix A.

AE-schemata are unified with one another to provide more complex semantic representations. Schema unification thus allows for semantic combinatorics at the phrasal and sentential levels. Fig. 3 provides some examples of AE-schemata created via unification during incremental sentence comprehension. In a nutshell, unification occurs by incorporating one schema into a slot (e.g. ''who'' or ''what'') of another. As in standard definitions of unification (see e.g. Pollard & Sag, 1994), this is possible when the slot is either currently unfilled (e.g. Fig. 3D, in which ''the girl'' is unified with the previously unfilled ''with whom'' slot of the schema in Fig. 3A) or when there is a principled compatibility between the previous representation and the new representation (e.g. Fig. 3B, in which the ''paint'' schema in the ''what'' slot is replaced by ''kiss'', which is possible because the two share the same action type).10 Further details on AE-schema unification, including detailed descriptions of the examples in Fig. 3, are provided in Appendix A.
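The two unification conditions just described can be sketched in a few lines of Python. This is only an illustrative toy, not the representation specified in Appendix A: the slot names are adapted from Fig. 3, whereas the `Filler` type, the `action_type` tag, the `unify` function and the numeric cost are assumptions introduced here for exposition. The sketch shows that filling previously unfilled slots yields the same schema regardless of the order in which the fillers arrive, and that a filler may replace another one of the same action type at some cost (cf. Footnote 10).

```python
from typing import Dict, NamedTuple, Optional, Tuple

SLOTS = ("who", "what", "with_whom", "where", "when")


class Filler(NamedTuple):
    label: str                         # e.g. "PAINTER" or "kiss"
    action_type: Optional[str] = None  # hypothetical tag, set only for event fillers


Schema = Dict[str, Optional[Filler]]


class UnificationError(Exception):
    """Raised when a filler is incompatible with the current slot content."""


def empty_schema() -> Schema:
    return {slot: None for slot in SLOTS}


def unify(schema: Schema, slot: str, filler: Filler) -> Tuple[Schema, int]:
    """Return a new schema with `filler` in `slot`, plus an illustrative cost."""
    current = schema[slot]
    result = dict(schema)
    if current is None or current == filler:
        result[slot] = filler
        return result, 0   # unfilled slot (or identical filler): no cost
    if current.action_type is not None and current.action_type == filler.action_type:
        result[slot] = filler
        return result, 1   # same action type: replacement with a switch cost
    raise UnificationError(f"{filler.label} cannot replace {current.label} in '{slot}'")


painter = Filler("PAINTER")
girl = Filler("GIRL")
paint = Filler("paint", action_type="activity")  # assumed shared action type
kiss = Filler("kiss", action_type="activity")

# The order in which fillers arrive does not affect the resulting schema.
a, _ = unify(empty_schema(), "who", painter)
a, _ = unify(a, "with_whom", girl)
b, _ = unify(empty_schema(), "with_whom", girl)
b, _ = unify(b, "who", painter)
order_independent = (a == b)

# "paint" in the what-slot is replaced by "kiss", which incurs a cost.
expected, _ = unify(a, "what", paint)
updated, switch_cost = unify(expected, "what", kiss)
```

In this toy version, an attempt to place a second, incompatible filler into an occupied slot raises `UnificationError`; how such failures relate to measurable processing costs is an empirical matter that the sketch does not address.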

The conception of AE-schemata and their unification provides a unified explanation for the two seemingly disparate functions of the ventral stream in previous work, namely the mapping from (auditory) form to meaning and the combinatorics of linguistic entities, respectively. As AE-schemata are complete word-level semantic representations, it is clear how their identification during language processing can account for the form-to-meaning mapping at the word level, thus explaining the findings by Saur et al. (2008), DeWitt and Rauschecker (2012) and the results on semantic dementia (Ueno et al., 2011). Furthermore, AE unification accounts for the various data that have been cited as evidence for combinatory structure in the aTL: increased activation of this region for (i) sentences versus word lists (Bottini et al., 1994; Friederici et al., 2000; Mazoyer et al., 1993; Stowe et al., 1998; Xu et al., 2005); (ii) words encountered as part of a phrase as opposed to a non-combinatory context (Bemis & Pylkkänen, 2011); (iii) syntactically more versus less complex sentences as defined via word-by-word measures on various complexity metrics (Brennan et al., 2012); and (iv) category violations at the phrasal level (Friederici, 2002, for a review). Observations (i) through (iii) are relatively straightforward in that they can all be explained in terms of more versus fewer unification operations (or unification versus no unification). Observation (iv), however, requires a little more explanation, since it may not be readily apparent how the unification of category-neutral schemata can give rise to (apparently) word category-based combinatory violations. For a brief illustration, consider the example Das Eis wurde im gegessen (''The ice cream was in-the eaten''; from Friederici, 2002), in which the participle gegessen induces a category violation when it is encountered after the preposition-determiner amalgamation im. This violation is typically explained in terms of a phrase structural mismatch, because a verb cannot follow a preposition plus determiner. In AE-schema terms, however, the explanation is somewhat different. Here, a determiner such as ''the'' is represented as a modifier for a referential slot within an AE-schema (preferably for the who-slot in accord with the actor preference). When a verb form (i.e. a filler for a what-slot within the AE-schema) is subsequently encountered, the unification requirements for the determiner are not met and unification fails. Hence, increased anterior temporal activation or an early left anterior negativity (ELAN) results. Essentially, combinatory effects of this type arise from a mismatch between bottom-up and top-down information sources: bottom-up information designates how likely a schema is to be used for a particular slot (e.g. gegessen as a what-candidate via the participial prefix ge- and prior experience with this word form), while top-down information consists in a sequence-based prediction from the dorsal stream (see below) about an upcoming category. Finally, these assumptions regarding schema unification as a source for combinatorial effects in language account for the observation of deficits in simple linguistic combinatorics for patients with anterior temporal lesions (Dronkers, Wilkins, Van Valin, Redfern, & Jaeger, 2004).

10 Nevertheless, we assume that the switch from one action to another of the same type is associated with some cost (which would manifest itself, for example, as an increased N400 effect in electrophysiological terms). For further details, see Appendix A.

Fig. 2. Sample actor-event schemata (AE-schemata). For details, see the main text and, in particular, Appendix A.

[Fig. 3 panel contents, as far as recoverable: AE-schemata for ''The painter kissed the girl.'' and ''The painter the girl...'', each with the slots Who? (PAINTER, see Fig. 2), What? (do(x1, kiss'(x1, y2)) and do(x1, paint'(x1, y2)), respectively; GIRL, see Fig. 2), With whom?, Where? and When?]

Fig. 3. The unification of AE-schemata. For details, see the main text and, in particular, Appendix A.

4.2. Combining elements in a time-dependent manner in the dorsal stream

As outlined in the previous section, we envisage the function of the ventral stream as one that involves the time-independent (unification-based) combination of semantic representations. The dorsal stream, by contrast, engages in the time-dependent segmentation and combination of elements. As is apparent from Fig. 1, we assume that this comprises (a) the segmentation of the input into prosodic words, (b) the combination of these elements into a syntactic structure, and (c) the assessment of the elements in this structure in action-related terms (''who is responsible for the event described''). This characterisation of the dorsal stream is broadly in line with the following proposal by Scott and Wise (2004, p. 27): ''[A] stream of processing directed through TpT cortex, including the planum temporale, responsible for encoding and storing sound sequences and acting as a sensori-motor interface for mimicry is the 'how' system of speech and its acquisition.'' We disagree, however, with the common view that this stream is responsible primarily for articulation and repetition, rather than comprehension of speech and language: as outlined above, many empirical findings attest to the fact that language comprehension draws upon the dorsal stream and that this is the case even in simple sentences. The solution that we propose is thus as follows: the dorsal stream indeed processes sound sequences, but does so within both production and comprehension. (For a similar view, see Wise et al. (2001, p. 92) who posit that the left posterior STS ''transiently represent[s] the temporally ordered sound structure of words, both heard words (the external source) and words retrieved from lexical memory (the internal source)'').

We thus concur to some extent with Saur et al. (2008), who link the dorsal stream in language comprehension to working memory: "involvement of the dorsal stream for processing of complex syntactic operations might be partially explained as a result of an increase in syntactic working memory load" (Saur et al., 2008, p. 18039). In contrast to Saur and colleagues (and Friederici, 2009), however, we do not view complexity of the sentences in question as a prerequisite for the involvement of the dorsal stream. There are at least three reasons for this claim:

(a) Working memory is always required when a sequence of more than two elements is processed in time. Current assumptions about working memory suggest that only one element (the most recent) remains in the focus of attention, while all less recent items must be retrieved from memory (for reviews, see Jonides et al., 2008; McElree, 2006). There is compelling evidence that this general mechanism also applies during sentence comprehension (e.g. Martin & McElree, 2008; McElree, Foraker, & Dyer, 2003; Van Dyke & McElree, 2006) and even in very simple sentences. Notably, Wagers and McElree (2011) show that displacement from the focus of attention occurs even within phrases, e.g. when a noun is separated from a determiner by a modifying adjective (as in the risk-taking burglar). There is no evidence to date that the retrieval mechanisms involved in phrase-level processing of this type differ from those required for complex sentences. Moreover, there is no principled reason as to why such a difference should exist; rather, it is the dichotomy between focus of attention and retrieval that manifests itself across the board.

(b) A similar argument holds for syntactic operations. In all existing theories of grammar, syntactic mechanisms are identical between "simple" and "complex" sentences (i.e. complex sentences may involve an increased number of iterations of these operations, but not the application of qualitatively different principles). Thus, there is no theoretical linguistic basis for the dichotomy between simple and complex syntax (for detailed discussion, see Schlesewsky & Bornkessel-Schlesewsky, 2013).

(c) As already noted in Section 3.3, there is no operationalised definition for when a sentence counts as complex as opposed to simple. Thus, there is no principled way of predicting whether a given sentence should require dorsal stream involvement or not. This introduces the danger of circularity, namely the inference that a sentence must be complex if it engenders dorsal pathway activation.

The assumption that sentence comprehension generally involves the dorsal stream in addition to the ventral stream thus follows from (a) and solves problems (b) and (c). If this is the case, then how are the mechanisms underlying sentence comprehension in the dorsal pathway to be envisaged? In this regard, we will discuss the three functions shown in Fig. 1 in turn.

4.2.1. Prosodic segmentation of the input

A number of existing findings attest to the involvement of posterior temporal regions within the dorsal pathway in prosodic segmentation. For example, Meyer, Steinhauer, Alter, Friederici, and von Cramon (2004) reported that activation in the posterior STG and the planum temporale is modulated by intonational properties (listening to normal versus "flattened" speech). Similarly, Ischebeck, Friederici, and Alter (2008) observed increased activation in Heschl's gyrus, mid to posterior STG and the rolandic operculum for the processing of sentences with two as opposed to one intonational phrase boundary. Posterior temporal and parietal regions have further been implicated in the processing of speech rhythm (Geiser, Zaehle, Jancke, & Meyer, 2008), an information source which is known to be important for the segmentation of the speech stream into words (e.g. Cutler & Norris, 1988), in the discrimination between speech and non-speech sounds based on temporal modulations of the input (Zaehle, Geiser, Alter, Jancke, & Meyer, 2008), and in auditory stream segregation (Cusack, 2005).

4.2.2. Syntactic structuring

Let us now turn to what will likely be the most controversial aspect of our present proposal, the assumption that syntactic structure building relies upon posterior temporal regions as part of the dorsal stream and, accordingly, that syntax is neither associated with the aTL nor with the IFG. Our initial motivation for entertaining this assumption was theoretical: Recall that standard assumptions regarding the involvement of the lIFG in syntactic processing appear problematic from the perspective that (a) syntax is processed early in all existing models of sentence comprehension (see Section 3.1), and (b) assuming hierarchical processing as a basic neurobiological property (see Rauschecker & Scott, 2009, and Section 2.1), syntactic structuring should be more proximate to primary auditory areas than the lIFG. Situating syntactic structure-building within the temporal lobe solves both of these problems and is highly compatible with the notion that frontal cortex is a region for conflict resolution and cognitive control rather than linguistic processing per se. The additional assumption that syntactic structure-building forms part of the dorsal as opposed to the ventral stream further guarantees separable functions for dorsal versus ventral stream processing (namely time-dependent computations versus time-independent identification of auditory objects, respectively), while at the same time allowing for functionally unified interpretations of processing within each stream. It thereby solves the remaining puzzles, namely "What is the function of the ventral stream?" (3.2) and "(Why) do we need a dorsal stream for language comprehension?" (3.3).

In accordance with our cross-linguistically motivated view of syntactic structure as logically independent of sentence-level interpretation (for review, see Bornkessel-Schlesewsky & Schlesewsky, 2009b), we assume that syntactic structure-building involves the establishment of structured sequences of linguistic categories. The resulting structures are simple, binary-branching and do not involve syntactic movement. For further details on the structures themselves and how they are established incrementally, see Appendix B.

The claim that syntactic structure building is accomplished by posterior temporal regions is, in our view, supported by several empirical findings (though the authors' original interpretations of these findings were, in some cases, quite different). In an fMRI study on sentence and word list processing in Dutch, Snijders et al. (2009) contrasted category-ambiguous words with unambiguous words in sentence contexts or as part of a word list. They reasoned that main effects of ambiguity across both tasks would be indicative of lexical retrieval, while effects of ambiguity for sentences only should be observable in regions responsible for syntactic structure building (unification in their approach). In the left posterior middle temporal gyrus (lpMTG), they observed an interaction between ambiguity and task, which was due to an ambiguity effect (higher activation for ambiguous versus unambiguous conditions) in sentence but not word list processing. This result is fully compatible with our proposal, since category-ambiguous words give rise to multiple potential structuring options and higher activation of regions involved in structure-building is therefore to be expected.11 In a subsequent study using the same task, Snijders, Petersson, and Hagoort (2010) further demonstrated effective connectivity between the lpMTG and posterior lIFG, thus supporting the assumption of dorsal stream involvement. For language production, too, it has been suggested that posterior temporal regions may be involved in the linearisation of elements, i.e. in determining the linear order of syntactic constituents (Ye, Habets, Jansma, & Münte, 2011). This is an integral part of syntactic structure building.

Further recent results support the proposed dissociation between syntactic structure building as category sequencing within the dorsal stream and AE schema unification as the basis for sentence interpretation within the ventral stream. In an fMRI study on French, Pallier, Devauchelle, and Dehaene (2011) contrasted 12-item word sequences with increasing constituent sizes (i.e. from a word list of 12 single-word constituents to a single constituent consisting of 12 words) using both real French words and pseudowords. As predicted by our proposal, increasing constituent sizes correlated with increasing activation in both the posterior temporal lobe (posterior superior temporal sulcus, pSTS) and the aTL. For the pSTS, we would attribute the increased activation to iterative sequencing demands within a constituent as opposed to across constituents (where no predictions can be made for the next category). The aTL activation, by contrast, we would attribute to increasing AE-schema unification demands with increasing constituent size. Crucially, however, only AE-schema unification relies on semantically contentful representations, thus leading to the prediction that such unification demands should only be observable for elements that can be associated with (an) existing AE-schema(ta) (see the discussion of Bemis & Pylkkanen, 2011, above). Indeed, Pallier et al. (2011) found that, while the correlation between constituent size and the magnitude of the BOLD response was independent of the real word/pseudoword distinction for the pSTS, it was only observable for real words for the aTL (aSTS and temporal pole, TP). This result provides strong converging support for our proposal.

11 Snijders et al. (2009) in fact found a very similar pattern of results in left inferior frontal cortex. We propose that this inferior frontal activation can be attributed to the increased demands on cognitive control that arise when an ambiguous category needs to be integrated into a sentence (for a proposal linking ambiguity, cognitive control and the lIFG, see January et al., 2009; Novick et al., 2005) rather than from syntactic structure building (unification) as assumed by Snijders and colleagues. Note that the assumption of syntactic structuring in the temporal portion of the dorsal stream in combination with control processes in frontal cortex provides a principled explanation why both loci should show activation in a task of this kind, while an account positing that the inferior frontal activation reflects structure building does not.

4.2.3. Understanding the action described

The final role that we envisage for posterior temporal regions within the dorsal stream is to provide a bridge to action-understanding systems in the brain. In language comprehension, the posterior portion of the superior temporal sulcus (pSTS) has been linked to competition for actorhood in the sense of Appendix A: the higher the degree of competition for the actor role within a sentence, the higher the activation observed within this region (Bornkessel-Schlesewsky & Schlesewsky, 2009b; Grewe et al., 2007). Increased activation within the pSTS is also observable when assumptions about which sentence participant is the actor need to be reanalysed, e.g. due to the particular properties of a verb encountered in the clause-final position (Bornkessel et al., 2005). Importantly, the pSTS and neighbouring temporo-parietal junction (TPJ) have also been linked to the processing of agency information in non-linguistic contexts (Frith & Frith, 1999) and to the inference of others' intentions or states of mind (Saxe, 2006). In conjunction with the sensitivity of the pSTS for other cues such as biological motion, these observations have led to the suggestion that the pSTS/TPJ region is crucial for action understanding. The fact that actor-inference in language processing also correlates with activation in this region suggests that it may constitute an important interface between the neural language and action understanding systems.

These assumptions tie in well with the proposal that the dorsal "how" stream is linked to mimicry (see the quote by Scott & Wise, 2004, above). Within the scope of the eADM, we have posited that the cross-linguistic importance of the actor role may be due to the fact that the basic actor prototype is the first person singular, i.e. the self (Bornkessel-Schlesewsky & Schlesewsky, in press-a, in press-b; for the assumption of the self as an agent prototype, see also Dahl, 2008; Tomasello, 2003). Thus, identifying an actor and determining its typicality (i.e. its status as a good or bad actor) may crucially involve assessing the similarity of a sentence participant to the self as an acting agent.

Within the present proposal, the position of the actor-identification/action-understanding step as the final mechanism of understanding within the temporal part of the dorsal stream follows from the fact that a sentence participant's position within a sentence is one of several cues to actorhood. In English, this cue is extremely important, since the first participant in a subject-verb-object sequence must always be interpreted as the actor (for empirical evidence, see MacWhinney, Bates, & Kliegl, 1984). In other languages, position is not as important a cue; it nevertheless still plays a role in view of the language processing system's endeavour to identify the actor as quickly as possible. Indeed, even languages with a considerably more flexible word order than English consistently show an actor-first preference during incremental language comprehension (for reviews, see Bornkessel-Schlesewsky & Schlesewsky, 2009b; Bornkessel-Schlesewsky & Schlesewsky, in press-a, in press-b). Thus, syntactic structure building in the sense outlined above - namely, determining the sequential order of categories within successively larger syntactic constituents (see also Appendix B and Fig. 4) - is a crucial prerequisite to actor identification even though it is not the only determining factor for this operation.

Fig. 4. Syntactic structures assumed within the current approach. Lexical categories (terminal nodes), e.g. noun (N), verb (V), determiner (D), are depicted in italics. All other category labels are arbitrarily chosen in order to reflect the cross-linguistically indeterminate nature of categories at the phrasal and sentential levels. In English, for example, the processing of an initial determiner plus noun (noun phrase) may give rise to the expectation for a verb plus dependent elements (verb phrase; see D), while no such expectation is possible in the majority of languages (see, for example, Haider, 2010, for German).

In summary, we propose that the dorsal stream engages in time-dependent computations during sentence comprehension. As these types of computations are required for all linguistic constituents of a size that exceeds a single word, they are assumed to apply to all types of sentences, irrespective of their assumed complexity. As in the ventral stream, processing adheres to the fundamental principle of hierarchical organisation, with processing successively further downstream from PAC serving to compute successively more complex representations: prosodic chunking is a necessary prerequisite for syntactic structure building (category sequencing), which, in turn, is a necessary information source for actor identification. Thus, both streams share crucial characteristics, but differ with regard to the types of computations that they perform in the comprehension process.12

12 An architectural consequence of these assumptions is that there are, indeed, only two macroscopically diverging processing streams, the overarching functions of which can be described - in the speech and language domain - as time-dependent sequence processing (dorsal stream) and time-independent identification of auditory objects (ventral stream). Crucially, though we have focused exclusively on speech and language here, we assume that these characterisations are compatible with the domain-general functions of the two streams (i.e. with regard to information processing of any kind). Furthermore, while we would not want to exclude the possibility that, in addition to anatomical subpathways, there are also functional substreams within each stream, the present account predicts that these should conform with the assumed overarching functions of their respective stream.

4.3. Converging streams: the role of frontal cortex

As already discussed in detail above, one crucial component of the present proposal is that frontal cortex (and particularly the lIFG) does not perform any linguistic processing proper. Rather, all language-inherent processes (though these need not necessarily be specific to language) are assumed to take place within the temporal (and parietal) regions that form part of the dorsal and ventral streams.13 The role of frontal cortex, by contrast, is assumed to be related to cognitive control and conflict resolution, following a proposal by Thompson-Schill and colleagues (e.g. Novick et al., 2005; Thompson-Schill, D'Esposito, Aguirre, & Farah, 1997; Thompson-Schill et al., 2005). Part of this control-based mechanism is, in our view, to bring together the different representations generated by the dorsal and ventral streams,14 i.e. the time-independent semantic representation (ventral; see Fig. 3) with the time-dependent and actor-related structure (dorsal; see Fig. 4) as a prerequisite for action planning and execution via the premotor cortex. In this respect, we assume that cognitive control mechanisms in frontal cortex are structured in a hierarchical anterior-to-posterior gradient (from frontopolar to premotor cortex),15 with more anterior regions performing control mechanisms that draw upon successively more remote information (Koechlin & Summerfield, 2007; Koechlin et al., 2003). We have recently proposed that this control gradient can also be applied to language processing such that successively more anterior regions within the gradient correlate with control operations for linguistic information units of increasing scope, i.e. from single word-based information up to pragmatic appropriateness (Bornkessel-Schlesewsky & Schlesewsky, 2012; Bornkessel-Schlesewsky et al., 2012). In accordance with Koechlin and colleagues' proposal, this means that less local (more anterior) control signals can override more local (more posterior) ones in determining action planning and execution. Thus, pragmatic requirements can override literal meaning, for example, as in the contextually appropriate interpretation of it's rather cold in here as a request to close the window. This perspective of hierarchically structured cognitive control processes in lateral frontal cortex in addition to linguistic processing proper in temporal/parietal regions has a number of advantages:

(a) It allows for top-down feedback based on the convergence of both streams, which can modulate the processing of the next input item (word). For example, as already noted briefly in Section 4.1, the prediction for an upcoming word category that stems from syntactic structuring within the dorsal stream can be used to constrain AE-schema unification for the next word within the ventral stream, specifically whether an element is assumed to play a predicating ("what") or a referential ("who" or "with whom") role.

(b) It can explain seemingly paradoxical observations regarding the hierarchical structure of language comprehension mechanisms. As already pointed out in Bornkessel-Schlesewsky et al. (2012), the proposal that pragmatic information can override more local linguistic information - as required by the frontal gradient - does not appear compatible with observations about the timing of language processing. However, this problem can be overcome if one assumes that linguistic processing per se takes place in regions other than the inferior frontal cortex and only the frontal control processes apply according to this hierarchy. The same logic can be applied to explain the following apparent paradox regarding the relation between basic combinatory operations and semantic interpretation. On the one hand, neurolinguistic findings suggest that, when a word category cannot be integrated into the current sentence context, this blocks subsequent semantic integration (i.e. an early left-anterior negativity, ELAN, "blocks" an N400, see Friederici, 2002).16 Nevertheless, we can understand sentences involving a category error and interpret them as action instructions: for example, when addressed by a tourist with "Please, where subway?", we can point him/her in the direction of the next subway station in spite of the category error that is induced by the fact that the wh-pronoun is not followed by a verb. This follows straightforwardly from the present proposal. The absence of an N400 following an ELAN results from the fact that both semantic integration and basic combinatorics are aspects of AE-schema unification within the ventral stream (see Section 4.1). When schema unification fails,17 no semantic integration takes place between the different schemata. Thus, the ventral stream passes unintegrated schema representations on to frontal cortex, where (i) conflict resolution is attempted; and (ii) interpretation proceeds following general communicative requirements as applied to the associated schema representations. This can lead to action execution, e.g. giving appropriate instructions to the tourist, even though the dorsal stream has failed to produce a coherent output.

(c) Conflict resolution, as described under (b), is viewed as involving the IFG irrespective of the linguistic domain that is involved in generating the conflict. This is in line with recent findings showing that syntactic and spelling violations both lead to increased lIFG activation, i.e. that the lIFG does not appear to show a specialisation for particular linguistic domains in conflict resolution (van de Meerendonk, Indefrey, Chwilla, & Kolk, 2011).

(d) It is compatible with a range of observations on deficit-lesion correlations in language comprehension. While an association between Broca's region and syntax was long propagated on the basis of reports of agrammatism in Broca's aphasics (e.g. Caramazza & Zurif, 1976; see also Grodzinsky, 2000), it is empirically highly problematic. Firstly, there is no one-to-one correlation between a diagnosis of Broca's aphasia and a lesion in Broca's region (e.g. Caplan, 2000; Dick & Bates, 2000). Secondly, and relatedly, symptoms related to agrammatism have been reported for a range of aphasic syndromes and lesion sites (e.g. Caplan, 2000; Dick & Bates, 2000; Penke & Wimmer, 2012), thus leading Dick and Bates (2000, p. 29) to conclude that "this profile has absolutely no localizing value". Thirdly, recent studies employing voxel-based lesion-symptom mapping (VLSM, Bates et al., 2003) to examine correlations between lesion site and linguistic performance in relatively large samples of patients point to temporal rather than inferior frontal lesions as detrimental to syntactic aspects of language comprehension (English: Dronkers et al., 2004; Icelandic: Magnusdottir et al., 2012).18

On a final note, we would like to point out that the top-down feedback mechanism based on frontal cortex is only one of two feedback mechanisms assumed within the present framework. While we have focused mainly on feedforward mechanisms within the temporal parts of the dorsal and ventral pathways within the present paper, we assume that these connections are bidirectional as well. In contrast to the feedback via frontal cortex, however, which we assume to apply across items (i.e. from one word to the next), intra-temporal feedback can occur within items (i.e. within the same word) as well (see also Footnote 16). The neuroanatomical basis for this intra-temporal connectivity will need to be specified in more detail in future research.

13 While we have not discussed parietal regions here, we assume that they play an important role in linking individual sentences to the broader discourse via further relational categories at the sentence level and an interaction with attentional systems (Bornkessel-Schlesewsky & Schlesewsky, in press-a, in press-b).

14 We thus agree, to some degree, with Hagoort's (2005) proposal that the lIFG plays a crucial role in the integration of information from different linguistic domains. In contrast to Hagoort, however, we do not assume that processing within a domain (e.g. syntactic structuring or unification of semantic representations) is accomplished in inferior frontal regions. These aspects of processing are all accomplished within temporal/parietal regions in the present account.

15 Possibly, premotor cortex should be viewed as playing a dual role, namely: (a) in the dorsal pathway's auditory-to-motor mapping, and (b) as part of frontal control structures. The extent to which these two functions can be dissociated empirically remains an interesting question for future research.

16 For a detailed discussion of why findings of an N400 preceding an anterior negativity when the semantic processing problem can be recognised prior to the category violation (van den Brink & Hagoort, 2004) are not a counterexample to this claim, see Bornkessel-Schlesewsky and Schlesewsky (2009a) and Footnote 5.

17 In this particular example, the unification failure results from the interplay of top-down information (the expectation for a verb based on the previous processing of the wh-pronoun within the dorsal stream) and bottom-up information (the experience-based information that subway is not typically used as a "what" category in the sense of an AE-schema in English) in combination with the preceding wh-pronoun.

18 Interestingly, both of these studies point to a correlation between the anterior (rather than the posterior) temporal lobe and syntactic comprehension deficits (e.g. in the processing of non-actor-initial sentences; Magnusdottir et al., 2012). At first glance, this might appear problematic for our proposal. However, recall first of all that sentence-level semantic interpretation is associated with AE-schema unification in our account, thus rendering an association between comprehension deficits and the aTL expected. There are several possibilities as to how the precise deficits observed might come about. On the one hand, as noted in Appendix A, AE-schema unification proceeds according to a number of non-syntactic heuristics, e.g. degree of association between different schemata and the preference to fill the actor (who) slot in accordance with the actor-oriented nature of the schemata. This could very plausibly result in deficits in the comprehension of non-actor-initial sentences - irrespective of any purported levels of syntactic complexity. On the other hand, deficits of this type could result from disruptions in intra-temporal lobe connectivity, i.e. between anterior and posterior temporal regions. A prediction of our account is that successful language comprehension requires a binding of the time-sensitive representations computed within the postero-dorsal temporal pathway with the time-independent representations computed within the antero-ventral temporal pathway. In addition to relying on the integrational role of frontal cortex, this could potentially be accomplished by intra-temporal interactions, assuming that the appropriate anatomical connections exist.

5. Summary and conclusion

In the present paper, we have outlined a new framework for sentence comprehension within the dorsal and ventral streams based on an extension and neuroanatomical specification of the extended Argument Dependency Model (Bornkessel & Schlesewsky, 2006) in conjunction with well-established principles of neurobiological organisation such as hierarchical processing within the auditory system (Rauschecker, 1998; Rauschecker & Scott, 2009). By proposing that language-inherent processing mechanisms are confined to the temporal lobe (and possibly the parietal lobe, though this was not discussed here), with frontal regions only accomplishing more general aspects of cognitive control in the service of action planning and execution, we have resolved a number of theoretical and empirical puzzles within the existing dorsal–ventral streams literature. Specifically, we have addressed previous incompatibilities between time and space and inconsistencies in the specification of dorsal and ventral stream functions. While we acknowledge that the present proposal will need to be spelled out in a greater degree of neuroanatomical detail in the future, we believe that it provides a promising initial step forward in the endeavour to model the relationship between language and the brain at the sentence level and above.


Parts of the research reported here were supported by the LOEWE programme (funded by the German state of Hesse) as part of the project "Exploring fundamental linguistic categories".

Appendix A. Actor-event (AE) schemata: properties and unification

In the following, we describe the AE-schemata posited here in more detail. To this end, we first motivate the properties of category neutrality and actor-centredness, which characterise the schemata themselves (Appendix A.1), before going on to describe the mechanisms of schema unification (Appendix A.2).

A.1. Properties of AE-schemata

As noted in the main text, we assume that AE-schemata are category neutral and actor-centred. Both of these characteristics will be motivated in detail in the following.

Category neutrality posits that AE-schemata are not specified lexically for a particular word category (part of speech). This means that the "paint" schema in Fig. 2, for example, equally applies to the verb to paint and the noun painter. In the verb case, the "what" part of the schema, which describes the action or state of affairs, is relevant; in the noun case, by contrast, it is the "who" part of the schema, i.e. the person or thing performing the action or implicated in the state of affairs. The motivation for assuming category-neutral schemata of this type is threefold. Firstly, the long-held assumption that word categories such as noun or verb are important organising categories in the neural representation of language is not supported by the overall set of findings on this question. Specifically, category differences do not manifest themselves at the word level when meaning is controlled for (for review, see Vigliocco, Vinson, Druks, Barber, & Cappa, 2011) and only emerge in specific sentence contexts. Vigliocco and colleagues interpret this observation as being most compatible with an emergentist view of lexical categories, according to which categories are not lexically specified but rather emerge from the combination of different types of constraints. (These include: semantic prototypicality (nouns prototypically refer to objects and verbs to actions); distributional cues (nouns and verbs tend to occur in different sentence environments); morphological cues (nouns and verbs tend to differ with regard to the way in which they are marked grammatically); phonological typicality (there are subtle phonological differences between nouns and verbs).) Secondly, a similar argument can be made on the basis of cross-linguistic observations, since not all languages show a lexical distinction between different word classes, i.e. some languages allow for the same lexemes to be used as "verbs" and as "nouns" as is the case for category-ambiguous words (e.g. cut, train) in English (for a recent review, see Bisang, 2010). For this reason, many cross-linguistic approaches to word categories are also emergentist in nature (e.g. Croft, 2001). In summary, the category neutrality of the AE-schemata assumed here is compatible with psycholinguistic and neurolinguistic results on the processing of different word categories; it is also adequate for describing lexical semantics across languages.

Thirdly and finally, by representing an action and its corresponding actor using the same schema, AE-schemata naturally allow for a derivation of the ''actor-action compatibility effect'', i.e. the fact that, during language comprehension, we tend to expect that participants will perform an activity that is congruent with their own identity (Corrigan, 2001, 2002). For example, a noun that is negatively biased (e.g. murderer) is more likely to be interpreted as the causer of an event that shows a similar bias (e.g. harass) as opposed to a positively-biased event (e.g. praise). This compatibility between actor and action is inherent to the formulation of our AE-schemata, since actor and action are defined as mutually interdependent. (For more details on how this property affects schema unification, see section A.2 below.)

A.1.1. Actor centrality

The assumption of actor-centred event schemata is based on a generalisation observed in studies of sentence comprehension across a range of typologically diverse languages. In languages as diverse as German, Turkish, Chinese, Hindi and Tamil, the language comprehension system attempts to identify the person/thing primarily responsible for the state of affairs being described (the ''actor'') as quickly and unambiguously as possible (for reviews, see Bornkessel-Schlesewsky & Schlesewsky, 2009b, in press-a). This results in a preference for both actor-initiality and actor prototypicality, i.e. the first sentence participant encountered is interpreted as the actor if at all possible, even in languages in which the notions of actor and grammatical subject do not overlap; and atypical (e.g. inanimate) actors engender a cross-linguistically comparable response. As argued in detail in Bornkessel-Schlesewsky and Schlesewsky (2009b), this overall pattern of results can be explained if we assume that sentence participants compete for the actor role. Accordingly, AE-schema unification also involves an actor preference such that a participant will be preferentially integrated into the ''who'' (i.e. actor) slot if this slot is not already filled.

A.2. AE-schema unification

Here, we illustrate how AE-schema unification allows for semantic combinatorics at the phrasal and sentential levels on the basis of several examples. To this end, we will discuss both the English sentence The painter kissed the girl and the equivalent verb-final structure in German ...dass der Maler das Mädchen küsste (lit.: ''that the painter (NOM) the girl (NOM/ACC) kissed'', i.e. ''...that the painter kissed the girl''). For ease of illustration, we only discuss the unification of noun phrases (NPs) and verbs here, glossing over the details of how determiners (e.g. the) and nouns are combined to form NPs (for the representation of a determiner, see Fig. 2D).

Example 1: The painter kissed the girl. When the first NP is processed, the AE-schema for paint (Fig. 2A) is activated (AE-schema identification in Fig. 1). Via positional information, it is interpreted as a referent rather than a predicate. The positional information is provided top-down from frontal cortex based on the syntactic structure that has already been built up; see Section 4.3. In the case of a sentence-initial constituent, as here, the context is null and thereby leads to a preference for a referential rather than a predicative interpretation. (We assume that the human language processing system universally prefers referent-initial (or noun-phrase-initial) structures, as reflected, for example, in the fact that only a small proportion (approximately 10%) of the world's languages show a verb-initial basic word order; Dryer, 2005.) Following schema identification, a propositional representation is set up by means of unification (AE-schema unification in Fig. 1). This is achieved by unifying the ''paint'' schema with itself, i.e. by unifying one instance of the schema with the who-position of another instance of the schema (see Fig. 3A). This unification step reflects (a) the preference for the initial participant to be interpreted as an actor, and (b) the actor-action compatibility preference, i.e. the tendency to expect a particular actor to perform a semantically compatible action (e.g. for a painter to paint). Once the second constituent, kiss, is encountered, the AE-schema in 2C is activated and identified as a predicative usage via the past tense morpheme (-ed) and sentential position (following an initial noun phrase). While the morphological cue is processed bottom-up within the ventral pathway, since morphemes are also auditory objects, the positional cue is again a top-down cue from frontal cortex. In the ensuing schema unification step, the schema in 2C (kiss) needs to be unified with the schema in 3A (the painter). This requires that the action representation do'(x(paint'(x,y)) be replaced with do'(x(kiss'(x,y)), resulting in 3B. While this involves an initial unification conflict, the conflict is resolved rather easily, since the overall event structure is highly compatible between the two actions and only the type of activity needs to be replaced. This is in line with previous results showing that preverbal constituents are used to set up (a) expectations about the particular lexical nature of the verb (i.e. paint is more likely than kiss in the context of painter; e.g. Kutas, DeLong, & Smith, 2011) and (b) expectations about verb class (Bornkessel, Schlesewsky, & Friederici, 2003; Bornkessel et al., 2005; Bornkessel-Schlesewsky & Schlesewsky, 2008b; Demiral, 2008). Finally, when the second argument (the girl) is encountered, it is unified into the ''with whom''-position of the active schema in 3B, yielding 3C.

Example 2: ...dass der Maler das Mädchen küsste (lit: 'that the painter the girl kissed'). Here, the first noun phrase is processed exactly as in English. When the second constituent is encountered, it is integrated into the ''with whom''-position of the schema in 3A, yielding 3D. Subsequently, when the clause-final verb is reached, schema 2C is unified with 3D, again requiring the replacement of do'(x(paint'(x,y)) with do'(x(kiss'(x,y)) and thus again yielding 3C. Thus, as is apparent from these simple examples, the unification of AE-schemata accounts for the processing of sentences with different verb-argument orders in a homogeneous fashion and explains how both verbs and arguments can induce predictions about upcoming constituents (for predictability based on verbs, see for example, Altmann & Kamide, 1999; for predictability based on arguments, see Bornkessel et al., 2003, 2005; Bornkessel-Schlesewsky & Schlesewsky, 2008b; Demiral, 2008).
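
The two derivations above can be traced step by step in a small Python sketch. It is a toy illustration under strong simplifying assumptions, not an implementation of the model: the class `AESchema`, the table `SCHEMA_OF` and the function `unify` are hypothetical names introduced here, the referent/predicate label on each constituent stands in for the top-down positional cue, and the German example is given in English glosses so that the two outcomes can be compared directly.

```python
from dataclasses import dataclass, replace
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class AESchema:
    """Toy actor-event schema: an action plus two participant slots."""
    action: Optional[str] = None     # the "what" part
    who: Optional[str] = None        # the actor slot
    with_whom: Optional[str] = None  # the second participant slot


# Hypothetical mini-lexicon: word form -> name of the schema it activates.
# "girl" is deliberately absent: in this toy it predicts no action.
SCHEMA_OF = {"painter": "paint", "kissed": "kiss"}


def unify(constituents: List[Tuple[str, str]]) -> List[AESchema]:
    """Return the schema state after each constituent has been unified."""
    trace: List[AESchema] = []
    state: Optional[AESchema] = None
    for word, function in constituents:
        if function == "referent":
            if state is None:
                # Initial referent: fill the actor slot and predict the
                # referent's own action (schema unified with itself).
                state = AESchema(action=SCHEMA_OF.get(word), who=word)
            elif state.who is None:
                state = replace(state, who=word)        # actor preference
            else:
                state = replace(state, with_whom=word)  # actor slot taken
        elif function == "predicate":
            action = SCHEMA_OF[word]
            # A previously predicted action is replaced; slots are kept.
            state = (AESchema(action=action) if state is None
                     else replace(state, action=action))
        else:
            raise ValueError(f"unknown function cue: {function!r}")
        trace.append(state)
    return trace


english = unify([("painter", "referent"), ("kissed", "predicate"),
                 ("girl", "referent")])
german = unify([("painter", "referent"), ("girl", "referent"),
                ("kissed", "predicate")])
```

In this sketch the intermediate states differ between the two word orders (corresponding to 3B in the verb-medial case and 3D in the verb-final case), whereas the final states `english[-1]` and `german[-1]` are identical, mirroring the convergence on 3C.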

Appendix B. Syntactic structure-building

We envisage the mechanisms of syntactic processing as shown in Fig. 4. As is apparent from the figure, we assume simple, binary-branching structures without any kind of syntactic movement (note, for example, that the subject- and object-initial sentences shown in Fig. 4B and C have exactly the same structure apart from the order of subject and object). These types of structures are compatible with a range of assumptions in various grammatical theories, including Merge and Bare Phrase Structure from Chomskyan generative grammar (Chomsky, 1995, 2000) as well as the notion of "surface true" constituent structures without movement in approaches such as Role and Reference Grammar (Van Valin, 2005), Simpler Syntax (Culicover & Jackendoff, 2005) and Lexical Functional Grammar (Bresnan, 2001).

Fig. 4D shows how incremental structure building works. In particular, it demonstrates how a binary branching structure results as a natural byproduct of processing the words in a sentence as a temporal sequence. We assume that the category labels shown in the figure are the result of an emergentist learning procedure (see Section 4.1), in which a speaker of a particular language learns to associate particular word forms in that language with particular syntactic environments. These distributional constraints on the environment in which particular types of words occur can further be used to set up predictions for upcoming categories (as shown in Fig. 4D). These are transferred to frontal regions along the dorsal stream, from where they can be used in a top-down fashion to constrain the function (e.g. predication versus reference) of an AE-schema during unification within the ventral stream. Crucially, the position of a word or phrase within the structures shown in Fig. 4 does not determine the interpretation of that word or phrase. There is, for example, no designated "subject" position which could be taken as evidence for a noun phrase to be interpreted as the instigator of an event (the actor). Sentence-level interpretation is exclusive to AE-schema unification within the ventral stream.
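
The idea that binary branching falls out of processing words as a temporal sequence can be made concrete with a minimal sketch. It assumes, purely for illustration, that each incoming constituent is paired with the structure built so far (a left fold); this attachment direction is an assumption of the sketch and is not taken from Fig. 4D. The names `build_incrementally` and `shape` are hypothetical, and noun phrases are treated as single pre-built units.

```python
from typing import List, Tuple, Union

# A tree is either a leaf (a string) or a pair of subtrees.
Tree = Union[str, Tuple["Tree", "Tree"]]


def build_incrementally(constituents: List[str]) -> Tree:
    """Pair each new constituent with the structure built so far."""
    if not constituents:
        raise ValueError("need at least one constituent")
    tree: Tree = constituents[0]
    for item in constituents[1:]:
        tree = (tree, item)  # every step combines exactly two elements
    return tree


def shape(tree: Tree) -> Tree:
    """Replace every leaf with '_' so that only the branching remains."""
    if isinstance(tree, str):
        return "_"
    left, right = tree
    return (shape(left), shape(right))


subject_first = build_incrementally(
    ["dass", "der Maler", "das Mädchen", "küsste"])
object_first = build_incrementally(
    ["dass", "das Mädchen", "der Maler", "küsste"])
```

Under these assumptions, `shape(subject_first)` and `shape(object_first)` are equal: the two orders differ only in which noun phrase occupies which leaf, and nothing in the structure itself marks either noun phrase as the actor.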


Altmann, G. T. M., & Kamide, Y. (1999). Incremental interpretation at verbs: Restricting the domain of subsequent reference. Cognition, 73, 247-264.

Baggio, G., & Hagoort, P. (2011). The balance between memory and unification in semantics: A dynamic account of the N400. Language and Cognitive Processes, 26, 1338-1367.

Bates, E., Wilson, S. M., Saygin, A. P., Dick, F., Sereno, M. I., Knight, R. T., et al. (2003). Voxel-based lesion-symptom mapping. Nature Neuroscience, 6, 448-450.

Bemis, D. K., & Pylkkanen, L. (2011). Simple composition: A magnetoencephalography investigation into the comprehension of minimal linguistic phrases. The Journal of Neuroscience, 31, 2801-2814.

Bisang, W. (2010). Word classes. In J. J. Song (Ed.), The Oxford handbook of language typology. Oxford: Oxford University Press.

Bornkessel, I., & Schlesewsky, M. (2006). The extended argument dependency model: A neurocognitive approach to sentence comprehension across languages. Psychological Review, 113, 787-821.

Bornkessel, I., Schlesewsky, M., & Friederici, A. D. (2003). Eliciting thematic reanalysis effects: The role of syntax-independent information during parsing. Language and Cognitive Processes, 18, 268-298.

Bornkessel, I., Zysset, S., Friederici, A. D., von Cramon, D. Y., & Schlesewsky, M. (2005). Who did what to whom? The neural basis of argument hierarchies during language comprehension. Neuroimage, 26, 221-233.

Bornkessel-Schlesewsky, I., & Schlesewsky, M. (2010). Grammar and sequencing in natural language: The role of hierarchically-ordered cognitive control signals in prefrontal cortex. Poster presented at the Neurobiology of Language Conference, San Diego, CA.

Bornkessel-Schlesewsky, I., & Schlesewsky, M. (in press-a). Neurotypology: Modelling cross-linguistic similarities and differences in the neurocognition of language comprehension. In M. Sanz, I. Laka, & M. K. Tanenhaus (Eds.), The cognitive and biological basis for linguistic structure: New approaches and enduring themes. Oxford: Oxford University Press.

Bornkessel-Schlesewsky, I., & Schlesewsky, M. (in press-b). Competition in argument interpretation: Evidence from the neurobiology of language. In A. Malchukov, E. Moravcsik, & B. MacWhinney (Eds.), Competing motivations. Oxford: Oxford University Press.

Bornkessel-Schlesewsky, I., Grewe, T., & Schlesewsky, M. (2012). Prominence vs. aboutness in sequencing: A functional distinction within the left inferior frontal gyrus. Brain and Language, 120, 96-107.

Bornkessel-Schlesewsky, I., & Schlesewsky, M. (2008a). An alternative perspective on "semantic P600" effects in language comprehension. Brain Research Reviews, 59, 55-73.

Bornkessel-Schlesewsky, I., & Schlesewsky, M. (2009a). Processing syntax and morphology: A neurocognitive perspective. Oxford: Oxford University Press.

Bornkessel-Schlesewsky, I., & Schlesewsky, M. (2009b). The role of prominence information in the real time comprehension of transitive constructions: A cross-linguistic approach. Language and Linguistics Compass, 3, 19-58.

Bornkessel-Schlesewsky, I., & Schlesewsky, M. (2012). Linguistic sequence processing and the prefrontal cortex. The Open Medical Imaging Journal, 6, 47-61.

Bornkessel-Schlesewsky, I., & Schlesewsky, M. (2008b). Unmarked transitivity: A processing constraint on linking. In R. D. Van Valin, Jr. (Ed.), Investigations of the syntax-semantics-pragmatics interface (pp. 413-434). Amsterdam: John Benjamins.

Bottini, G., Corcoran, R., Sterzi, R., Paulesu, E., Schenone, P., Scarpa, P., et al. (1994). The role of the right hemisphere in the interpretation of figurative aspects of language: A positron emission tomography study. Brain, 117, 1241-1253.

Brauer, J., Anwander, A., & Friederici, A. D. (2011). Neuroanatomical prerequisites for language functions in the maturing brain. Cerebral Cortex, 21, 459-466.

Brennan, J., Nir, Y., Hasson, U., Malach, R., Heeger, D. J., & Pylkkanen, L. (2012). Syntactic structure building in the anterior temporal lobe during natural story listening. Brain and Language, 120, 163-173.

Bresnan, J. (2001). Lexical functional grammar. Oxford: Blackwell.

Caplan, D. (2000). Lesion location and aphasic syndrome do not tell us whether a patient will have an isolated deficit affecting the coindexation of traces. Behavioral and Brain Sciences, 23, 25-27.

Caramazza, A., & Zurif, E. (1976). Dissociation of algorithmic and heuristic processes in language comprehension: Evidence from aphasia. Brain and Language, 3, 572-582.

Catani, M., Jones, D. K., & ffytche, D. H. (2005). Perisylvian language networks of the human brain. Annals of Neurology, 57, 8-16.

Chomsky, N. (1995). The minimalist program. Cambridge, MA: MIT Press.

Chomsky, N. (2000). Minimalist inquiries: The framework. In R. Martin, D. Michaels, & J. Uriagereka (Eds.), Step by step: Essays in minimalist syntax in honor of Howard Lasnik (pp. 89-155). Cambridge, MA: MIT Press.

Connolly, J. F., & Phillips, N. A. (1994). Event-related potential components reflect phonological and semantic processing of the terminal word of spoken sentences. Journal of Cognitive Neuroscience, 6, 256-266.

Corrigan, R. (2001). Implicit causality in language: Event participants and their interactions. Journal of Language and Social Psychology, 20, 285-320.

Corrigan, R. (2002). The influence of evaluation and potency on perceivers' causal attributions. European Journal of Social Psychology, 32, 363-382.

Croft, W. A. (2001). Radical construction grammar: Syntactic theory in typological perspective. Oxford: Oxford University Press.

Culicover, P. W., & Jackendoff, R. (2005). Simpler syntax. Oxford: Oxford University Press.

Cusack, R. (2005). The intraparietal sulcus and perceptual organization. Journal of Cognitive Neuroscience, 17, 641-651.

Cutler, A., & Norris, D. (1988). The role of strong syllables in segmentation for lexical access. Journal of Experimental Psychology: Human Perception and Performance, 14, 113-121.

Dahl, O. (2008). Animacy and egophoricity: Grammar, ontology and phylogeny. Lingua, 118, 141-150.

Demiral, Ş. B. (2008). Incremental argument interpretation in Turkish sentence comprehension. Leipzig: Max Planck Series in Human Cognitive and Brain Sciences.

DeWitt, I., & Rauschecker, J. P. (2012). Phoneme and word recognition in the auditory ventral stream. Proceedings of the National Academy of Sciences, E505-E514.

Dick, F., & Bates, E. (2000). Grodzinsky's latest stand - Or, just how specific are "lesion-specific" deficits? Behavioral and Brain Sciences, 23, 29.

Dronkers, N. F., Wilkins, D. P., Van Valin, R. D., Jr., Redfern, B. B., & Jaeger, J. J. (2004). Lesion analysis of the brain areas involved in language comprehension. Cognition, 92, 145-177.

Dryer, M. S. (2005). Order of subject, object, and verb. In M. Haspelmath, M. S. Dryer, D. Gil, & B. Comrie (Eds.), The world atlas of language structures (pp. 330-334). Oxford: Oxford University Press.

Felleman, D. J., & Van Essen, D. C. (1991). Distributed hierarchical processing in the primate cerebral cortex. Cerebral Cortex, 1, 1-47.

Forst, M. (2011). Computational aspects of lexical functional grammar. Language and Linguistics Compass, 5, 1-18.

Frazier, L., & Clifton, C., Jr. (1996). Construal. Cambridge, MA: MIT Press.

Friederici, A. D. (2002). Towards a neural basis of auditory sentence processing. Trends in Cognitive Sciences, 6, 78-84.

Friederici, A. D. (2009). Pathways to language: Fiber tracts in the human brain. Trends in Cognitive Sciences, 13, 175-181.

Friederici, A. D. (2011). The brain basis of language processing: From structure to function. Physiological Reviews, 91, 1357-1392.

Friederici, A. D. (2012). The cortical language circuit: From auditory perception to sentence comprehension. Trends in Cognitive Sciences, 16, 262-268.

Friederici, A. D. (1999). The neurobiology of language comprehension. In A. D. Friederici (Ed.), Language comprehension: A biological perspective (pp. 263-301). Berlin/Heidelberg/New York: Springer.

Friederici, A. D., Meyer, M., & von Cramon, D. Y. (2000). Auditory language comprehension: An event-related fMRI study on the processing of syntactic and lexical information. Brain and Language, 75, 465-477.

Frith, C. D., & Frith, U. (1999). Interacting minds - A biological basis. Science, 286, 1692-1695.

Geiser, E., Zaehle, T., Jancke, L., & Meyer, M. (2008). The neural correlate of speech rhythm as evidenced by metrical speech processing. Journal of Cognitive Neuroscience, 20, 541-552.

Glasser, M. F., & Rilling, J. K. (2008). DTI tractography of the human brain's language pathways. Cerebral Cortex, 18, 2471-2482.

Grewe, T., Bornkessel-Schlesewsky, I., Zysset, S., Wiese, R., von Cramon, D. Y., & Schlesewsky, M. (2007). The role of the posterior superior temporal sulcus in the processing of unmarked transitivity. Neuroimage, 35, 343-352.

Grodzinsky, Y. (2000). The neurology of syntax: Language use without Broca's area. Behavioral and Brain Sciences, 23, 1-71.

Hagoort, P. (2003). How the brain solves the binding problem for language: A neurocomputational model of syntactic processing. Neuroimage, 20, S18-S29.

Hagoort, P. (2005). On Broca, brain, and binding: A new framework. Trends in Cognitive Sciences, 9, 416-423.

Hagoort, P. (2008). The fractionation of spoken language understanding by measuring electrical and magnetic brain signals. Philosophical Transactions of the Royal Society B, 363, 1055-1069.

Hagoort, P., & van Berkum, J. J. A. (2007). Beyond the sentence given. Philosophical Transactions of the Royal Society B, 362, 801-811.

Hahne, A., & Friederici, A. D. (2002). Differential task effects on semantic and syntactic processes as revealed by ERPs. Cognitive Brain Research, 13, 339-356.

Haider, H. (2010). The syntax of German. Cambridge: Cambridge University Press.

Hickok, G., & Poeppel, D. (2004). Dorsal and ventral streams: A framework for understanding aspects of the functional neuroanatomy of language. Cognition, 92, 67-99.

Hickok, G., & Poeppel, D. (2007). The cortical organization of speech processing. Nature Reviews Neuroscience, 8, 393-402.

Ischebeck, A. K., Friederici, A. D., & Alter, K. (2008). Processing prosodic boundaries in natural and hummed speech: An fMRI study. Cerebral Cortex, 18, 541-552.

Jackendoff, R. (2002). Foundations of language. Oxford: Oxford University Press.

January, D., Trueswell, J. C., & Thompson-Schill, S. L. (2009). Co-localization of Stroop and syntactic ambiguity resolution in Broca's area: Implications for the neural basis of sentence processing. Journal of Cognitive Neuroscience, 21, 2434-2444.

Jonides, J., Lewis, R. L., Nee, D. E., Lustig, C. A., Berman, M. G., & Moore, K. S. (2008). The mind and brain of short-term memory. Annual Review of Psychology, 59, 193-224.

Koechlin, E., Ody, C., & Kouneiher, F. (2003). The architecture of cognitive control in the human prefrontal cortex. Science, 302, 1181-1185.

Koechlin, E., & Summerfield, C. (2007). An information theoretical approach to prefrontal executive function. Trends in Cognitive Sciences, 11, 229-235.

Kutas, M., DeLong, K. A., & Smith, N. J. (2011). A look around at what lies ahead: Prediction and predictability in language processing. In M. Bar (Ed.), Predictions in the brain: Using our past to generate a future (pp. 190-207). New York: Oxford University Press.

MacDonald, M. C., Pearlmutter, N. J., & Seidenberg, M. S. (1994). The lexical nature of syntactic ambiguity resolution. Psychological Review, 101, 676-703.

MacWhinney, B., Bates, E., & Kliegl, R. (1984). Cue validity and sentence interpretation in English, German and Italian. Journal of Verbal Learning and Verbal Behavior, 23, 127-150.

Magnusdottir, S., Fillmore, P., den Ouden, D. B., Hjaltason, H., Rorden, C., Kjartansson, O., et al. (2012). Damage to left anterior temporal cortex predicts impairment of complex syntactic processing: A lesion-symptom mapping study. Human Brain Mapping.

Marslen-Wilson, W. (1987). Functional parallelism in spoken word recognition. Cognition, 25, 71-102.

Marslen-Wilson, W., & Welsh, A. (1978). Processing interactions and lexical access during word recognition in continuous speech. Cognitive Psychology, 10, 29-63.

Martin, A. E., & McElree, B. (2008). A content-addressable pointer mechanism underlies comprehension of verb-phrase ellipsis. Journal of Memory and Language, 58, 879-906.

Mazoyer, B. M., Tzourio, N., Frak, V., Syrota, A., Murayama, N., Levrier, O., et al. (1993). The cortical representation of speech. Journal of Cognitive Neuroscience, 5, 467-479.

McElree, B. (2006). Accessing recent events. In B. H. Ross (Ed.), The psychology of learning and motivation (Vol. 46, pp. 155-200). San Diego, CA: Academic Press.

McElree, B., Foraker, S., & Dyer, L. (2003). Memory structures that subserve sentence comprehension. Journal of Memory and Language, 48, 67-91.

Mesulam, M. M. (1998). From sensation to cognition. Brain, 121, 1013-1052.

Meyer, M., Steinhauer, K., Alter, K., Friederici, A. D., & von Cramon, D. Y. (2004). Brain activity varies with modulation of dynamic pitch variance in sentence melody. Brain and Language, 89, 277-289.

Novick, J. M., Trueswell, J. C., & Thompson-Schill, S. L. (2005). Cognitive control and parsing: Reexamining the role of Broca's area in sentence comprehension. Cognitive, Affective and Behavioral Neuroscience, 5, 263-281.

Pallier, C., Devauchelle, A. D., & Dehaene, S. (2011). Cortical representation of the constituent structure of sentences. Proceedings of the National Academy of Sciences, 108, 2522-2527.

Penke, M., & Wimmer, E. (2012). Irregularity in inflectional morphology - Where language deficits strike. In J. van der Auwera, T. Stolz, A. Urdze, & H. Otsuka (Eds.), Irregularity in morphology (and beyond) (pp. 101-123). Berlin: Akademie Verlag.

Pollard, C., & Sag, I. (1994). Head-driven phrase structure grammar. Chicago: University of Chicago Press.

Pulvermüller, F. (2010). Brain embodiment of syntax and grammar: Discrete combinatorial mechanisms spelt out in neuronal circuits. Brain and Language, 112, 167-179.

Pulvermüller, F., Shtyrov, Y., & Hauk, O. (2009). Understanding in an instant: Neurophysiological evidence for mechanistic language circuits in the brain. Brain and Language, 110, 81-94.

Rauschecker, J. P. (1998). Cortical processing of complex sounds. Current Opinion in Neurobiology, 8, 516-521.

Rauschecker, J. P., & Scott, S. K. (2009). Maps and streams in the auditory cortex: Nonhuman primates illuminate human speech processing. Nature Neuroscience, 12, 718-724.

Rauschecker, J. P., Tian, B., & Hauser, M. (1995). Processing of complex sounds in the macaque nonprimary auditory cortex. Science, 268, 111-114.

Rayner, K., & Clifton, C. J. (2009). Language processing in reading and speech perception is fast and incremental: Implications for event-related potential research. Biological Psychology, 80, 4-9.

Saur, D., Kreher, B. W., Schnell, S., Kümmerer, D., Kellmeyer, P., Vry, M.-S., et al. (2008). Ventral and dorsal pathways for language. Proceedings of the National Academy of Sciences, 105, 18035-18040.

Saxe, R. (2006). Uniquely human social cognition. Current Opinion in Neurobiology, 16, 235-239.

Schlesewsky, M., & Bornkessel-Schlesewsky, I. (2013). Computational primitives in syntax and possible brain correlates. In C. Boeckx & K. K. Grohmann (Eds.), The Cambridge Handbook of Biolinguistics (pp. 257-282). Cambridge: Cambridge University Press.

Scott, S., Blank, C., Rosen, S., & Wise, R. (2000). Identification of a pathway for intelligible speech in the left temporal lobe. Brain, 123, 2400-2406.

Scott, S. K., & Wise, R. J. S. (2004). The functional neuroanatomy of prelexical processing in speech perception. Cognition, 92, 13-45.

Sereno, S. C., & Rayner, K. (2003). Measuring word recognition in reading: Eye movements and event-related potentials. Trends in Cognitive Sciences, 7, 489-493.

Shetreet, E., Palti, D., Friedman, N., & Hadar, U. (2007). Cortical representation of verb processing in sentence comprehension: Number of complements, subcategorization and thematic frames. Cerebral Cortex, 17, 1958-1969.

Snijders, T. M., Petersson, K. M., & Hagoort, P. (2010). Effective connectivity of cortical and subcortical regions during unification of sentence structure. Neuroimage, 52, 1633-1644.

Snijders, T. M., Vosse, T., Kempen, G., van Berkum, J. J. A., Petersson, K. M., & Hagoort, P. (2009). Retrieval and unification of syntactic structure in sentence comprehension: An fMRI study using word category ambiguity. Cerebral Cortex, 19, 1493-1503.

Stowe, L. A., Broere, C., Paans, A., Wijers, A., Mulder, G., Vaalburg, W., et al. (1998). Localising components of a complex task: Sentence processing and working memory. Neuroreport, 9, 2995-2999.

Stowe, L. A., Haverkort, M., & Zwarts, F. (2005). Rethinking the neurological basis of language. Lingua, 115, 997-1042.

Thompson-Schill, S. L., Bedny, M., & Goldberg, R. F. (2005). The frontal lobes and the regulation of mental activity. Current Opinion in Neurobiology, 15, 219-224.

Thompson-Schill, S. L., D'Esposito, M., Aguirre, G. K., & Farah, M. J. (1997). Role of left inferior prefrontal cortex in retrieval of semantic knowledge: A reevaluation. Proceedings of the National Academy of Sciences USA, 94, 14792-14797.

Tian, B., Reser, D., Durham, A., Kustov, A., & Rauschecker, J. P. (2001). Functional specialization in rhesus monkey auditory cortex. Science, 292, 290-293.

Tomasello, M. (2003). Constructing a language: A usage-based theory of language acquisition. Cambridge, MA: Harvard University Press.

Ueno, T., Saito, S., Rogers, T. T., & Lambon Ralph, M. A. (2011). Lichtheim 2: Synthesizing aphasia and the neural basis of language in a neurocomputational model of the dual dorsal-ventral language pathways. Neuron, 72, 385-396.

Ullman, M. T. (2001). A neurocognitive perspective on language: The declarative/ procedural model. Nature Reviews Neuroscience, 2, 717-726.

Ullman, M. T. (2004). Contributions of memory circuits to language: The declarative/procedural model. Cognition, 92, 231-270.

Upadhyay, J., Silver, A., Knaus, T. A., Lindgren, K. A., Ducros, M., Kim, D. S., et al. (2008). Effective and structural connectivity in the human auditory cortex. The Journal of Neuroscience, 28, 3341-3349.

van de Meerendonk, N., Indefrey, P., Chwilla, D. J., & Kolk, H. H. J. (2011). Monitoring in language perception: Electrophysiological and hemodynamic responses to spelling violations. Neuroimage, 54, 2350-2363.

van den Brink, D., Brown, C. M., & Hagoort, P. (2001). Electrophysiological evidence for early contextual influences during spoken-word recognition: N200 versus N400 effects. Journal of Cognitive Neuroscience, 13, 967-985.

van den Brink, D., Brown, C. M., & Hagoort, P. (2006). The cascaded nature of lexical selection and integration in auditory sentence processing. Journal of Experimental Psychology: Learning, Memory, & Cognition, 32, 364-372.

van den Brink, D., & Hagoort, P. (2004). The influence of semantic and syntactic context constraints on lexical selection and integration in spoken-word comprehension as revealed by ERPs. Journal of Cognitive Neuroscience, 16, 1068-1084.

Van Dyke, J. A., & McElree, B. (2006). Retrieval interference in sentence comprehension. Journal of Memory and Language, 55, 157-166.

Van Valin, R. D., Jr. (2005). Exploring the syntax-semantics interface. Cambridge: Cambridge University Press.

Van Valin, R. D., Jr., & LaPolla, R. (1997). Syntax: Form, meaning and function. Cambridge: Cambridge University Press.

Vigliocco, G., Vinson, D. P., Druks, J., Barber, H., & Cappa, S. F. (2011). Nouns and verbs in the brain: A review of behavioural, electrophysiological, neuropsychological and imaging studies. Neuroscience and Biobehavioral Reviews, 35, 407-426.

Vosse, T., & Kempen, G. A. M. (2000). Syntactic assembly in human parsing: A computational model based on competitive inhibition and lexicalist grammar. Cognition, 75, 105-143.

Wagers, M., & McElree, B. (2011). Memory for linguistic features and the focus of attention: Evidence for the dynamics of agreement. Unpublished Manuscript, University of California Santa Cruz and New York University.

Wise, R. J. S., Scott, S. K., Blank, S. C., Mummery, C. J., Murphy, K., & Warburton, E. A. (2001). Separate neural subsystems within Wernicke's area. Brain, 124, 83-95.

Xu, J., Kemeny, S., Park, G., Frattali, C., & Braun, A. (2005). Language in context: Emergent features of word, sentence, and narrative comprehension. Neuroimage, 25, 1002-1015.

Ye, Z., Habets, B., Jansma, B. M., & Münte, T. F. (2011). Neural basis of linearization in speech production. Journal of Cognitive Neuroscience, 23, 3694-3702.

Zaehle, T., Geiser, E., Alter, K., Jancke, L., & Meyer, M. (2008). Segmental processing in the human auditory dorsal stream. Brain Research, 1220, 179-190.