Research

I work with faculty and students within the Linguistics department and across the Language Science Center (LSC) and the Neuroscience and Cognitive Science Program (NACS). We have a 32-channel Neuroscan EEG lab located in the department, a 157-axial-gradiometer KIT MEG system located in a magnetically quiet lab at the Maryland Neuroimaging Center (MNC) on campus (which I co-direct with Jonathan Simon), and a 3T MRI scanner also located at the MNC.

Current Work

Getting rid of ‘words’ and traditional ‘morphemes’ in psychological models. Many psycholinguistic models take it as a default assumption that ‘lexical’ knowledge of your language consists of a dictionary of ‘words’ or ‘lexical items’ that map a conceptual unit to a syntactic unit to a form unit. Comprehension or production processes are therefore often conceived as having two components: ‘accessing’ these lexical items, and encoding relations between them. But these assumptions about the architecture of linguistic knowledge were developed by researchers who mainly worked on English and a small set of other Indo-European languages, and who were further biased by the white-space segmentation used in those writing systems. When one looks across a broader range of languages, the notion of a single domain for stored meaning, syntax, and form appears wholly unsustainable (Haspelmath, 2017). If we instead take language production or comprehension to be a process of translating between three structured representations (meaning, syntax, phonology), each with its own native primitives and relational structure, we can expect that stored linguistic knowledge will consist of separate sets of mappings from meaning to syntax and from syntax to phonological form, and that these stored mappings can hold between complex relational structures and not just atoms. We are working through the consequences that this ‘non-lexicalist’ architecture has for models of production (Krauska & Lau, 2023) and comprehension (Cuonzo et al., submitted; Yu et al., submitted), and some of our MEG work provides suggestive evidence that comprehension of isolated ‘words’ is qualitatively different from comprehension of words in connected speech (Gaston et al., 2023).
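To make the architectural contrast concrete, here is a minimal sketch in code (all names and structures here are my own hypothetical illustration, not a claim about any implemented model): one store maps meanings to syntactic structures, a separate store maps syntactic atoms to forms, and either mapping can pair complex structures rather than word-sized atoms, as when a single concept maps to a whole idiom.

    # A toy 'non-lexicalist' store: two separate sets of mappings, each able
    # to relate complex structures, instead of one lexicon of word-sized triples.
    from dataclasses import dataclass

    @dataclass(frozen=True)
    class Tree:                     # stands in for any relational structure
        label: str
        children: tuple = ()

    # meaning -> syntax: one concept can map to a whole syntactic structure
    meaning_to_syntax = {
        "DIE": Tree("VP", (Tree("V", (Tree("kick"),)),
                           Tree("DP", (Tree("the"), Tree("bucket"))))),
    }

    # syntax -> phonological form: a separate store with its own primitives
    syntax_to_form = {"kick": "/kɪk/", "the": "/ðə/", "bucket": "/ˈbʌkɪt/"}

    def spell_out(node):
        """Production as translation: walk the syntax, look up forms."""
        if not node.children:
            return syntax_to_form.get(node.label, "")
        return " ".join(spell_out(c) for c in node.children)

    spell_out(meaning_to_syntax["DIE"])   # -> '/kɪk/ /ðə/ /ˈbʌkɪt/'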

Noun meanings, and conceptual structure. After a long time trying to understand how to cognitively model reference, I finally learned about what are sometimes called ‘sortalist’ approaches, from people like Peter Geach (1963) and John Macnamara (1982). These approaches recognize that nouns seem to be special linguistic devices for dealing with reference to individuals. That may sound obvious, but it contrasts with dominant approaches in linguistic semantics and psychology, which group together nouns and adjectives (and/or their corresponding concepts) as ‘predicates’, ‘properties’, or ‘features’. I care because I don’t think we can do good work on cognitive models of interpretation until we fix this central part of the semantic theory. Sandeep Prasada’s work in psychology on the non-linguistic conceptual data structures for representing kinds and their instances is a great starting point, and it should guide our theories about what nouns mean and why they behave the way they do in language (see also Mark Baker’s Lexical Categories for a nice account of noun syntax along these general lines).
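As a toy illustration of the distinction (my own sketch, not Geach’s, Macnamara’s, or Prasada’s formalism): a sortal supplies principles of individuation and identity, so counting and re-identifying individuals are well-posed for it; a bare predicate only classifies individuals that some sortal has already carved out.

    class Sortal:
        """A kind concept: individuates, counts, and re-identifies instances."""
        def __init__(self, name):
            self.name = name
            self.instances = []
        def individuate(self):          # carve out a new individual of this kind
            x = {"kind": self.name}
            self.instances.append(x)
            return x
        def same_one(self, a, b):       # criterion of identity for this kind
            return a is b
        def count(self):                # 'how many dogs?' is well-posed
            return len(self.instances)

    def brown(x):                       # a bare predicate: it only classifies
        return x.get("color") == "brown"

    DOG = Sortal("dog")
    fido, rex = DOG.individuate(), DOG.individuate()
    fido["color"] = "brown"
    DOG.count()                 # 2
    DOG.same_one(fido, rex)     # False
    brown(fido)                 # True; but 'how many brown?' needs a sortal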

Language and longer-term knowledge acquisition. In the last few years I have been developing the hypothesis that the subject-predicate organization of natural language sentences was shaped by the need to map to the data structure of the hippocampus, a key structure for a certain kind of knowledge acquisition that originally evolved to encode locations in space for goal-directed navigation. I have an early, quixotic manuscript outlining this hypothesis, and I am working on a better version. The idea relates to beautiful early papers on subject and topic by Yuki Kuroda and Tanya Reinhart, as well as relatively unknown work by Leslie McPherson.

Short-term referential indexes across sentences and scenes. The role of the angular gyrus in language comprehension has long been a puzzle, although it frequently ‘lights up’ when people are comprehending sentences or phrases (e.g. Matchin et al., 2019). I suspect that this activity includes a limited-capacity working memory system for referential indexes, just like the famous ‘object file’ system that supports visual working memory for scenes, in neighboring inferior parietal cortex. These inferior parietal indexes point to conceptual properties encoded in the temporal lobes, and they play a crucial role in working memory computations over referents in both vision and language, as well as ensuring appropriate episodic memory updating. We also suspect that some of the sustained negativities observed in EEG language comprehension studies (such as the SAN and the NRef) may be related to the sustained negativities associated with object indexes in visual working memory studies (see Cruz Heredia et al. (2021) for some speculation in this direction). We’re beginning to investigate how and when events and entities get indexed across the course of a sentence, and the extent to which these circuits are fully shared across language and vision (Yu & Lau, 2023).
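A loose computational sketch of the proposal (the names and the exact capacity are my assumptions for illustration): a small fixed pool of indexes, each a pointer to conceptual content stored elsewhere, shared across linguistic referents and visual object files.

    CAPACITY = 4   # assumed: roughly the 3-4 item limit of visual working memory

    class IndexStore:
        """Toy parietal index pool; 'concepts' stands in for temporal-lobe content."""
        def __init__(self):
            self.indexes = {}                # referent id -> pointer
            self.concepts = {}               # content lives here, not in the index

        def assign(self, ref_id, properties):
            if len(self.indexes) >= CAPACITY:
                raise RuntimeError("no free indexes: the store is capacity-limited")
            self.concepts[ref_id] = dict(properties)
            self.indexes[ref_id] = ref_id    # a pointer, not a copy of the content

        def update(self, ref_id, **new_properties):
            # updating goes through the index; the same circuit could serve
            # a noun phrase in a sentence or an object file in a scene
            self.concepts[self.indexes[ref_id]].update(new_properties)

    wm = IndexStore()
    wm.assign("waitress", {"kind": "waitress"})
    wm.update("waitress", holding="coffee")   # '...She picked up the coffee.'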

Neural basis of syntactic representation. An unsolved problem for cognitive neuroscience is how the brain encodes hierarchical relationships of the kind observed in even simple sentences of human language (see review in Lau (2018)). In 2018 I was awarded several years of NSF funding to investigate the contribution of sustained neural activity to syntactic representation with EEG, MEG, and fMRI, which yielded a set of exploratory but interesting findings: Matchin et al. (2018), Lau & Liao (2018), Cruz Heredia et al. (2021). We are now beginning to explore whether ERP measures of simple phrase production may provide a better means of isolating syntactic representations. In the course of this project, though, we came to recognize several major sociological obstacles to progress on the neural basis of syntax. One is the dominance of standard psychological theories of the ‘lexicon’, as discussed above; if your theory of the parts is off, it’s going to ripple out and mess up your investigation of the relations between the parts. The other is that a neural-net architecture has been widely assumed across neuroscience for the last 80 years, and neural-net architectures definitionally cannot execute procedural operations on structured representations of the kind we see in humans doing arithmetic, language production, or language comprehension. This means that to investigate the neural basis of syntax properly, you would need to upend and replace the foundations of modern neuroscience (see the last note below), and that is a pretty tall order.
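For concreteness, here is the kind of structure-sensitive procedure at issue (a toy example of mine, not a model of any particular brain circuit): operations that build and traverse unboundedly deep hierarchical objects, following the structure itself rather than a fixed set of feature slots.

    def merge(left, right):
        """Build a binary-branching constituent."""
        return (left, right)

    def substitute(tree, target, replacement):
        """Replace a constituent wherever it occurs in the hierarchy."""
        if tree == target:
            return replacement
        if isinstance(tree, tuple):
            return tuple(substitute(t, target, replacement) for t in tree)
        return tree

    s = merge("the", merge("dog", merge("that", merge("barked", "loudly"))))
    substitute(s, ("barked", "loudly"), ("slept", "soundly"))
    # -> ('the', ('dog', ('that', ('slept', 'soundly'))))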

Predictive mechanisms in language comprehension. I worked for many years to better understand the generators of an extremely robust and reliable ERP response known as ‘the N400 effect’, so that I could use it as a more precise tool for getting inside the ‘black box’ of real-time language comprehension. It turns out that this measure is not a great indicator of the process of evaluating real-world plausibility, but it is a very sensitive indicator of lexical or (more likely) conceptual predictability (Lau et al. (2016), Lau et al. (2013), Lau et al. (2008)). Note that nowadays I no longer believe that the N400 primarily reflects ‘activation’ of concepts or lexical items; I suspect it rather reflects information transmission between units needed to update estimates of the cause of the sensory data and make those estimates available for further computations. As I have come to better appreciate the extremely different parsing problems posed by text and speech, I am now trying to shift more of my comprehension work over to speech. In collaboration with Jonathan Simon’s lab, we have used MEG temporal response function analysis methods to show how predictive models of speech based on both global and local context can coexist in different brain regions and contribute to speech perception in parallel (Brodbeck et al. (2022)).
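For readers unfamiliar with the method, here is a deliberately simplified sketch of TRF estimation (Brodbeck et al. use a different, boosting-based implementation; this ridge-regression version and all the variable names are just my illustration): the neural signal is regressed on time-lagged copies of each predictor, so global-context and local-context predictors each receive their own response function within the same model.

    import numpy as np

    def lagged(x, n_lags):
        """Design matrix of time-lagged copies of a predictor time series."""
        X = np.zeros((len(x), n_lags))
        for k in range(n_lags):
            X[k:, k] = x[:len(x) - k]
        return X

    def fit_trfs(predictors, y, n_lags=40, ridge=1.0):
        """Ridge regression gives one TRF (one row) per competing predictor."""
        X = np.hstack([lagged(p, n_lags) for p in predictors])
        w = np.linalg.solve(X.T @ X + ridge * np.eye(X.shape[1]), X.T @ y)
        return w.reshape(len(predictors), n_lags)

    rng = np.random.default_rng(0)
    global_surprisal = rng.random(5000)    # hypothetical predictor time series
    local_surprisal = rng.random(5000)
    meg = rng.standard_normal(5000)        # stand-in for one MEG channel
    trfs = fit_trfs([global_surprisal, local_surprisal], meg)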

Linguistic knowledge and processing in late second language acquisition. In collaboration with the larger Language Science community here, some of my work has investigated the problem of real-time comprehension in a late-learned second language. Much of this work, led by Eric Pelzl, used lexical tone as a case study to ask how and why stored ‘lexical’ mappings for features like tone could be constrained by early language experience, and how online processing may fail to access knowledge that late learners do display on offline tests (Pelzl et al. (2018), Pelzl et al. (2020), Pelzl et al. (2021)). Nowadays I can see new connections between this earlier work and our current interest in getting away from simplistic notions of ‘words’; lexical tone is one of the many phenomena that belie an atomistic approach to stored phonological wordforms as collections of phonemes, and force us to consider seriously what type of structured relations they represent (as in Bill Idsardi’s ‘phonological graph’ PFE approach).
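As a rough illustration of what a more structured stored form could look like (my own loose sketch of the general idea, not Idsardi’s actual formalism): a wordform can be represented as a graph in which tone is a node on its own tier, associated with segments by edges, rather than another symbol inserted into a phoneme string.

    def tone_word(segments, tone):
        """A wordform as a set of labeled edges rather than a phoneme string."""
        edges = [(a, "precedes", b) for a, b in zip(segments, segments[1:])]
        edges += [(tone, "associates-with", s) for s in segments]  # tonal tier
        return edges

    ma1 = tone_word(["m", "a"], "T1")   # mā 'mother'
    ma3 = tone_word(["m", "a"], "T3")   # mǎ 'horse': same segments, distinct graph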

Finally, I believe that Randy Gallistel, Hessam Akhlaghpour, and Sam Gershman are correct that much of the information storage in the brain is done discretely inside single cells, and that neural spiking is a code for transmitting this information between units, not the representation itself. Hearkening back to Ramón y Cajal’s neuron doctrine, we need to stop thinking about the brain as a unified ‘net’ of dumb automaton units, and instead think about each neuron as an independent organism (as it originally was, evolutionarily), running its own computations and storing and exchanging its results with others. The faster we all work to advance that conceptual revolution in neuroscience, the better our chances of correctly interpreting cogneuro data in our lifetimes (see this post).