The Gradient
The Gradient: Perspectives on AI
Tal Linzen: Psycholinguistics and Language Modeling

On syntax acquisition in language models, evaluation paradigms, LM representations, and what psycholinguistics and deep learning can do for each other.

In episode 93 of The Gradient Podcast, Daniel Bashir speaks to Professor Tal Linzen.

Professor Linzen is an Associate Professor of Linguistics and Data Science at New York University and a Research Scientist at Google. He directs the Computation and Psycholinguistics Lab, where he and his collaborators use behavioral experiments and computational methods to study how people learn and understand language. They also develop methods for evaluating, understanding, and improving computational systems for language processing.

Have suggestions for future podcast guests (or other feedback)? Let us know here or reach us at

Subscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSS
Follow The Gradient on Twitter


  • (00:00) Intro

  • (02:25) Prof. Linzen’s background

  • (05:37) Back and forth between psycholinguistics and deep learning research, LM evaluation

  • (08:40) How can deep learning successes/failures help us understand human language use, methodological concerns, comparing human representations to LM representations

  • (14:22) Behavioral capacities and degrees of freedom in representations

  • (16:40) How LMs are becoming less and less like humans

  • (19:25) Assessing LSTMs’ ability to learn syntax-sensitive dependencies

  • (22:48) Similarities between structure-sensitive dependencies, sophistication of syntactic representations

  • (25:30) RNNs implicitly implement tensor-product representations—vector representations of symbolic structures

  • (29:45) Representations required to solve certain tasks, difficulty of natural language

  • (33:25) Accelerating progress towards human-like linguistic generalization

    • (34:30) The pretraining-agnostic identically distributed (PAID) evaluation paradigm

    • (39:50) Ways to mitigate differences in evaluation

  • (44:20) Surprisal does not explain syntactic disambiguation difficulty

    • (45:00) How to measure processing difficulty, predictability and processing difficulty

    • (49:20) What other factors influence processing difficulty?

  • (53:10) How to plant trees in language models

    • (55:45) Architectural influences on generalizing knowledge of linguistic structure

    • (58:20) “Cognitively relevant regimes” and speed of generalization

    • (1:00:45) Acquisition of syntax and sampling simpler vs. more complex sentences

    • (1:04:03) Curriculum learning for progressively more complicated syntax

    • (1:05:35) Hypothesizing tree-structured representations

  • (1:08:00) Reflecting on a prediction from the past

  • (1:10:15) Goals and “the correct direction” in AI research

  • (1:14:04) Outro


Deeply researched, technical interviews with experts thinking about AI and technology. Hosted, recorded, researched, and produced by Daniel Bashir.