LSA.308: Computational Psycholinguistics
July 2007

Instructor: Roger Levy
Office: Margaret Jacks Hall (building 460), room 022
Office hours: Monday 3:30-4:30pm, Tuesday 5:30-6:30pm
Time: Tuesday and Friday 10:15am-12pm
Classroom: Margaret Jacks Hall (building 460), room 126
Email: rlevy@ling.ucsd.edu
Class webpage: http://ling.ucsd.edu/~rlevy/lsa308/index.html

This is the course website for the LSA Summer Institute course Computational Psycholinguistics at Stanford University. The course is a reading seminar covering a variety of computational modeling approaches to human language comprehension, production, acquisition, and representation. There is a strong emphasis on probabilistic approaches: at its core, processing natural language means constantly dealing with uncertainty, and probability theory is playing an increasingly large role in psycholinguistic research on how people deal with this uncertainty.

Prerequisites

You should have taken Mathematical Refresher for Computational Linguistics.

Requirements

The requirements for participation in this seminar are that you show up, participate in discussion, and (if you are taking the course for credit) turn in a few brief homework assignments given throughout the course.

Class schedule

This schedule is tentative and will almost certainly change at least somewhat. You are encouraged to suggest additional readings on the topics listed below, or on topics that are not listed but that interest you.
Date Topic & Reading Materials
Friday
6 July 2007
Introduction, history, and computational model of working memory
  • Yngve, V. (1960). A model and an hypothesis for language structure. In Proceedings of the American Philosophical Society, pages 444-466.
    [ .pdf ]
Slides
Tuesday
10 July 2007
Probabilistic sentence comprehension
  • Jurafsky, D. (1996). A probabilistic model of lexical and syntactic access and disambiguation. Cognitive Science, 20(2):137-194.
    [ http ]
Homework 1
Friday
13 July 2007
Surprisal-based sentence comprehension
  • Hale, J. (2001). A probabilistic Earley parser as a psycholinguistic model. In Proceedings of NAACL, volume 2, pages 159-166.
    [ .pdf ]
  • Levy, R. (2007). Expectation-based syntactic comprehension. Cognition. In press.
    [ .pdf ]
Handout
Homework 2
Tuesday
17 July 2007
Computational approaches to lexical access & word reading
  • Norris, D. (2006). The Bayesian reader: Explaining word recognition as an optimal Bayesian decision process. Psychological Review, 113(2):327-357.
    [ http ]
Handout
Homework 3
Friday
20 July 2007
Computational approaches to word segmentation/lexicon learning
  • Goldwater, S., Griffiths, T. L., and Johnson, M. (2007). Distributional cues to word segmentation: Context is important. In Proceedings of the 31st Boston University Conference on Language Development.
    [ .pdf ]
  • Frank, M. C., Goldwater, S., Mansinghka, V., Griffiths, T., and Tenenbaum, J. (2007). Modeling human performance in statistical word segmentation. In Proceedings of CogSci.
    [ .pdf ]
Homework 4
Tuesday
24 July 2007
Computational approaches to semantic acquisition
  • Landauer, T. K. and Dumais, S. T. (1997). A solution to Plato's problem: The Latent Semantic Analysis theory of acquisition, induction, and representation of knowledge. Psychological Review, 104(2):211-240.
    [ .pdf ]
  • Griffiths, T. L., Steyvers, M., and Tenenbaum, J. B. (2007). Topics in semantic representation. Psychological Review, 114(2):211-244.
    [ .pdf ]
Handout
Latent Semantic Analysis Slides
Earley Parsing Slides
Friday
27 July 2007
Probabilistic effects in sentence production
  • Bell, A., Jurafsky, D., Fosler-Lussier, E., Girand, C., Gregory, M., and Gildea, D. (2003). Effects of disfluencies, predictability, and utterance position on word form variation in English conversation. Journal of the Acoustical Society of America, 113(2):1001-1024.
  • Levy, R. and Jaeger, T. F. (2006). Speakers optimize information density through syntactic reduction. In Advances in Neural Information Processing Systems.
    [ .pdf ]

Readings and other references

You are expected to have done the listed readings before class and to be ready to discuss them!

Homework

Homework 1 (due 13 July 2007)
Homework 2 (due 17 July 2007)
Homework 3 (due 20 July 2007)
Homework 4 (due 24 July 2007)

Homework Solutions

Homework 1
Homework 2
Homework 3 (R code)
Homework 4

Software

Here is some related software that could be useful for investigating the models we'll cover in class:

A prefix probability parser, related to the section on information-theoretic models. To use this parser you will need to install Java (version 1.4 or later) on your computer.

The topic modeling toolbox that Griffiths, Steyvers, and Tenenbaum (2007) used.