Founded in 1966

Distinguished Lecturer Series

Language Modeling and Its Applications

Eugene Charniak

Brown University

Friday, January 13, 2006
10:30am - SENSQ 5317

Refreshments at 10:00am

Hosted by Jan Wiebe

Abstract

Parsing is the problem of mapping a sentence (in, say, English) to a phrase structure. It is important because it gives us a first rough cut at meaning. During the 1990s there was a flurry of new results using statistical techniques that gave us our first robust parsers ready for everyday use. While there have been continued results since then, the practical parsers at the start of 2005 were no better than what was available in 2000. The first part of the talk will recap this ancient history.

The last 12 months, however, have seen a dramatic turn-around, with error rates decreasing by 25%. The second and third parts of the talk describe the two techniques responsible for this state of affairs: discriminative reranking and self-training. We also show that the latest results seem to be less corpus-specific than the previous results. (That is, they carry over to text corpora reasonably different from those upon which they were trained.)

Finally, we discuss a new parsing paradigm, coarse-to-fine parsing, and present some starry-eyed proposals for radically different views of parsing.

Biography of Speaker

Eugene Charniak is Professor of Computer Science and Cognitive Science at Brown University. He received an A.B. degree in Physics from the University of Chicago and a Ph.D. in Computer Science from M.I.T. He has published four books: Computational Semantics, with Yorick Wilks (1976); Artificial Intelligence Programming (now in a second edition), with Chris Riesbeck, Drew McDermott, and James Meehan (1980, 1987); Introduction to Artificial Intelligence, with Drew McDermott (1985); and Statistical Language Learning (1993). He is a Fellow of the American Association for Artificial Intelligence and was previously a Councilor of the organization. His research has always been in the area of language understanding or technologies that relate to it, such as knowledge representation, reasoning under uncertainty, and learning. Over the last few years he has been interested in statistical techniques for language understanding. His research in this area has included work in the subareas of part-of-speech tagging, probabilistic context-free grammar induction, and, more recently, syntactic disambiguation through word statistics, efficient syntactic parsing, and lexical resource acquisition through statistical means.
