Joel Tetreault
University of Rochester

Research Statement

My main area of interest is dialogue processing and building natural language understanding systems. Good natural language systems matter for many tasks: extracting and summarizing information from the web, and collaborating with users on tasks such as ordering airline tickets or allocating resources in a battle zone, are just a few of the applications that can benefit from natural language processing. However, what the human brain processes instinctively and instantaneously is tediously difficult for a computer.

A conversational system needs several modules just to process one sentence from a human speaker. First, a speech recognizer transforms the speech into text, which is passed to a parser that produces a semantic representation of the text. This output is then passed to several modules in an interpretation phase that determine the meaning and intention of what the user uttered. Once these are determined, a proper response is formulated (planning) and produced as speech.

My dissertation work has focused on the interpretation section of this chain, specifically reference resolution. In a spoken dialogue system, the job of a reference module is to identify noun phrases and resolve them to entities evoked in the dialogue. This involves finding antecedents for pronouns such as "that" or "they" and resolving definite noun phrases such as "the two hospitals" or "the ambulance here." Though reference is just one part of the overall interpretation of a sentence, it is an important one: failing to resolve the entities in a sentence correctly can lead to an incorrect interpretation and thus an erroneous response to the user. My work has focused on developing algorithms for resolving pronouns and implicit roles in spoken dialogue as well as written text.
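As a rough illustration of what a reference module does, the sketch below keeps a list of entities evoked so far and resolves a pronoun to the most recently mentioned entity that agrees in number. The `Entity` class, the `PRONOUN_NUMBER` table, and the recency-plus-agreement heuristic are assumptions made for illustration only; they are not the algorithms developed in this work.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Entity:
    """A discourse entity evoked by a noun phrase (hypothetical structure)."""
    text: str
    number: str  # "sg" or "pl"


# Illustrative agreement table (an assumption); "that" is treated as singular.
PRONOUN_NUMBER = {"it": "sg", "that": "sg", "they": "pl", "them": "pl"}


def resolve_pronoun(pronoun: str, history: List[Entity]) -> Optional[Entity]:
    """Return the most recent entity that agrees in number, or None."""
    number = PRONOUN_NUMBER.get(pronoun.lower())
    if number is None:
        return None  # pronoun not covered by the toy table
    for entity in reversed(history):  # most recent mention first
        if entity.number == number:
            return entity
    return None


# Entities in order of mention, using noun phrases from the text above.
history = [
    Entity("the two hospitals", "pl"),
    Entity("the ambulance", "sg"),
]
print(resolve_pronoun("they", history).text)
print(resolve_pronoun("that", history).text)
```

A heuristic this simple ignores syntax, semantics, and discourse structure entirely, which is exactly where the harder cases arise.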
My initial work used parsed Wall Street Journal texts as the test corpus. I found that by using the depth of nodes in the syntactic parse, the embeddedness of clauses, and the information status of entities, one could correctly resolve 80% of the pronouns in this large corpus, which makes the algorithm one of the best performing to date. Unfortunately, syntax-based methods can only help reference resolution so much. Closing the "20% gap" requires, among other things, tracking the context correctly, semantic information about potential antecedents and verbs, real-world knowledge and reasoning, and discourse structure. The second phase of my studies has focused on these additional information sources, which are often difficult to derive, and on incorporating them into syntax-based algorithms to improve performance. On the same corpus, I used Rhetorical Structure Theory to divide the text into discourse segments. Though this did not improve accuracy, it did offer a small speedup, since some entities were eliminated from the search.

My current work deals with human-human spoken dialogue, combining the algorithm I developed in the Wall Street Journal domain with semantic and discourse information. Spoken dialogue is much more complex than its written counterpart: people do not always speak in grammatically correct sentences, even though listeners can still understand them. Spoken sentences are also peppered with disfluencies and speech repairs, which make it difficult to get reliable parses. As a result, the algorithm I developed performs at only 44% on spoken dialogue. This shows the need for information beyond syntax. First, using a corpus parsed by a broad-coverage deep parser, I am able to use semantics as a filter on possible candidates. Second, discourse cues such as "so" and "then", as well as questions, offer clues as to where segments begin and end.
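The two information sources just described can be sketched roughly as follows: cue words and questions mark segment boundaries, and a semantic type restriction removes incompatible candidates. The function names, the type labels, the rule that a cue word opens a segment while a question closes one, and the candidate format are all hypothetical choices for illustration; the deep parser's actual output and the real algorithm are not shown here.

```python
from typing import Dict, List, Set

CUE_WORDS = {"so", "then"}  # the two cues named in the text


def starts_new_segment(utterance: str) -> bool:
    """Treat an utterance-initial cue word as opening a new segment."""
    words = utterance.lower().split()
    return bool(words) and words[0].strip(",.?!") in CUE_WORDS


def segment(utterances: List[str]) -> List[List[str]]:
    """Group utterances into segments (assumed boundary rules, see lead-in)."""
    segments: List[List[str]] = []
    current: List[str] = []
    for utt in utterances:
        # A cue word closes the running segment and opens a new one.
        if current and starts_new_segment(utt):
            segments.append(current)
            current = []
        current.append(utt)
        # Assumption: a question ends the segment it belongs to.
        if utt.rstrip().endswith("?"):
            segments.append(current)
            current = []
    if current:
        segments.append(current)
    return segments


def semantic_filter(candidates: List[Dict[str, str]],
                    allowed: Set[str]) -> List[Dict[str, str]]:
    """Keep only candidates whose semantic type is allowed for the role."""
    return [c for c in candidates if c["type"] in allowed]


dialogue = [
    "send the ambulance to the hospital",
    "then load the patient",
    "where is the truck?",
    "so it is at the station",
]
candidates = [
    {"text": "the ambulance", "type": "VEHICLE"},
    {"text": "the hospital", "type": "LOCATION"},
]
print(len(segment(dialogue)))
print(semantic_filter(candidates, {"VEHICLE"}))
```

Restricting the antecedent search to the current segment and to type-compatible candidates shrinks the candidate set before any syntax-based ranking is applied.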
Incorporating these two pieces of information into the algorithm boosts performance significantly, to close to 60%, which is the best value reported to date in the literature. This algorithm is now being used in the TRIPS dialogue system.

As stated earlier, my work has focused mainly on developing algorithms that make spoken dialogue systems more reliable. I have also explored using reference resolution to help in two other areas of natural language processing: incremental parsing and semantic role labeling.

Most dialogue systems process user input in a post-processing manner; that is, a stream of words is processed in its entirety only after the stream is complete. Humans, however, process a stream word by word rather than waiting until the whole sentence has been uttered. This incremental understanding is best seen in how people start tasks or focus on objects as soon as they are mentioned, and in how they can interrupt midway through a stream to ask for clarification. Much work has shown that parsing a sentence incrementally can actually improve parser accuracy and speed, since parses can be pruned as the sentence is processed. Reference resolution is useful for this task because constituents produced by the parser can be verified by consulting a reference oracle, which maintains a discourse history and context and checks whether the entity proposed by the parser exists or makes sense given the current state of the dialogue. By rejecting implausible constituents, the parser saves time by not exploring erroneous paths. Incremental parsing is good for dialogue systems not only because of the speedup but also because it allows a more natural interaction with the user.

Another area is semantic role labeling. In this task, the goal is to label the correct sense of a verb and its roles given the noun phrases in the rest of the sentence. Most models label using statistical methods trained on large corpora.
In my work, the corpus consists of newspaper articles that contain hundreds of instances of pronouns. Leaving them unresolved leads to skewed training data.

In addition to reference and discourse, I have also worked in information retrieval. The project involved developing algorithms to detect affect in transcribed conversations between married couples. Psychologists at Strong Memorial Hospital in Rochester, NY used a tool based on the tagging methods developed to annotate their data much more quickly, and sometimes more reliably, than if they had done it themselves. They used the tagging system to determine correlations between marital satisfaction and coping with serious illness. The work helps them better tailor therapy for couples. I presented the work at this year's AAAI Spring Symposium and am interested in doing more research in this area.

In a dialogue system, the reference resolution module is one of several that work together in the interpretation phase. Over the next five years, I would like to expand my work to encompass these areas, as well as continue refining reference resolution and using it to aid in other natural language problems. I am especially interested in discourse analysis, in particular detecting discourse segments automatically through cues and through inferencing procedures. This work, along with reference resolution, can benefit other work such as text summarization and spoken dialogue systems.