Research Spotlight: Rebecca Hwa, Assistant Professor
Congratulations to Assistant Professor Rebecca Hwa on receiving an award from the National Science Foundation's Faculty Early Career Development (CAREER) Program. The CAREER Program is a Foundation-wide activity that offers the National Science Foundation's most prestigious awards in support of the early career-development activities of those teacher-scholars who most effectively integrate research and education within the context of the mission of their organization.
Title: "CAREER: Robust Parsing for New Domains and Languages."
Abstract: To facilitate linguistic communications, natural language processing (NLP) technologies must be applicable to different languages across different domains. A limitation of many NLP systems is that they do not perform as well on data types that diverge from their training examples. The objective of this project is to increase the robustness and coverage of a fundamental NLP component, the syntactic parser. Specifically, this project explores adaptation methods to extend a standard English parser for processing different domains (e.g., scientific literature, emails) and different languages (e.g., Chinese). Three types of correspondences are considered. First, if coarse-level correspondences are explicit in the data (e.g., bilingual documents), finer-grained correspondences at the word- or phrasal-level may be inferred and semi-supervised learning may be used to transfer domain knowledge across the inferred correspondence. Second, if the correspondences are inexact (e.g., multiple translations of varying quality), the mismatched portions may be identified and transformed to achieve a closer mapping. Third, if the correspondences are indirect, methods for inducing correspondences from non-parallel corpora may be appropriate. Parser adaptation stands to increase the range of NLP applications; examples include: data mining from medical documents and automatic tutoring for non-English speakers. As the project aims to bring together several strands of research, it offers ample research opportunities to graduate and undergraduate students. The algorithmic aspects encourage forming synthesis from areas of semi-supervised learning, relational data modeling, grammar induction, and machine translation; the empirical aspects afford students an arena to hone their skills in good scientific methodologies.
Read the Pitt Chronicle article, July 21, 2008.
Read the University of Pittsburgh press release, May 27, 2008.