MPQA Releases - Corpus and Opinion Recognition System


MPQA Opinion Corpus
annotated for opinions and sentiments

The MPQA Opinion Corpus contains 535 news articles from a wide variety of news sources, manually annotated for opinions and other private states (e.g., beliefs, emotions, sentiments, and speculations). The corpus was initially collected and annotated as part of the summer 2002 NRRC Workshop on Multi-Perspective Question Answering (MPQA), sponsored by ARDA. To learn more about the subjectivity and sentiment research that produced MPQA, please visit Dr. Janyce Wiebe's page of related publications and the CERATOPS site.

To download the most recent version of the MPQA Corpus, click here.


OpinionFinder

OpinionFinder is a system that processes documents and automatically identifies subjective sentences as well as various aspects of subjectivity within sentences, including agents who are sources of opinion, direct subjective expressions and speech events, and sentiment expressions. OpinionFinder was developed by researchers at the University of Pittsburgh, Cornell University, and the University of Utah.

In addition to OpinionFinder, we are also releasing the automatic annotations produced by running OpinionFinder on a subset of the Penn Treebank.

To go to the OpinionFinder download page, click here.

Please note that OpinionFinder only runs on Linux.


Subjectivity Lexicon

The list of subjectivity clues (the subjectivity lexicon) that is part of OpinionFinder is also available for separate download. The lexicon was supported in part by NSF Grants IIS-0208798 and IIS-0208985. These clues were compiled from several sources (see the enclosed README) and were used in:

Theresa Wilson, Janyce Wiebe, and Paul Hoffmann (2005). Recognizing Contextual Polarity in Phrase-Level Sentiment Analysis. Proc. of HLT-EMNLP-2005.