Rules for Annotating Agent Spans

  1. Every unique agent referred to in the text should be assigned ONLY ONE identifier. In other words, of all the agent spans in a document that refer to the same agent, say the U.S. human rights report, only one will carry the id feature.

  2. Note that this policy is different from that for targets: if the same entity occurs as a target multiple times in a text, it will be assigned a unique id on each occasion.

  3. Agent ids are case sensitive! If you give an agent an id=AbCdEf, then you must type AbCdEf, with exactly that capitalization, every time you reference that agent in a nested-source, nested-target, etc.

  4. The id feature should be assigned to the first descriptive reference to the agent. Identifying this reference is usually clear-cut, but it can be harder when the information that identifies the agent is spread over several mentions. Consider this example:

    1. So much for President Bush's effort to repair his legacy on global warming — at least when it comes to one German official with a flair for sloganeering.

      In a statement released today, Environment Minister Sigmar Gabriel described Mr. Bush's speech on Wednesday as ``disappointing.''

    In the second sentence, where the DSE ``described'' occurs, the relevant agent phrase is ``Environment Minister Sigmar Gabriel''. The question is whether the earlier reference to ``one German official with a flair for sloganeering'' should count as the first descriptive reference. Here it seems acceptable to treat only the second mention, where the person is identified by office and name, as fully descriptive. The mention in the first sentence would thus not have to be marked as an agent.

  5. When annotating a span of text that references an agent, label the entire noun phrase that is part of the reference. Thus, in the previous example, mark ``Environment Minister Sigmar Gabriel'' rather than only ``Sigmar Gabriel''.
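
  The id rules above (one id-bearing span per agent, case-sensitive references) lend themselves to a mechanical check. The following is a minimal sketch only: the dictionary layout of the spans, the list-of-lists form of the nested-source chains, and the function name check_agent_ids are all hypothetical and do not reflect the annotation tool's actual file format. It can only detect an id reused on several spans and references that match no declared id; it cannot tell whether two different ids were given to the same real-world agent.

```python
from collections import Counter


def check_agent_ids(agent_spans, nested_sources):
    """Return a list of problems found in one document.

    agent_spans: list of dicts with a 'text' key and an optional 'id' key
                 (hypothetical layout, not the tool's real format).
    nested_sources: list of id chains, each a list of agent ids that some
                    other annotation refers to (also hypothetical).
    """
    problems = []
    ids = [span["id"] for span in agent_spans if span.get("id")]

    # Rule 1: only one agent span may carry a given id.
    for agent_id, count in Counter(ids).items():
        if count > 1:
            problems.append(f"id {agent_id!r} is assigned to {count} agent spans")

    # Rule 3: a reference must match a declared id exactly, including case.
    declared = set(ids)
    by_lowercase = {agent_id.lower(): agent_id for agent_id in declared}
    for chain in nested_sources:
        for ref in chain:
            if ref in declared:
                continue
            near_miss = by_lowercase.get(ref.lower())
            if near_miss is not None:
                problems.append(
                    f"reference {ref!r} differs in case from id {near_miss!r}"
                )
            else:
                problems.append(f"reference {ref!r} matches no agent id")
    return problems


if __name__ == "__main__":
    spans = [
        {"text": "Environment Minister Sigmar Gabriel", "id": "Gabriel"},
        {"text": "Mr. Gabriel"},  # later mention of the same agent: no id
    ]
    for problem in check_agent_ids(spans, [["gabriel"]]):
        print(problem)
```

  In this sketch a later mention of an agent simply omits the id, which mirrors rule 1; a reference typed as gabriel instead of Gabriel is reported as a case mismatch rather than silently accepted.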



J. Ruppenhofer