Abstract
Annotating coherence relations is a difficult task that requires detailed annotation schemes and well-trained annotators. Existing discourse-annotated corpora such as the Penn Discourse Treebank, RST Treebank, and the TüBa-D/Z corpus all have different annotation manuals: these corpora differ not only in the types of relations they distinguish, but also in their segmentation rules and even in their definition of what constitutes a coherence relation. A further problem is that coherence relations can, but need not, be made linguistically explicit by means of connectives (because, if) or cue phrases (as a result, despite the fact that). The absence of a connective seems to introduce additional complications to the annotation process. Implicit coherence relations leave annotators with less evidence pointing toward a particular relation, and locating a coherence relation itself becomes a potential source of disagreement. Across the different discourse-annotated corpora, there is even less consensus on how to locate and annotate implicit relations than explicit relations.
In this presentation, we argue that parallel corpora are useful tools for locating, annotating, and researching the characteristics of implicit coherence relations. We used directional corpora extracted from the Europarl corpus (Koehn 2005; Cartoni, Zufferey & Meyer 2013a) and manually spotted cases of implicit translations using the translation spotting method (Cartoni, Zufferey & Meyer 2013b) across four target languages (French, German, Dutch and Spanish). Conversely, we spotted implicit relations in English source texts that were explicitated in (one of) the target texts. Finally, we spotted explicitations and implicitations of the English connectives in translated texts from the same four languages, now functioning as the source languages. We then annotated the English discourse relations using the set of basic features defined by Sanders, Spooren & Noordman (1992). Our results indicate that the basic features of coherence relations conveyed by connectives help to predict their explicit vs. implicit translation across languages.
Original language | English |
---|---|
Publication status | Published - 27 Jan 2015 |
Event | TextLink First Action Conference - Louvain-la-Neuve, Belgium. Duration: 26 Jan 2015 → 28 Jan 2015 |
Conference
Conference | TextLink First Action Conference |
---|---|
Country/Territory | Belgium |
City | Louvain-la-Neuve |
Period | 26/01/15 → 28/01/15 |