Abstract: The various meanings of discourse connectives such as while and however are difficult to
identify and annotate, even for trained human annotators. This problem is all the more
important since connectives are salient textual markers of cohesion and need to be
correctly interpreted for many Natural Language Processing applications. In this paper, we
propose an alternative route to a reliable annotation of connectives, one that uses
the information provided by their translations in large parallel corpora. This method thus
replaces the difficult explicit reasoning involved in traditional sense annotation with an
empirical clustering of the senses emerging from the translations. We argue that this
method has the advantage of providing more reliable reference data than traditional sense
annotation.