Abstract: Linked data has been widely recognized as an important paradigm for
representing data, and one of the most important aspects of supporting its use is the
discovery of links between datasets. Many datasets carry a significant amount
of textual information in the form of labels, descriptions and documentation about
their elements, and precise linking rests on applying semantic textual similarity
to this text. However, most linking tools so far rely only on simple string
similarity metrics such as Jaccard scores. We present
an evaluation of some metrics that have performed well in recent semantic textual
similarity evaluations and apply these to linking existing datasets.