
Article Information

  • Title: Fingerprinting Lexical Contexts over the Web
  • Authors: Vincenzo Di Lecce; Marco Calabrese; Domenico Soldo
  • Journal: Journal of Universal Computer Science
  • Print ISSN: 0948-6968
  • Year of publication: 2009
  • Volume: 15
  • Issue: 4
  • Pages: 805-825
  • Publisher: Graz University of Technology and Know-Center
  • Abstract: In this paper a novel technique for identifying lexical contexts in web resources is presented. The basic idea is to consider web site anchortexts as lexicalized descriptions of an individual ontology organized in the form of a graph of concept words. In the search for peculiar semantic patterns, the concept of web minutia (transposed from the forensic domain) is introduced. The proposed technique consists of searching for web minutiae in the analyzed web sites by means of a golden ontology. Web minutiae act as fingerprints for context-specific web resources; in this sense they are a powerful computational tool to identify and categorize the Web. The WordNet database has been used as the golden ontology for our experiments on English web documents. WordNet allows for indexing and retrieving word senses and inter-word taxonomical relations like hyponymy and hypernymy. It has proven to be an efficient mediator between web ontologies and context-dependent taxonomies. Our experiments have been carried out on a preliminary data set of several tens of thousands of links taken from the web sites of thirteen UK universities. Preliminary results seem to confirm the ability of web minutiae to identify lexical contexts across the Web.
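
The abstract does not define a web minutia precisely, so the following is only a minimal sketch of the general idea it describes: treating anchortexts as a graph of concept words and checking links against a reference ontology for hypernym/hyponym relations. The dictionary `TOY_HYPERNYMS`, the functions `hypernym_chain`, `relation`, and `taxonomic_links`, and all sample words are hypothetical stand-ins; they are not WordNet data and not the authors' method.

```python
# Toy stand-in for a golden ontology: word -> its direct hypernym.
# All entries are illustrative assumptions, not taken from WordNet or the paper.
TOY_HYPERNYMS = {
    "undergraduate": "student",
    "postgraduate": "student",
    "student": "person",
    "lecturer": "academic",
    "academic": "person",
    "library": "building",
}


def hypernym_chain(word):
    """Return the hypernyms of `word` in the toy ontology, nearest first."""
    chain = []
    seen = {word}
    while word in TOY_HYPERNYMS:
        word = TOY_HYPERNYMS[word]
        if word in seen:  # guard against accidental cycles in the toy data
            break
        seen.add(word)
        chain.append(word)
    return chain


def relation(a, b):
    """Describe how `a` relates to `b`: 'hyponym', 'hypernym', or None."""
    if b in hypernym_chain(a):
        return "hyponym"   # a is a more specific term than b
    if a in hypernym_chain(b):
        return "hypernym"  # a is a more general term than b
    return None


def taxonomic_links(links):
    """Keep only (source, target) anchortext pairs that are taxonomically related.

    `links` is a list of (source anchortext, target anchortext) pairs,
    i.e. edges of the anchortext graph.
    """
    result = []
    for source, target in links:
        rel = relation(source, target)
        if rel is not None:
            result.append((source, target, rel))
    return result


sample_links = [
    ("student", "undergraduate"),
    ("library", "student"),
    ("undergraduate", "person"),
]
matches = taxonomic_links(sample_links)
```

In this sketch, links whose endpoints have no taxonomic relation in the toy ontology are simply dropped; how the paper actually scores or classifies such patterns is not stated in the abstract.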