Article information

  • Title:Utilization of External Knowledge for Personal Name Disambiguation
  • Authors:Quang Minh VU ; Atsuhiro TAKASU ; Jun ADACHI
  • Journal:Progress in Informatics
  • Print ISSN:1349-8614
  • Electronic ISSN:1349-8606
  • Year of publication:2009
  • Issue:06
  • DOI:10.2201/NiiPi.2009.6.3
  • Publisher:National Institute of Informatics
  • Abstract:

    The amount of information on the World Wide Web (WWW) is increasing at an explosive rate, and the role of computer systems in processing such a huge amount of data has become crucial. In this paper, we focus on the name disambiguation problem when searching for people, because information about people is an important part of the web and improvements to personal information may benefit many web citizens. The name ambiguity problem occurs frequently when searching for people, because a name may be shared by several people. In this research, we use external knowledge while solving this problem, so that we can analyze information in web documents more easily. We collect web directories and use the latent Dirichlet allocation method to extract latent topics from web directories. The extracted topics are used to modify the search result documents so that important contexts that help to discriminate people can be recognized more easily. We carried out experiments with real web documents and verified the advantages of our approach over other disambiguation approaches that use the vector space model and named entity recognition methods.

  • Keywords:Personal name disambiguation; knowledge base; latent Dirichlet allocation; latent topic extraction; document similarity
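
The abstract says that topics extracted from web directories are used to modify search result documents so that discriminating contexts stand out. The abstract does not say how the documents are modified, so the following is only a minimal sketch of one possible reading: each document's word-count vector is extended with weighted topic features before cosine similarity is computed. The topic table `TOPICS`, the function names `augment` and `cosine`, the `topic_weight` parameter, and the three toy documents are all hypothetical. The topic weights are hard-coded here and merely stand in for what latent Dirichlet allocation would estimate.

```python
import math
from collections import Counter

# Hypothetical topic-word weights. In the approach described above these would
# come from LDA run over web directories; here they are hard-coded toy values.
TOPICS = {
    "music": {"guitar": 0.5, "band": 0.3, "album": 0.2},
    "physics": {"quantum": 0.5, "particle": 0.3, "laboratory": 0.2},
}


def augment(tokens, topic_weight=2.0):
    """Return a sparse vector of word counts plus weighted topic features.

    With topic_weight=0 this reduces to a plain bag-of-words vector.
    """
    counts = Counter(tokens)
    vec = {f"word:{t}": float(c) for t, c in counts.items()}
    for name, words in TOPICS.items():
        # Topic score: topic-word weights summed over the words in the document.
        score = sum(w * counts[t] for t, w in words.items())
        if score > 0 and topic_weight > 0:
            vec[f"topic:{name}"] = topic_weight * score
    return vec


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Three toy search results for the same (hypothetical) name.
doc_a = "john smith plays guitar".split()
doc_b = "john smith released album with band".split()
doc_c = "john smith quantum particle laboratory".split()

plain_ab = cosine(augment(doc_a, 0.0), augment(doc_b, 0.0))
plain_ac = cosine(augment(doc_a, 0.0), augment(doc_c, 0.0))
topic_ab = cosine(augment(doc_a), augment(doc_b))
topic_ac = cosine(augment(doc_a), augment(doc_c))

print("plain words:  a~b > a~c ?", plain_ab > plain_ac)
print("with topics:  a~b > a~c ?", topic_ab > topic_ac)
```

On this toy data, plain word vectors rate the guitar page as closer to the physics page than to the album page, because only the shared name tokens overlap and the physics page is shorter. Once the shared "music" topic feature is added, the two music pages become the closer pair. This illustrates the general idea only and makes no claim about the paper's actual procedure or results.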