Article Information

  • Title: Utilization of external knowledge for personal name disambiguation
  • Authors: Quang Minh VU; Atsuhiro TAKASU; Jun ADACHI
  • Journal: Progress in Informatics
  • Print ISSN: 1349-8614
  • Online ISSN: 1349-8606
  • Publication year: 2009
  • Issue: 6
  • Pages: 15-25
  • DOI: 10.2201/NiiPi.2009.6.3
  • Publisher: National Institute of Informatics
  • Abstract: The amount of information on the World Wide Web (WWW) is increasing at an explosive rate, and the role of computer systems in processing such a huge amount of data has become crucial. In this paper, we focus on the name disambiguation problem when searching for people, because information about people is an important part of the web and improvements to personal information may benefit many web citizens. The name ambiguity problem occurs frequently when searching for people, because a name may be shared by several people. In this research, we use external knowledge while solving this problem, so that we can analyze information in web documents more easily. We collect web directories and use the latent Dirichlet allocation method to extract latent topics from web directories. The extracted topics are used to modify the search result documents so that important contexts that help to discriminate people can be recognized more easily. We carried out experiments with real web documents and verified the advantages of our approach over other disambiguation approaches that use the vector space model and named entity recognition methods.
  • Keywords: Personal name disambiguation; knowledge base; latent Dirichlet allocation; latent topic extraction; document similarity
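
The abstract names latent Dirichlet allocation (LDA) as the way topics are extracted from web directories, but this record does not give the inference procedure or parameters. As a rough illustration only, the sketch below fits LDA by collapsed Gibbs sampling on a tiny invented corpus. The function name `lda_gibbs`, the hyperparameters `alpha` and `beta`, the iteration count, and the `directory_docs` tokens are all assumptions made for this example, not details from the paper.

```python
import random


def lda_gibbs(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA to tokenized documents with collapsed Gibbs sampling.

    Returns (vocab, phi), where phi[k][w] is the smoothed probability of
    vocabulary word w under topic k. All defaults are illustrative guesses.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    vocab_size = len(vocab)

    doc_topic = [[0] * num_topics for _ in docs]               # counts per (doc, topic)
    topic_word = [[0] * vocab_size for _ in range(num_topics)]  # counts per (topic, word)
    topic_total = [0] * num_topics                              # tokens per topic
    assignments = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(z_doc)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wid = word_id[w]
                k = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][wid] -= 1
                topic_total[k] -= 1
                # Conditional distribution over topics for this token
                # (unnormalized; rng.choices normalizes the weights).
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][wid] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][wid] += 1
                topic_total[k] += 1

    phi = [
        [(topic_word[k][w] + beta) / (topic_total[k] + vocab_size * beta)
         for w in range(vocab_size)]
        for k in range(num_topics)
    ]
    return vocab, phi


# Invented stand-ins for web directory pages (not data from the paper).
directory_docs = [
    ["football", "league", "coach", "player", "league"],
    ["player", "coach", "match", "football"],
    ["university", "professor", "research", "paper"],
    ["research", "paper", "professor", "conference"],
]

vocab, phi = lda_gibbs(directory_docs, num_topics=2)
for k, row in enumerate(phi):
    top = sorted(range(len(vocab)), key=lambda w: row[w], reverse=True)[:3]
    print(k, [vocab[w] for w in top])
```

How the extracted topics are then used to modify the search result documents is not specified in the abstract, so that step is left out of the sketch.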