Basic Article Information

  • Title: A Framework for Building Applications Based on Hidden Topics with Short and Sparse Web Documents
  • Authors: Kanimozhiveena E ; D. Ramya Dorai
  • Journal: International Journal of Advanced Research in Computer Engineering & Technology (IJARCET)
  • Print ISSN: 2278-1323
  • Publication year: 2013
  • Volume: 2
  • Issue: 3
  • Pages: 984-988
  • Publisher: Shri Pannalal Research Institute of Technology
  • Abstract: This paper presents an approach to two major problems with web data: (1) data sparseness and (2) synonymy. The proposed model reduces both by drawing on external data taken from users, which is considered together with the dataset. The motivation is that a document may be highly relevant to the keyword given in the query yet contain only a few sentences and very few occurrences of that keyword, in which case classification is unlikely to be accurate. External data is therefore used to classify such short, sparse documents more accurately. In the advertising application, ad messages and web pages are considered: the semantic similarity between them is measured and used for matching and ranking.
  • Keywords: classification; data sparseness; matching/ranking; text categorization; semantic similarity; web mining