
Basic Article Information

  • Title: A Method for Filtering Pages by Similarity Degree based on Dynamic Programming
  • Authors: Ziyun Deng; Tingqin He
  • Journal: Future Internet
  • Electronic ISSN: 1999-5903
  • Publication year: 2018
  • Volume: 10
  • Issue: 12
  • Page: 124
  • DOI: 10.3390/fi10120124
  • Language: English
  • Publisher: MDPI Publishing
  • Abstract: To extract target webpages from a large set of webpages, we propose a Method for Filtering Pages by Similarity Degree based on Dynamic Programming (MFPSDDP). The method relies on one of three "same relationships" between two nodes, which we define. Its main innovation is that it does not need to know the structure of the webpages in advance. First, we describe the design, which uses a queue and two threads. Then we present a dynamic programming algorithm for computing the length of the longest common subsequence, together with a formula for computing similarity. To obtain the detail-information webpages from 200,000 webpages downloaded from the website “www.jd.com”, we choose the Completely Same Relationship (CSR) as the same relationship and set the similarity threshold to 0.2. Among the four filtering methods compared, the Recall Ratio (RR) of MFPSDDP ranks in the middle. When the number of filtered webpages approaches 200,000, the PR of MFPSDDP is the highest of the four methods, reaching 85.1%, which is 13.3 percentage points higher than the PR of the Method for Filtering Pages by Containing Strings (MFPCS).
  • Keywords: method for filtering pages; similarity degree; dynamic programming; combination method
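
The abstract mentions a dynamic programming algorithm for the length of the longest common subsequence (LCS) and a similarity formula, but gives neither. The following is a minimal sketch of the general idea only, not the authors' implementation. It uses the textbook LCS recurrence. The similarity formula `2 * LCS / (len(a) + len(b))`, the function names, and the use of plain equality between node labels (standing in for the paper's Completely Same Relationship, which is not defined in the abstract) are all assumptions made for illustration. Only the threshold value 0.2 comes from the abstract.

```python
from typing import Dict, Hashable, List, Sequence


def lcs_length(a: Sequence[Hashable], b: Sequence[Hashable]) -> int:
    """Length of the longest common subsequence, textbook DP with two rows."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:  # assumption: equality stands in for a "same relationship"
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def similarity(a: Sequence[Hashable], b: Sequence[Hashable]) -> float:
    """Hypothetical similarity degree in [0, 1]; not the paper's formula."""
    if not a and not b:
        return 1.0
    return 2 * lcs_length(a, b) / (len(a) + len(b))


def filter_pages(
    template: Sequence[Hashable],
    pages: Dict[str, Sequence[Hashable]],
    threshold: float = 0.2,  # threshold value reported in the abstract
) -> List[str]:
    """Keep pages whose node sequence is similar enough to a template page."""
    return [
        name for name, nodes in pages.items()
        if similarity(template, nodes) >= threshold
    ]
```

In this sketch each page is represented as a flat sequence of node labels (for example, tag names in document order), and a page is kept when its similarity to a known target page meets the threshold. How the paper actually derives node sequences from webpages is not stated in the abstract.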