Article Information

  • Title: Scale the Active Influence Based Investigation Using Materialized Sub Graphs
  • Authors: Amjan Shaik; Nazeer Shaik; Amtul Mubeena
  • Journal: International Journal of Computer Science and Information Technologies
  • Electronic ISSN: 0975-9646
  • Publication year: 2011
  • Volume: 2
  • Issue: 3
  • Pages: 1358-1363
  • Publisher: TechScience Publications
  • Abstract: The development of information technology has generated a large number of databases and huge volumes of data in various areas. Research in databases and information technology has given rise to approaches for storing and manipulating this valuable data to support decision making. Data mining is the process of extracting useful information and patterns from large volumes of data. It is also called knowledge discovery, knowledge mining from data, knowledge extraction, or data/pattern analysis. Active authority-based keyword search algorithms, such as ObjectRank and personalized PageRank, leverage semantic link information to provide high-quality, high-recall search in databases and on the web. Conceptually, these algorithms require a query-time, PageRank-style iterative computation over the full graph. This computation is too expensive for large graphs and is not feasible at query time. Alternatively, building an index of precomputed results for some or all keywords involves very expensive preprocessing. In this paper we present BinRank, a system that approximates ObjectRank results using a hybrid approach inspired by materialized views in traditional query processing. We materialize a number of relatively small subsets of the data graph in such a way that any keyword query can be answered by running ObjectRank on only one of the subgraphs. BinRank generates the subgraphs by partitioning all the terms in the corpus based on their co-occurrence, executing ObjectRank for each partition using the terms to generate a set of random-walk starting points, and keeping only those objects that receive non-negligible scores. The intuition is that a subgraph that contains all objects and links relevant to a set of related terms should have all the information needed to rank objects with respect to any one of these terms.
We demonstrate that BinRank can achieve subsecond query execution time on the English Wikipedia data set, while producing high-quality search results that closely approximate the results of ObjectRank on the original graph. The Wikipedia link graph contains about 110 edges, which is at least two orders of magnitude larger than what prior state-of-the-art dynamic authority-based search systems have been able to demonstrate. Our experimental evaluation investigates the trade-off between query execution time, quality of the results, and storage requirements of BinRank.
  • Keywords: Scaling; Subgraphs; BinRank; ObjectRank; PageRank
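
The subgraph materialization described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`object_rank`, `materialize_subgraph`), the `term_index` mapping from terms to the objects that contain them, the damping factor, the iteration count, and the score threshold are all assumptions made for illustration. The sketch runs a personalized PageRank-style iteration whose random jumps land only on the objects matching a bin of terms, then keeps the edges between objects that receive non-negligible scores.

```python
from collections import defaultdict


def object_rank(edges, base_set, damping=0.85, iterations=50):
    """Personalized PageRank-style power iteration (illustrative only).

    Random jumps land only on nodes in base_set. Score mass that reaches a
    node with no outgoing edges is dropped rather than redistributed.
    """
    base_set = set(base_set)
    if not base_set:
        return {}
    nodes = {n for edge in edges for n in edge} | base_set
    out_links = defaultdict(list)
    for src, dst in edges:
        out_links[src].append(dst)
    jump = {n: (1.0 / len(base_set) if n in base_set else 0.0) for n in nodes}
    scores = dict(jump)
    for _ in range(iterations):
        nxt = {n: (1.0 - damping) * jump[n] for n in nodes}
        for src, targets in out_links.items():
            share = damping * scores[src] / len(targets)
            for dst in targets:
                nxt[dst] += share
        scores = nxt
    return scores


def materialize_subgraph(edges, term_index, bin_terms, threshold=1e-6):
    """Keep only edges between objects that score above threshold for a bin.

    term_index maps each term to the set of objects containing it
    (a hypothetical structure for this sketch).
    """
    base_set = set()
    for term in bin_terms:
        base_set |= set(term_index.get(term, ()))
    scores = object_rank(edges, base_set)
    keep = {n for n, s in scores.items() if s > threshold}
    return [(s, d) for s, d in edges if s in keep and d in keep]
```

At query time, a system along these lines would look up the bin that contains the query term and call `object_rank` on that bin's materialized subgraph, with the objects matching the term as the base set, instead of iterating over the full graph.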