Journal: International Journal of Computer Science & Technology
Print ISSN: 2229-4333
Electronic ISSN: 0976-8491
Publication year: 2013
Volume: 4
Issue: 1
Pages: 649-653
Language: English
Publisher: Ayushmaan Technologies
Abstract: With the remarkable growth of information available to end users through the web, search engines play an ever more significant role. Search engines sometimes give disappointing results because searches are not classified in any way. Credibility of information should be a key metric for search results pages. Existing search algorithms, such as ObjectRank and personalized PageRank, provide high-quality, high-recall search in web databases, but they incur a huge computation overhead over the full graph and are not feasible at query time. The BinRank system was later developed to approximate ObjectRank results. BinRank generates materialized subgraphs (MSGs) by partitioning all the terms in the corpus (information results) based on their co-occurrence, executing ObjectRank for each partition using the terms to generate a set of random walk starting points, and keeping only those objects that receive non-negligible scores. However, the limitations of ObjectRank concerning query time still persist. In the proposed system, we use HubRank, a query optimization and index management technique designed for graphs, as an alternative to ObjectRank, and combine it with BinRank. A subgraph that contains all objects and links relevant to a set of related terms should have all the information needed to rank objects with respect to any one of these terms. This approach achieves better results than existing search techniques. The proposed system therefore reduces query time and gives a significant performance boost over smaller graphs such as MSGs.
Keywords: Text Classification; Natural Language Processing; Feature Extraction; Concept Mining; Fuzzy Similarity Analyzer; Dimensionality Reduction
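
The abstract describes BinRank's per-partition step as a random walk started from the objects matching a bin's terms, followed by pruning of objects with negligible scores. The sketch below illustrates that idea only. It is not the authors' implementation: the function names, the damping factor, the pruning threshold, and the uniform edge weights are all assumptions made for illustration (ObjectRank itself uses schema-dependent authority transfer rates), and the term-partitioning and HubRank steps are not covered.

```python
from collections import defaultdict


def personalized_rank(edges, base_set, damping=0.85, iters=100):
    """Power iteration for a random walk that restarts at `base_set`.

    Hypothetical helper: edges are (source, target) pairs with uniform
    weights, and `base_set` must be non-empty.
    """
    base = set(base_set)
    nodes = sorted({n for edge in edges for n in edge} | base)
    out = defaultdict(list)
    for src, dst in edges:
        out[src].append(dst)

    # The restart distribution is uniform over the starting objects.
    restart = {n: (1.0 / len(base) if n in base else 0.0) for n in nodes}
    score = dict(restart)

    for _ in range(iters):
        nxt = {n: (1.0 - damping) * restart[n] for n in nodes}
        for src in nodes:
            targets = out[src]
            if not targets:
                # Dangling node: return its mass to the starting objects.
                for n in base:
                    nxt[n] += damping * score[src] * restart[n]
                continue
            share = damping * score[src] / len(targets)
            for dst in targets:
                nxt[dst] += share
        score = nxt
    return score


def materialize_subgraph(edges, base_set, threshold=1e-3):
    """Keep only objects whose score reaches `threshold` (an assumed cutoff)."""
    score = personalized_rank(edges, base_set)
    keep = {n for n, s in score.items() if s >= threshold}
    sub_edges = [(s, d) for s, d in edges if s in keep and d in keep]
    return keep, sub_edges


if __name__ == "__main__":
    # Toy graph: a cycle reachable from "a" plus an unrelated component.
    toy_edges = [("a", "b"), ("b", "c"), ("c", "a"), ("x", "y"), ("y", "x")]
    kept_nodes, kept_edges = materialize_subgraph(toy_edges, {"a"})
    print(sorted(kept_nodes))
    print(kept_edges)
```

In this toy graph the component that the walk never reaches is pruned, so later queries for terms in the same bin would run over the smaller subgraph rather than the full graph.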