Basic Article Information

  • Title: Clustered Distributed Index for Efficient Text Retrieval Using Threads
  • Authors: M. Basavaraju; R. Prabhakar
  • Journal: International Journal of Grid Computing & Applications
  • Print ISSN: 2229-3949
  • Online ISSN: 0976-9404
  • Publication year: 2010
  • Volume: 1
  • Issue: 2
  • DOI: 10.5121/ijgca.2010.12011
  • Publisher: Academy & Industry Research Collaboration Center (AIRCC)
  • Abstract: This paper presents a novel method for improving clustered distributed indices for efficient text retrieval using threads. In text retrieval, text search refers to techniques for searching stored documents or a database. In a full-text search, the search engine examines all the words in every stored document as it tries to match the search words supplied by the user. When dealing with a small number of documents, the full-text search engine performs a serial scan, in which it directly scans the contents of the documents for each query. When the number of documents to search is potentially large, or the number of search queries to perform is substantial, the problem of full-text search is often divided into two tasks: indexing and searching. The indexing stage scans the text of all the documents and builds a list of search terms, often called an index. In the search stage, when a specific query is performed, only the index is referenced rather than the text of the original documents. Given these considerations, this paper aims to improve search time on the index by clustering the index. Threads are used to perform a parallel search on each of these clusters. The algorithm, developed in C, has been tested on data sets and queries of various sizes and compared with the sequential search method. The results reported in the paper show that this approach significantly improves search time and that the proposed method is effective and could be further implemented for real-time applications.
  • Keywords: Clustering; Distributed index; Threads; Text retrieval; Posting list; Query processing; Algorithms; Performance
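
The abstract describes splitting the index into clusters and searching each cluster with its own thread, but it does not give the algorithm itself. The following is a minimal sketch of that general idea, not the authors' implementation. It assumes POSIX threads, a simple linear scan inside each cluster, and at most MAX_CLUSTERS clusters. All names (IndexEntry, Cluster, SearchTask, search_cluster, parallel_search) are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>

#define MAX_CLUSTERS 8  /* assumed upper bound, not taken from the paper */

/* One index entry: a search term and its posting list (document IDs). */
typedef struct {
    const char *term;
    const int  *postings;
    size_t      n_postings;
} IndexEntry;

/* One cluster: a contiguous slice of the index. */
typedef struct {
    const IndexEntry *entries;
    size_t            n_entries;
} Cluster;

/* Per-thread work item: which cluster to scan and where to store the hit. */
typedef struct {
    const Cluster    *cluster;
    const char       *query;
    const IndexEntry *hit;  /* NULL when the term is not in this cluster */
} SearchTask;

/* Thread body: linear scan of a single cluster for the query term. */
static void *search_cluster(void *arg)
{
    SearchTask *task = arg;
    task->hit = NULL;
    for (size_t i = 0; i < task->cluster->n_entries; i++) {
        if (strcmp(task->cluster->entries[i].term, task->query) == 0) {
            task->hit = &task->cluster->entries[i];
            break;
        }
    }
    return NULL;
}

/* Search all clusters in parallel, one thread per cluster.
 * Returns the first matching entry in cluster order, or NULL. */
const IndexEntry *parallel_search(const Cluster *clusters, size_t n_clusters,
                                  const char *query)
{
    pthread_t  threads[MAX_CLUSTERS];
    SearchTask tasks[MAX_CLUSTERS];
    int        started[MAX_CLUSTERS];
    const IndexEntry *result = NULL;

    assert(n_clusters <= MAX_CLUSTERS);

    for (size_t i = 0; i < n_clusters; i++) {
        tasks[i].cluster = &clusters[i];
        tasks[i].query   = query;
        tasks[i].hit     = NULL;
        started[i] = (pthread_create(&threads[i], NULL,
                                     search_cluster, &tasks[i]) == 0);
        if (!started[i]) {
            /* Thread creation failed: scan this cluster in the caller. */
            search_cluster(&tasks[i]);
        }
    }

    /* Wait for every thread before reading its result. */
    for (size_t i = 0; i < n_clusters; i++) {
        if (started[i]) {
            pthread_join(threads[i], NULL);
        }
        if (result == NULL) {
            result = tasks[i].hit;
        }
    }
    return result;
}
```

A caller would build an array of Cluster values over its index, call parallel_search with the query term, and read the posting list from the returned entry. On most systems the program is compiled with the -pthread flag. How the paper assigns terms to clusters is not stated in the abstract, so the sketch leaves cluster construction to the caller.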