
Article Information

  • Title: To Construct Search Engine Analyzer for Electrical Enterprises Based on Lucene
  • Authors: Kehe Wu; Xia He; Tingshun Li
  • Journal: Computer and Information Science
  • Print ISSN: 1913-8989
  • Online ISSN: 1913-8997
  • Year: 2009
  • Volume: 2
  • Issue: 1
  • Page: 137
  • DOI: 10.5539/cis.v2n1P137
  • Publisher: Canadian Center of Science and Education
  • Abstract: Electrical enterprises use a large specialized vocabulary, and existing analyzers cannot meet the needs of a search engine built for them.
In this article, taking the operational systems of electrical enterprises as the background, we propose a dictionary-based word segmentation algorithm and use it to design a search engine analyzer suited to electrical enterprises. The analyzer is built on a dictionary of electrical terminology and addresses many shortcomings of existing analyzers. We also organize the vocabulary as a word tree: when the dictionary is loaded, a tree of words and phrases is constructed in memory, and segmenting text then requires only a traversal of that tree. This removes the need to fix a maximum word length, as the usual maximum matching algorithm requires, substantially improves segmentation efficiency, and avoids meaningless match attempts. Finally, we compare the analyzer with two of Lucene's built-in analyzers. The results indicate that, for the application systems of electrical enterprises, it outperforms the built-in analyzers in both running time and segmentation efficiency, which shows that it can meet the requirements of building a search engine for electrical enterprises.
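
The word-tree idea described in the abstract can be illustrated with a small sketch. This is not the authors' implementation and does not use the Lucene API. It is a minimal Python example, assuming a character-level trie and forward longest-match segmentation, with a single-character fallback for text not covered by the dictionary. The names `TrieNode`, `build_trie`, and `segment`, and the sample dictionary entries, are hypothetical and chosen only for illustration.

```python
class TrieNode:
    """One node of the word tree; each edge is a single character."""

    def __init__(self):
        self.children = {}    # character -> TrieNode
        self.is_word = False  # True if the path from the root spells a dictionary word


def build_trie(words):
    """Load the dictionary into an in-memory word tree."""
    root = TrieNode()
    for word in words:
        node = root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.is_word = True
    return root


def segment(text, root):
    """Forward longest-match segmentation by walking the word tree.

    No maximum word length is configured: the walk stops as soon as
    the tree has no edge for the next character.
    """
    tokens = []
    i = 0
    while i < len(text):
        node = root
        longest_end = i  # end index of the longest dictionary word found so far
        j = i
        while j < len(text) and text[j] in node.children:
            node = node.children[text[j]]
            j += 1
            if node.is_word:
                longest_end = j
        if longest_end == i:
            # Assumption: characters outside the dictionary become single-character tokens.
            longest_end = i + 1
        tokens.append(text[i:longest_end])
        i = longest_end
    return tokens


# Hypothetical electrical terms, used only for illustration.
trie = build_trie(["变压器", "变电站", "继电保护", "继电器"])
print(segment("变电站继电保护", trie))
```

Because each lookup follows the tree one character at a time, a position that cannot start any dictionary word is rejected after a single failed step, rather than after trying every candidate length up to a preset maximum.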