
Article information

  • Title: Representation of textual documents by the approach wordnet and n-grams for the unsupervised classification (clustering) with 2D cellular automata: a comparative study
  • Authors: HAMOU Reda Mohamed ; LEHIRECHE Ahmed ; LOKBANI Ahmed Chaouki
  • Journal: Computer and Information Science
  • Print ISSN: 1913-8989
  • Electronic ISSN: 1913-8997
  • Year of publication: 2010
  • Volume: 3
  • Issue: 3
  • Page: 240
  • DOI: 10.5539/cis.v3n3p240
  • Publisher: Canadian Center of Science and Education
  • Abstract: In this article we present a 2D cellular automaton (Class_AC) for a text mining problem, namely unsupervised classification (clustering). Before experimenting with the cellular automaton, we vectorized our data by indexing text documents from the Reuters-21578 collection, using a WordNet-based approach and the n-gram representation of text documents. Our work is a comparative study of these two representations: the conceptual approach (WordNet) and n-grams. Section 1 introduces biomimicry and text mining, Section 2 presents text representation based on the WordNet approach and n-grams, Section 3 describes the cellular automaton for clustering, Section 4 reports the experiments and comparison results, and Section 5 gives a conclusion and perspectives.
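
The abstract names n-grams as one of the two document representations but does not spell out the procedure. As a rough illustration only, the sketch below shows one common way to turn texts into character n-gram count vectors. The function names, the choice of n = 3, the normalisation step, and the two sample strings are all assumptions made for this example; they are not taken from the article and may differ from what the authors did.

```python
from collections import Counter


def char_ngrams(text, n=3):
    """Count overlapping character n-grams in a text (hypothetical helper)."""
    # Assumed normalisation: lowercase and collapse runs of whitespace.
    cleaned = " ".join(text.lower().split())
    return Counter(cleaned[i:i + n] for i in range(len(cleaned) - n + 1))


def to_vector(profile, vocabulary):
    """Project an n-gram profile onto a fixed, ordered vocabulary."""
    return [profile.get(gram, 0) for gram in vocabulary]


# Two made-up sample texts, not drawn from the Reuters collection.
docs = ["Oil prices rise", "Oil prices fall"]
profiles = [char_ngrams(d) for d in docs]

# The shared vocabulary is the sorted union of all n-grams seen.
vocabulary = sorted(set().union(*profiles))
vectors = [to_vector(p, vocabulary) for p in profiles]
```

Each document becomes a vector of equal length, which is the kind of input a clustering method can compare; a WordNet-based representation would instead count concepts rather than character sequences.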