Basic Article Information

  • Title: A Novel Approach for Text Categorization of Unorganized data based with Information Extraction
  • Authors: Suneetha Manne; Dr. S. Sameen Fatima
  • Journal: International Journal on Computer Science and Engineering
  • Print ISSN: 2229-5631
  • Electronic ISSN: 0975-3397
  • Publication year: 2011
  • Volume: 3
  • Issue: 7
  • Pages: 2846-2854
  • Publisher: Engg Journals Publications
  • Abstract: The Internet has profoundly changed the lives of many enthusiastic innovators and researchers. The information available on the web has opened the door to Knowledge Discovery, leading to a new information era. Unfortunately, most search engines return web content that is irrelevant to the information the user is looking for. Many text categorization techniques for web content have been developed to recognize a given document's category, but they have failed to produce trustworthy results. This paper focuses primarily on web content categorization based on a classic summarization technique, enabling classification at the word level. The web document is first preprocessed, which involves filtering the content with classical techniques, and is then converted into organized data. The organized data is then matched against a predefined hierarchical category set to identify the exact category.
  • Keywords: Text Categorization; Text Mining; Information Extraction; Feature Term Extraction; Information Retrieval; Pyramidal Model; Term Frequency.
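
The pipeline outlined in the abstract (preprocess and filter the document, reduce it to term frequencies, then match it against a predefined hierarchical category set) might be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' method: the abstract does not specify the filtering rules, the Pyramidal Model, or the category set, so the stop-word list, the two-level `CATEGORIES` dictionary, and the functions `preprocess` and `categorize` are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical stop-word list; the paper's actual filtering rules are not given.
STOP_WORDS = {"the", "a", "an", "of", "and", "in", "to", "is", "for", "on", "with"}

# Hypothetical two-level category hierarchy: top category -> subcategory -> feature terms.
CATEGORIES = {
    "sports": {
        "cricket": {"wicket", "batsman", "innings"},
        "football": {"goal", "striker", "penalty"},
    },
    "technology": {
        "software": {"compiler", "code", "bug"},
        "hardware": {"processor", "memory", "chip"},
    },
}


def preprocess(text):
    """Lowercase the text, keep alphabetic tokens, and drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def categorize(text):
    """Return the (top, sub) category whose feature terms occur most often.

    Returns (None, None) when no feature term appears in the text.
    """
    term_freq = Counter(preprocess(text))
    best, best_score = (None, None), 0
    for top, subcategories in CATEGORIES.items():
        for sub, terms in subcategories.items():
            # Score = summed term frequency of this subcategory's feature terms.
            score = sum(term_freq[t] for t in terms)
            if score > best_score:
                best, best_score = (top, sub), score
    return best
```

With these assumed term sets, a sentence mentioning a striker, a goal, and a penalty would score highest under the `football` subcategory, while a text containing none of the listed terms is left uncategorized.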