Article Information

  • Title: Punjabi Text Classification using Naive Bayes, Centroid and Hybrid Approach
  • Authors: Nidhi; Vishal Gupta
  • Journal: Computer Science & Information Technology
  • Electronic ISSN: 2231-5403
  • Publication year: 2012
  • Volume: 2
  • Issue: 4
  • Pages: 245-252
  • DOI:10.5121/csit.2012.2421
  • Publisher: Academy & Industry Research Collaboration Center (AIRCC)
  • Abstract: Punjabi text classification is the process of assigning predefined classes to unlabelled text documents. Because of the dramatic increase in the amount of content available in digital form, text classification has become an urgent need for managing digital data efficiently and accurately. To date, no text classifier has been available for Punjabi text documents. Therefore, in this paper, existing classification algorithms, namely Naïve Bayes and centroid-based techniques, are applied to Punjabi text classification. In addition, one new approach is proposed for Punjabi text documents: a combination of Naïve Bayes (to extract the relevant features and so reduce the dimensionality) and ontology-based classification (which acts as the text classifier and uses the extracted features). These algorithms are run on 184 Punjabi news articles on sports, classifying the documents into 7 classes: ਕ੍ਰਿਕਟ (krikaṭ), ਹਾਕੀ (hākī), ਕਬੱਡੀ (kabḍḍī), ਫੁਟਬਾਲ (phuṭbāl), ਟੈਨਿਸ (ṭainis), ਬੈਡਮਿੰਟਨ (baiḍmiṇṭan), ਓਲੰਪਿਕ (ōlmpik).
  • Keywords: Punjabi Text Classification; Hybrid Approach; Naïve Bayes; Centroid Based Classification; Ontology Based Classification (Domain Specific).
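
As a rough illustration of the centroid-based technique named in the abstract, the following minimal Python sketch builds one averaged term-frequency vector per class and assigns a document to the class whose centroid is most similar. It is not the authors' implementation: the use of raw term counts, the choice of cosine similarity, the function names, and the ASCII toy tokens (stand-ins for Punjabi words) are all assumptions made for this example.

```python
from collections import Counter
import math

# Hypothetical toy training set: (tokens, class label).
# ASCII tokens stand in for Punjabi words; this is not the paper's data.
TRAINING = [
    (["krikat", "bat", "wicket"], "krikat"),
    (["krikat", "run", "over"], "krikat"),
    (["haki", "stick", "goal"], "haki"),
    (["haki", "goal", "penalty"], "haki"),
]

def centroid(vectors):
    """Average a list of term-count vectors into one centroid vector."""
    total = Counter()
    for vec in vectors:
        total.update(vec)
    return {term: count / len(vectors) for term, count in total.items()}

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def train(labelled_docs):
    """Group documents by class and compute one centroid per class."""
    by_class = {}
    for tokens, label in labelled_docs:
        by_class.setdefault(label, []).append(Counter(tokens))
    return {label: centroid(vecs) for label, vecs in by_class.items()}

def classify(tokens, centroids):
    """Return the label whose centroid is most similar to the document."""
    vec = Counter(tokens)
    return max(centroids, key=lambda label: cosine(vec, centroids[label]))

centroids = train(TRAINING)
print(classify(["wicket", "run"], centroids))
print(classify(["goal", "stick"], centroids))
```

A real system along these lines would need Punjabi-specific tokenization and preprocessing before the vectors are built; those steps are outside what the abstract describes and are left out here.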