Article Information

  • Title: Segregation of Code-Switching Sentences using Rule-Based Technique
  • Authors: Emaliana Kasmuri ; Halizah Basiron
  • Journal: International Journal of Advances in Soft Computing and Its Applications
  • Print ISSN: 2074-8523
  • Publication year: 2020
  • Volume: 12
  • Issue: 1
  • Pages: 49-64
  • Publisher: International Center for Scientific Research and Studies
  • Abstract: A code-switching sentence contains a mixture of two or more languages within a single sentence. Code-switching is a new language trend that is widely used on open platforms such as blogs and social media. Consequently, code-switching has become a new challenge for natural language processing (NLP). The challenge arises because existing NLP systems were designed for mono-lingual text. Therefore, a new NLP system is needed to deal with code-switching sentences. However, a system that segregates code-switching sentences from mono-lingual sentences must be developed before code-switching sentences can be used in NLP systems. This paper considers the segregation essential for two reasons: first, current NLP systems deal only with mono-lingual sentences; second, current NLP systems treat switched words as meaningless, which leads to inaccurate results. This paper segregates code-switching sentences from mono-lingual sentences using a rule-based technique and dictionaries, with the ratio of word presence as the segregation criterion. The rule-based technique achieved an accuracy of more than 87.00% for Malay-English code-switching (MY-EN-CS) sentences.
  • Keywords: code-switching sentence; mono-lingual sentence; rule-based technique; sentence segregation
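
The abstract describes segregating sentences by the ratio of word presence against dictionaries, but it does not give the actual rules, dictionaries, or thresholds. The following is only a minimal sketch of how such a rule might look. The tiny word lists, the function names (`tokenize`, `language_ratios`, `segregate`), and the `min_ratio` threshold of 0.2 are all assumptions made for illustration and are not taken from the paper.

```python
import re

# Toy dictionaries for illustration only (assumption); the paper's
# actual Malay and English dictionaries are not given in the abstract.
MALAY_WORDS = {"saya", "suka", "makan", "nasi", "sangat", "hari", "ini", "pergi", "ke"}
ENGLISH_WORDS = {"i", "like", "to", "eat", "rice", "very", "today", "go", "best", "shopping"}


def tokenize(sentence):
    """Lowercase the sentence and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", sentence.lower())


def language_ratios(sentence):
    """Return (malay_ratio, english_ratio): the share of tokens found in each dictionary."""
    tokens = tokenize(sentence)
    if not tokens:
        return 0.0, 0.0
    malay = sum(1 for t in tokens if t in MALAY_WORDS)
    english = sum(1 for t in tokens if t in ENGLISH_WORDS)
    return malay / len(tokens), english / len(tokens)


def segregate(sentence, min_ratio=0.2):
    """Label a sentence using a hypothetical ratio rule.

    If both languages reach min_ratio (an assumed threshold), the sentence
    is labelled code-switching; otherwise the dominant language wins.
    """
    malay_ratio, english_ratio = language_ratios(sentence)
    if malay_ratio >= min_ratio and english_ratio >= min_ratio:
        return "code-switching"
    if malay_ratio > english_ratio:
        return "malay"
    if english_ratio > malay_ratio:
        return "english"
    return "unknown"


for text in ["Saya suka makan nasi", "I like to eat rice", "Saya suka shopping hari ini"]:
    print(text, "->", segregate(text))
```

In this sketch a sentence counts as code-switching only when both languages contribute a minimum share of its words, so a single stray or misspelled token in a long mono-lingual sentence does not flip the label. A real system would need full dictionaries and a threshold tuned on labelled data.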