
Basic article information

  • Title: THAI EDU SEGMENTATION USING CLUE MARKERS AND SYNTACTIC INFORMATION FROM SHALLOW PARSER
  • Authors: AUTHAPON KONGWAN ; SITI SAKIRA BINTI KAMARUDDIN ; FARZANA BINTI KABIR AHMAD
  • Journal: Journal of Theoretical and Applied Information Technology
  • Print ISSN: 1992-8645
  • Electronic ISSN: 1817-3195
  • Publication year: 2020
  • Volume: 98
  • Issue: 18
  • Pages: 3853-3869
  • Publisher: Journal of Theoretical and Applied
  • Abstract: Text is one of the most useful sources of human knowledge, and each element of a text has to be analyzed to identify the information and knowledge it carries. The elementary discourse unit (EDU) is important for NLP applications that need a processing unit smaller than a sentence, such as text summarization, information extraction, and question answering. An EDU can therefore be more appropriate than a sentence for extracting knowledge and information from text. This paper presents a pipeline for Thai EDU segmentation that runs from word segmentation to EDU segmentation. A shallow parser is applied to chunk non-recursive phrases in the text, revealing partial syntactic information for EDU segmentation. This syntactic information is then used to identify and reconstruct the EDU segments in the text. In the experiment, the precision, recall, and F1 score are 0.88865, 0.91577, and 0.90200, respectively.
  • Keywords: Word Segmentation; EDU Segmentation; Conditional Random Field; Shallow Parser; Natural Language Processing
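
The three scores reported in the abstract can be cross-checked against each other, since the F1 score is conventionally the harmonic mean of precision and recall. The sketch below assumes that standard definition; the record does not say how the paper averaged its scores, and the function name `f1_score` is only illustrative.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (standard F1 definition)."""
    if precision + recall == 0:
        # Avoid division by zero when both scores are zero.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values copied from the abstract.
reported_precision = 0.88865
reported_recall = 0.91577
reported_f1 = 0.90200

f1 = f1_score(reported_precision, reported_recall)

# Difference between the recomputed and the reported F1.
print(f"recomputed F1 = {f1:.5f}, difference = {abs(f1 - reported_f1):.5f}")
```

Under this assumption the recomputed value agrees with the reported F1 to within 0.0001, so the three figures are mutually consistent.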