Journal: The World of Computer Science and Information Technology Journal
Print ISSN: 2221-0741
Year: 2015
Volume: 5
Issue: 3
Pages: 10
Language: English
Publisher: WCSIT Publishing
Abstract: Text summarization is the process of creating a short description of a given text while preserving its informational content. This paper tackles the problem of Arabic text summarization. Semantically redundant and insignificant content is removed from the summary by checking text entailment relations and lexical cohesion. Accordingly, a text summarization approach based on lexical cohesion and text entailment, called LCEAS, is developed. In LCEAS, the text entailment approach is enhanced to suit the Arabic language: roots and semantic relations between word senses are used to extract common words, and new threshold values are specified to suit entailment-based segmentation of Arabic text. LCEAS is a single-document summarizer built using an extraction technique. To evaluate LCEAS, its performance is compared with that of previous Arabic text summarization systems. Each system's output is compared against the model summaries of the Essex Arabic Summaries Corpus (EASC), using the Recall-Oriented Understudy for Gisting Evaluation (ROUGE) and Automatic Summarization Engineering (AutoSummEng) metrics. The results indicate that LCEAS outperforms the previous Arabic text summarization systems.
Keywords: Text Summarization; Text Segmentation; Lexical Cohesion; Text Entailment; Natural Language Processing.
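As a rough illustration of the recall-oriented comparison against model summaries mentioned in the abstract, the following is a minimal sketch of an n-gram recall score in the spirit of ROUGE-N. It is not the evaluation code used in the paper: the function names `ngrams` and `rouge_n_recall` are hypothetical, tokenization is assumed to be plain whitespace splitting, and no Arabic-specific normalization or stemming is applied.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams (as tuples) in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_recall(candidate, reference, n=1):
    """Fraction of reference n-grams that also occur in the candidate.

    Assumption: whitespace tokenization; a real evaluation of Arabic
    summaries would normalize and tokenize the text more carefully.
    """
    cand = ngrams(candidate.split(), n)
    ref = ngrams(reference.split(), n)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    # Clipped overlap: each reference n-gram is matched at most as many
    # times as it appears in the candidate.
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / total


if __name__ == "__main__":
    system_summary = "the cat sat"
    model_summary = "the cat sat on the mat"
    print(rouge_n_recall(system_summary, model_summary, n=1))
    print(rouge_n_recall(system_summary, model_summary, n=2))
```

Recall is computed relative to the model summary, so a system summary scores higher the more of the reference content it covers; published ROUGE implementations add further variants and options that this sketch omits.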