Journal: Journal of Theoretical and Applied Information Technology
Print ISSN: 1992-8645
Electronic ISSN: 1817-3195
Publication year: 2018
Volume: 96
Issue: 8
Publisher: Journal of Theoretical and Applied
Abstract: Document summarization is an ongoing research area in Natural Language Processing (NLP) that aims to produce, with the help of NLP tools and techniques, summaries close to those written by a human. Because the volume of information in the digital world is increasing exponentially, automatic summarization techniques, especially abstractive methods, have gained attention. To produce an effective abstractive summary, the text documents must first be represented semantically. Important sentences are then selected from this representation using some strategy, and finally the abstractive summary is generated. Representing natural-language sentences semantically poses many challenges, and various works have addressed extracting the semantics of sentences. Semantic role labeling is an NLP technique that detects the arguments semantically related to a predicate or verb in a sentence and assigns each of them to one of a set of roles. This technique can therefore be used to represent sentences meaningfully, and the representation can serve applications such as question answering, information extraction, summarization, and text categorization. Currently, little work has been done on semantic role extraction for Malayalam. Domain-specific work is expected to give better results. In this paper, the semantic roles of important words in Malayalam Web documents pertaining to the cricket domain are identified.
Keywords: Semantic Role Labeling; Karaka relations; Memory Based Learning; Vibhakthi; Chunking