Abstract: Information Extraction is the task of filtering information from large volumes of text. It is a more limited task than full text understanding. In full text understanding, the aim is to represent explicitly all the information in a given text. In Information Extraction, by contrast, the semantic range of the result is delimited in advance as part of the task specification. Only extractive summarization is considered and developed in this study. This article proposes a model for summarizing large documents using a novel approach, taking one of the South Indian regional languages, Kannada, as the case. It deals with single-document summarization based on a statistical approach. The purpose of a summary is to facilitate quick and accurate identification of the topic of the published document. The objective is to save prospective readers’ time and effort in finding the useful information in a long article. The results are also analyzed and compared with those for the English language.
Keywords: Information extraction; extractive summarization; automatic text summarization; text summarization; stemming; word count frequency; UTF-8.