Journal: International Journal of Data Mining & Knowledge Management Process
Print ISSN: 2231-007X
Online ISSN: 2230-9608
Year: 2013
Volume: 3
Issue: 5
DOI: 10.5121/ijdkp.2013.3502
Publisher: Academy & Industry Research Collaboration Center (AIRCC)
Abstract: Every day, a huge number of news articles are reported and disseminated on the Internet. A gist of an article lets a reader go through its main topics instead of reading the entire content, which takes much more time. An ideal system would understand the document and generate the appropriate theme(s) directly from the results of that understanding. In the absence of such a natural language understanding system, an appropriate alternative system must be designed. Gist generation is a difficult task because it requires both maximizing the text content carried by a short summary and maintaining the grammaticality of the text. In this paper we present a statistical approach to generating a gist of a Hindi news article. The experimental results are evaluated using standard measures (precision, recall and F1 measure) for different statistical models and their combinations, on the article both before and after pre-processing.
Keywords: Natural language understanding; precision; recall; F1 measure; sentence selection model; text model; informative word selection model; statistical model
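
The abstract names precision, recall and F1 measure as the evaluation measures but does not say what unit is compared. The following is a minimal sketch of how those three measures are commonly computed, assuming the gist is scored as a set of selected sentence indices against a human reference set. The function name `precision_recall_f1` and the sample indices are hypothetical and are not taken from the paper.

```python
def precision_recall_f1(selected, reference):
    """Score a system gist against a reference gist.

    Both arguments are collections of sentence indices (an assumption;
    the paper may compare words or other units instead).
    """
    selected = set(selected)
    reference = set(reference)
    overlap = len(selected & reference)

    # Precision: share of selected items that are in the reference.
    precision = overlap / len(selected) if selected else 0.0
    # Recall: share of reference items that were selected.
    recall = overlap / len(reference) if reference else 0.0
    # F1: harmonic mean of precision and recall (0 when both are 0).
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical example: the system picks sentences 1, 3, 5, 8,
# and the reference gist contains sentences 1, 2, 3.
p, r, f = precision_recall_f1([1, 3, 5, 8], [1, 2, 3])
print(f"precision={p:.3f} recall={r:.3f} f1={f:.3f}")
```

With these sample sets, two of the four selected sentences are in the reference, so precision is 2/4 and recall is 2/3. Running the same function on output produced before and after pre-processing would give the kind of comparison the abstract describes.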