Journal: International Journal on Computer Science and Engineering
Print ISSN: 2229-5631
Electronic ISSN: 0975-3397
Year: 2011
Volume: 3
Issue: 3
Pages: 1063-1067
Publisher: Engg Journals Publications
Abstract: A title is a compact representation of a document that distills the important information from it. In this paper we study the selection of words as title words using different learning approaches, namely the nearest neighbor approach (NN), the Naive Bayes approach with limited vocabulary (NBL), and the Naive Bayes approach with full vocabulary (NBF), as well as a term weighting approach (tf-idf). We compare the performance of these approaches using the F1 metric, on both English script and the Indic script Telugu. We draw conclusions about the influence of linguistic complexity on the process of title word selection.
Keywords: Title; F1 measure; NN approach; NBL approach; NBF approach; tf-idf approach
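
Note: the abstract compares approaches by the F1 metric but this record does not say how it was computed. The following is a minimal sketch, assuming F1 is taken over the set of words a method selects versus the set of words in the reference title. The function name `title_word_f1` and the set-based, case-insensitive matching are illustrative assumptions, not the authors' stated procedure.

```python
def title_word_f1(predicted, reference):
    """F1 between selected title words and reference title words.

    Assumption: words are compared as case-insensitive sets, so
    duplicates and word order are ignored. This is a hypothetical
    helper for illustration, not the paper's exact evaluation.
    """
    pred = {w.lower() for w in predicted}
    ref = {w.lower() for w in reference}
    overlap = len(pred & ref)
    # With no shared words (or an empty side), precision and recall
    # are zero or undefined, so report 0.0.
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)  # share of selected words that are correct
    recall = overlap / len(ref)      # share of reference words that were found
    return 2 * precision * recall / (precision + recall)


selected = ["title", "word", "selection"]
actual = ["title", "selection", "learning", "approaches"]
score = title_word_f1(selected, actual)  # precision 2/3, recall 1/2
```

Lowercasing leaves Telugu text unchanged, since the script has no case distinction, so the same sketch applies to both scripts mentioned in the abstract.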