Journal: International Journal of Innovative Research in Computer and Communication Engineering
Print ISSN: 2320-9798
Online ISSN: 2320-9801
Publication year: 2015
Volume: 3
Issue: 11
DOI:10.15680/IJIRCCE.2015.0311189
Publisher: S&S Publications
Abstract: Topic modeling techniques such as Latent Dirichlet Allocation (LDA) and the Maximum matched Pattern-based Topic Model (MPBTM) provide a suitable way to analyze large volumes of unclassified text. Pattern mining is an important research area in data mining and knowledge discovery. In information filtering, data mining is used to derive a user's information needs from a collection of documents. However, the large number of discovered patterns prevents them from being used effectively and efficiently in real applications, so selecting the most discriminative and representative semantic patterns from among them becomes crucial. To address these problems, this paper proposes a topic modeling method that combines NFA-based Maximum matched Pattern-based Topic Modeling (MPBTM), Enhanced LDA, Open English natural language processing (NLP), and Gibbs sampling. The main features of the proposed model are: (1) each topic is represented by patterns; (2) relevant topic documents are generated; and (3) the most discriminative and representative patterns are used to retrieve information from the document library according to the user's information needs.
Keywords: Topic model; information filtering; pattern mining; relevance ranking; user interest model