Journal: International Journal of Hybrid Information Technology
Print ISSN: 1738-9968
Year of publication: 2015
Volume: 8
Issue: 12
Pages: 269-282
DOI: 10.14257/ijhit.2015.8.12.20
Publisher: SERSC
Abstract: A large number of electronic documents are labeled with human-interpretable annotations. Efficient text mining on such data sets requires a generative model that can flexibly capture the significance of observed labels while simultaneously uncovering topics within unlabeled documents. This paper presents a novel, generalized on-line labeled topic model based on global and local topics (GL-OLT) that tracks the time evolution of topics in a sequentially organized multi-labeled corpus. The GL-OLT model is updated incrementally over time slices in an on-line fashion, and each label is associated not only with a set of local topics but also with several global topics. Empirical results demonstrate significant improvements in label prediction accuracy, as well as lower perplexity and high performance, for the proposed model compared with other models.
Keywords: Text Information Processing; Latent Dirichlet Allocation (LDA); Topic Modeling; Natural Language Processing