Publisher: Academy & Industry Research Collaboration Center (AIRCC)
Abstract: Redundancy is one of the most pressing problems in almost all domains. Novelty detection is the identification of new or unknown data or signals that a machine learning system was not aware of during training. The problem is especially acute for research articles. A method for identifying novelty in each section of an article is needed to determine the novel idea proposed in a research paper. Because research articles are semi-structured, detecting novel information in them requires more accurate systems. Topic models provide a useful and simple means of processing and analyzing such documents. This work compares the most widely used topic model, Latent Dirichlet Allocation, with the hierarchical Pachinko Allocation Model. The results obtained are promising for the hierarchical Pachinko Allocation Model when it is used for document retrieval.