Journal: Conference on European Chapter of the Association for Computational Linguistics (EACL)
Publication year: 2006
Volume: 2006
Publisher: ACL Anthology
Abstract: Probabilistic Latent Semantic Analysis (PLSA) models have been shown to provide a better model for capturing polysemy and synonymy than Latent Semantic Analysis (LSA). However, the parameters of a PLSA model are trained using the Expectation Maximization (EM) algorithm, and as a result, the trained model is dependent on the initialization values so that performance can be highly variable. In this paper we present a method for using LSA analysis to initialize a PLSA model. We also investigated the performance of our method for the tasks of text segmentation and retrieval on personal-size corpora, and present results demonstrating the efficacy of our proposed approach.