Journal: TELKOMNIKA (Telecommunication Computing Electronics and Control)
Print ISSN: 2302-9293
Publication year: 2014
Volume: 12
Issue: 4
Pages: 1045-1052
DOI: 10.12928/telkomnika.v12i4.811
Publisher: Universitas Ahmad Dahlan
Abstract: Computing semantic correlation in Tang poetry is critical to many applications, such as search, clustering, and automatic poetry generation. To increase the efficiency and accuracy of semantic relatedness computation, we improved the latent semantic analysis (LSA) process. In this paper, we adopt a “representation of word semantics” in place of the “words-by-poems” matrix to represent the meaning of words, based on the finding that words with similar distributions across poetry categories are almost always semantically related. We also designed an experiment that obtained segmented words from more than 40,000 poems and computed relatedness as the cosine value calculated from the co-occurrence matrix decomposed by Singular Value Decomposition (SVD). The experimental results show that this method works well for analyzing the semantic and emotional relatedness of words in Tang poetry. Associated words and the relevance of poetry categories can also be found by matrix manipulation of the decomposed matrices.
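The pipeline the abstract describes (a word co-occurrence matrix, SVD, then cosine relatedness) can be illustrated with a minimal sketch. This is not the authors' implementation: the word list, the category names, the counts, and the choice of k = 2 retained dimensions are all hypothetical toy values, and the matrix is assumed here to be words by poetry categories, which is one reading of the abstract.

```python
import numpy as np

# Hypothetical toy data (not from the paper): rows are words,
# columns are poetry categories, entries are occurrence counts.
words = ["moon", "frost", "homesick", "sword", "frontier"]
categories = ["nostalgia", "frontier", "landscape"]
counts = np.array([
    [8.0, 1.0, 6.0],   # moon
    [6.0, 1.0, 4.0],   # frost
    [9.0, 2.0, 1.0],   # homesick
    [0.0, 7.0, 1.0],   # sword
    [1.0, 9.0, 2.0],   # frontier
])


def lsa_word_vectors(matrix, k):
    """Return k-dimensional word vectors from a truncated SVD."""
    # matrix = U @ diag(s) @ Vt; keep the k largest singular values.
    u, s, _vt = np.linalg.svd(matrix, full_matrices=False)
    # Scale each retained left singular vector by its singular value.
    return u[:, :k] * s[:k]


def cosine(a, b):
    """Cosine of the angle between two vectors (0.0 if either is zero)."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0


def relatedness(w1, w2, vectors):
    """Cosine relatedness of two words in the reduced space."""
    return cosine(vectors[words.index(w1)], vectors[words.index(w2)])


vectors = lsa_word_vectors(counts, k=2)  # k=2 is an arbitrary choice

if __name__ == "__main__":
    for other in words[1:]:
        print(f"moon ~ {other}: {relatedness('moon', other, vectors):.3f}")
```

In this toy matrix, "moon" and "frost" are distributed across the categories in nearly the same proportions, so their cosine in the reduced space is higher than that of "moon" and "sword". Applying the same cosine to the columns of the decomposition would compare categories instead of words.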