Abstract: Given a search query, most existing search engines simply return a ranked list of search results. However, those results are often a mixture of documents, each closely related to a different sub-topic of the query. This paper proposes a framework for categorizing blog posts according to their sub-topics. In our framework, the sub-topic of each blog post is identified by using Wikipedia entries as a knowledge source, with each Wikipedia entry title serving as a sub-topic label. This makes it possible to quickly get an overview of the distribution of sub-topics over the entire collection of blog posts.