Journal title: International Journal of Information Technology and Computer Science
Print ISSN: 2074-9007
Online ISSN: 2074-9015
Publication year: 2022
Volume: 14
Issue: 3
DOI: 10.5815/ijitcs.2022.03.02
Language: English
Publisher: MECS Publisher
Abstract: The growth of microblogging sites and of biomedical, defect, or bug databases makes it difficult for web users to share and express the contextual identification of sequential key phrases and their categories in text clustering applications. In traditional document classification and clustering models, the features associated with TREC texts are complex to analyze. Finding relevant feature-based key phrase patterns in a large collection of unstructured documents becomes increasingly difficult as the repository grows. The purpose of this study is to develop and implement a new hierarchical document clustering framework on a large TREC data repository. A document feature selection and clustering model is used to identify and extract MeSH-related documents from TREC biomedical clinical benchmark datasets. Experimental results demonstrate the efficiency of the proposed model in terms of computational memory, accuracy, and error rate.
Keywords: Similarity; Retrieval; Clustering and Classification; Hierarchical Methods; Phrase Patterns