Journal: Journal of Emerging Trends in Computing and Information Sciences
Electronic ISSN: 2079-8407
Publication year: 2014
Volume: 5
Issue: 8
Pages: 588-595
Publisher: ARPN Publishers
Abstract: Managing digital documents has become a time-consuming process because of their sheer volume. Most users manage their personal documents by creating logical, hierarchical folder structures, and the resulting structure depends on the user’s assessment of each document’s context. The basic hierarchical file structure has not changed for decades, yet the use of digital documents has surged. The scale of that use has led to information overload: users struggle to process information already encoded and stored on computers in order to answer information requests on demand. Document indexing facilitates fast retrieval. Indexing can be field-based, full-text, or a combination of both. Field-based indexing focuses on identifying and encoding key terms that identify a document as uniquely as possible. Automatic field-based indexers do not build semantic relationships between the terms and the document, so the process is augmented by manual indexing for better results, which in turn consumes time. In contrast, full-text indexing stores the entire document in a database, and users can search it using any term in the text. Full-text indexing often overloads the database and makes retrieval inefficient. This research presents an implementation of a tool that manages semi-structured file collections based on Formal Concept Analysis (FCA). FCA has been applied in document retrieval to identify a coherent set of terms that best classifies documents in different contexts. The tool considers both the context of a document and the user's behavior.
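
To illustrate the Formal Concept Analysis idea mentioned in the abstract, the following minimal Python sketch derives formal concepts (pairs of a document set and the term set those documents share) from a toy document-term context. The file names, terms, and function names are hypothetical and are not taken from the tool described in the paper; the brute-force enumeration is an assumption made for brevity and is only practical for very small collections.

```python
from itertools import combinations

# Hypothetical toy context: each document maps to its set of index terms.
CONTEXT = {
    "report.doc": {"budget", "2014"},
    "invoice.pdf": {"budget", "client"},
    "notes.txt": {"client"},
}


def common_terms(docs, context):
    """Return the terms shared by every document in `docs`.

    For an empty `docs`, this is the full term set of the context.
    """
    shared = set().union(*context.values())
    for doc in docs:
        shared &= context[doc]
    return shared


def docs_with(terms, context):
    """Return the documents whose term sets contain all of `terms`."""
    return {doc for doc, doc_terms in context.items() if terms <= doc_terms}


def formal_concepts(context):
    """Enumerate all formal concepts (extent, intent) by brute force.

    Every subset of documents is closed: its shared terms form the intent,
    and all documents carrying that intent form the extent.
    """
    concepts = set()
    docs = sorted(context)
    for size in range(len(docs) + 1):
        for subset in combinations(docs, size):
            intent = common_terms(subset, context)
            extent = docs_with(intent, context)
            concepts.add((frozenset(extent), frozenset(intent)))
    return concepts


if __name__ == "__main__":
    # Print concepts from the largest document group to the smallest.
    for extent, intent in sorted(formal_concepts(CONTEXT),
                                 key=lambda c: (-len(c[0]), sorted(c[0]))):
        print(sorted(extent), "share", sorted(intent))
```

Each concept groups the documents that share a coherent set of terms. In this toy context, for example, the term set {"client"} groups invoice.pdf and notes.txt, which is the kind of context-dependent classification the abstract attributes to FCA.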