Publisher: The Japanese Society for Artificial Intelligence
Abstract: It is not easy for a user of an information retrieval (IR) system to select a keyword set that represents his or her specific information need. Many IR systems therefore modify keyword sets by estimating the user's particular requirement. Although such systems achieve better retrieval performance, the complicated estimation process entailed by a large number of keywords makes it difficult for a user to understand how the system behaves. We therefore used a thesaurus for query expansion. To select an appropriate keyword set, we proposed two concepts: "adaptive generalization", which estimates an appropriate generalization level for the given keywords by using relevant document information, and "purpose-oriented concept structure modification", which selects relevant keywords from a predefined synonym set in a thesaurus. Because thesaurus-based query expansion aims to find new keywords that are complementary to the initial keywords, we proposed using this method to construct a Boolean query formula that represents the user's information need. We proposed a new IR system, "appropriate Boolean query reformulation for IR with adaptive generalization" (ABRIR-AG), to support Boolean query formulation. ABRIR-AG reformulates a user-given Boolean query by using a small number of relevant documents. Finally, we evaluated the effectiveness of ABRIR-AG on a large-scale test collection containing WWW documents.
Keywords: information retrieval; thesaurus; generalization; Boolean model; World Wide Web