Journal: International Journal of Advanced Computer Science and Applications (IJACSA)
Print ISSN: 2158-107X
Electronic ISSN: 2156-5570
Publication year: 2022
Volume: 13
Issue: 3
DOI: 10.14569/IJACSA.2022.0130377
Language: English
Publisher: Science and Information Society (SAI)
Abstract: Although retrieval engines are becoming increasingly functional and efficient, they still cannot locate the relevant documentary granularity and, as a result, ignore the structural aspect of documents. In the context of XML documents, Information Retrieval Systems make it possible to return documentary granules to the user. Several studies have used graphs to represent XML documents. In this research, the structure of a semi-structured document and that of a user’s query are both viewed as arborescences composed of a hierarchy of nested elements. Using graph theory, the structural proximity between these two arborescences is calculated, in particular from their intersection. The article presents a graph-based model for structural information retrieval. A collection of multimedia documents randomly extracted from INEX (Initiative for the Evaluation of XML Retrieval) 2010 is used to validate the approach. The first results show the interest of such an approach.
Keywords: Semi-structured document; XML document; largest common sub-graph; structural information retrieval
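
The abstract describes comparing a document arborescence with a query arborescence through their intersection. The sketch below illustrates that general idea only; it is not the article's model. It assumes that each tree is reduced to a set of parent-child edges keyed by root-to-node tag paths, and that proximity is the share of query edges also found in the document. The function names `path_edges` and `structural_proximity`, the scoring ratio, and the sample XML are all hypothetical.

```python
import xml.etree.ElementTree as ET


def path_edges(xml_text):
    """Return the set of (parent_path, child_path) edges of an XML element tree.

    Assumption: nodes are identified by their root-to-node tag path, so
    repeated siblings with the same tag collapse into a single node.
    """
    root = ET.fromstring(xml_text)
    edges = set()
    stack = [(root, "/" + root.tag)]
    while stack:
        node, path = stack.pop()
        for child in node:
            child_path = path + "/" + child.tag
            edges.add((path, child_path))
            stack.append((child, child_path))
    return edges


def structural_proximity(doc_xml, query_xml):
    """Fraction of query edges that also occur in the document (illustrative measure)."""
    doc_edges = path_edges(doc_xml)
    query_edges = path_edges(query_xml)
    if not query_edges:
        return 0.0
    common = doc_edges & query_edges  # intersection of the two edge sets
    return len(common) / len(query_edges)


# Hypothetical document and query structures.
doc = "<article><title>t</title><sec><p>a</p><fig/></sec></article>"
query = "<article><sec><fig/><table/></sec></article>"
score = structural_proximity(doc, query)
print(round(score, 3))
```

Matching on rooted tag paths is a deliberate simplification: a largest-common-sub-graph search, as named in the keywords, would also have to consider matches that do not start at the root and would generally be more expensive to compute.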