
Article Information

  • Title: Content-Structure Correspondence: A Generic Representation for Heterogeneous Structured Document
  • Authors: Saravadee Sae Tan ; Enya Kong Tang
  • Journal: Procedia - Social and Behavioral Sciences
  • Print ISSN: 1877-0428
  • Publication year: 2011
  • Volume: 27
  • Pages: 226-232
  • DOI: 10.1016/j.sbspro.2011.10.602
  • Language: English
  • Publisher: Elsevier
  • Abstract: On the web, most structured document collections consist of documents from different sources, marked up with different types of structures. The diversity of structures has led to the emergence of heterogeneous structured documents. The heterogeneity of structured documents poses new challenges for document representation in structured document retrieval. The representation model needs to handle various types of structures as well as multiple structures in a single document. Furthermore, the same information may be represented in different structures, and information contained in different documents may be partial and inconsistent. Therefore, the linkage of semantically related elements in the document collections needs to be modelled in the representation model. In this paper, we introduce a generic and flexible structured document model to represent heterogeneous structured documents as well as the correspondences between similar elements in the document collections.
  • Keywords: Parsing; Subcategorization; PP attachment; Coordination attachment; Text understanding; Grammar writing