Basic Article Information

  • Title: Origin of Dynamic Correlations of Words in Written Texts
  • Authors: Hiroshi Ogura; Hiromi Amano; Masato Kondo
  • Journal: Journal of Data Analysis and Information Processing
  • Print ISSN: 2327-7211
  • Online ISSN: 2327-7203
  • Publication year: 2019
  • Volume: 7
  • Issue: 4
  • Pages: 228-249
  • DOI: 10.4236/jdaip.2019.74014
  • Publisher: Scientific Research Publishing
  • Abstract: In a previous study, we introduced dynamical aspects of written texts by regarding the serial sentence number, from the first to the last sentence of a given text, as discretized time. Using this definition of a textual timeline, we defined an autocorrelation function (ACF) for word occurrences and demonstrated its utility both for representing dynamic word correlations and for measuring word importance within the text. In this study, we seek a stochastic process governing the occurrences of a given word that has strong dynamic correlations. This is valuable because words exhibiting strong dynamic correlations play a central role in developing or organizing textual contexts. While seeking this stochastic process, we find that additive binary Markov chain theory is useful for describing strong dynamic word correlations, in the sense that it can reproduce characteristics of the autocovariance functions (an unnormalized version of ACFs) observed in actual written texts. Using this theory, we propose a model for the time-varying probability of word occurrence in each sentence of a text. The proposed model considers hierarchical document structures such as chapters, sections, subsections, paragraphs, and sentences. Because such a hierarchical structure is common to most documents, our model of word occurrence probability is broadly applicable to interpreting dynamic word correlations in actual written texts. The main contributions of this study are, therefore, demonstrating the usability of additive binary Markov chain theory for analyzing dynamic correlations in written texts and offering a new model of word occurrence probability that takes the common hierarchical structure of documents into account.
  • Keywords: Autocorrelation Function; Autocovariance Function; Word Occurrence; Stochastic Process; Additive Binary Markov Chain
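
The abstract treats the sentence number as discretized time and measures dynamic word correlations with an autocovariance function and its normalized form, the ACF. The sketch below illustrates that idea under stated assumptions: it uses the standard sample autocovariance of a binary occurrence series, which may differ in normalization from the definition in the paper, and it uses naive whitespace tokenization. The function names `occurrence_series`, `autocovariance`, and `autocorrelation` are hypothetical and do not come from the article.

```python
from typing import List


def occurrence_series(sentences: List[str], word: str) -> List[int]:
    """Binary series over sentence index: 1 if `word` occurs in the sentence, else 0.

    Assumption: case-insensitive match on whitespace-separated tokens
    (punctuation is not stripped).
    """
    target = word.lower()
    return [1 if target in s.lower().split() else 0 for s in sentences]


def autocovariance(x: List[int], lag: int) -> float:
    """Sample autocovariance at `lag`, averaged over the n - lag available pairs.

    Assumption: this is the textbook estimator, not necessarily the paper's.
    """
    n = len(x)
    if not 0 <= lag < n:
        raise ValueError("lag must satisfy 0 <= lag < len(x)")
    mean = sum(x) / n
    pairs = n - lag
    return sum((x[i] - mean) * (x[i + lag] - mean) for i in range(pairs)) / pairs


def autocorrelation(x: List[int], lag: int) -> float:
    """Autocovariance normalized by its lag-0 value (the variance)."""
    c0 = autocovariance(x, 0)
    if c0 == 0:
        raise ValueError("series is constant; the ACF is undefined")
    return autocovariance(x, lag) / c0


if __name__ == "__main__":
    # Toy text: "cat" appears in every other sentence.
    sentences = ["the cat sat", "a dog ran", "the cat slept", "a bird flew"]
    series = occurrence_series(sentences, "cat")
    for lag in range(3):
        print(lag, autocorrelation(series, lag))
```

In this toy text the word alternates between present and absent, so the ACF is negative at lag 1 and positive at lag 2. A word with strong dynamic correlations in the sense of the abstract would instead show an ACF that stays positive over many lags.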