Article Information

  • Title: Automatically Collecting and Monitoring Japanese Weblogs
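  • Authors: Tomoyuki NANNO ; Yasuhiro SUZUKI ; Toshiaki FUJIKI
  • Journal: 人工知能学会論文誌 (Transactions of the Japanese Society for Artificial Intelligence)
  • Print ISSN: 1346-0714
  • Online ISSN: 1346-8030
  • Publication year: 2004
  • Volume: 19
  • Issue: 6
  • Pages: 511-520
  • DOI: 10.1527/tjsai.19.511
  • Publisher: The Japanese Society for Artificial Intelligence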
  • Abstract: Weblogs (blogs) are now regarded as a potentially useful information source. Although there is no precise definition of a blog, blogs are generally understood to be personal web pages written by a single individual and made up of a sequence of dated entries recording the author's thoughts, arranged chronologically. In Japan, people were writing 'diaries' on the web long before blog software became available. These web diaries are quite similar to blogs in content, and people still write them without any blog software. As we will show, hand-edited blogs are quite numerous in Japan, even though most people now think of blogs as pages published with one of the variants of public-domain blog software. It is therefore quite difficult to collect Japanese blogs exhaustively, that is, to collect both blogs made with blog software and web diaries written as ordinary web pages. Motivated by this, we present a system that automatically collects and monitors Japanese blogs, including not only those made with blog software but also those written as ordinary web pages. Our approach is based on extracting date expressions and analyzing HTML documents, so that it does not depend on specific blog software, RSS, or a ping server.
  • Keywords: weblog ; blog ; document analysis ; monitoring
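
The abstract names date-expression extraction and HTML analysis as the basis of the approach but gives no further detail. The following is only a minimal sketch of what such a check might look like, not the authors' method: the date patterns in `DATE_RE`, the helper names `extract_dates` and `looks_like_diary`, the `min_dates` threshold, and the "dates appear in monotonic order" heuristic are all assumptions made for illustration.

```python
import re
from datetime import date
from html.parser import HTMLParser

# Assumed date formats (not from the paper): 2004年6月1日, 2004/06/01,
# 2004-06-01, 2004.6.1
DATE_RE = re.compile(
    r"(\d{4})\s*(?:年|[/\-.])\s*(\d{1,2})\s*(?:月|[/\-.])\s*(\d{1,2})\s*日?"
)


class TextCollector(HTMLParser):
    """Collects visible text, skipping <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.chunks.append(data)


def extract_dates(html):
    """Return the valid calendar dates found in the page text, in page order."""
    parser = TextCollector()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.chunks)
    found = []
    for y, m, d in DATE_RE.findall(text):
        try:
            found.append(date(int(y), int(m), int(d)))
        except ValueError:
            pass  # e.g. month 13: matches the pattern but is not a real date
    return found


def looks_like_diary(html, min_dates=3):
    """Heuristic: enough dates, all in ascending or all in descending order."""
    dates = extract_dates(html)
    if len(dates) < min_dates:
        return False
    pairs = list(zip(dates, dates[1:]))
    return all(a <= b for a, b in pairs) or all(a >= b for a, b in pairs)
```

Because a check of this kind reads only the page text, it would apply equally to hand-edited diary pages and to pages produced by blog software, which is the independence from RSS and ping servers that the abstract emphasizes. A real system would need far more robust date handling and page analysis than this sketch shows.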