Article Information

  • Title: Detecting a Multi-Level Content Similarity from Microblogs based on Community Structures and Named Entities
  • Authors: Phuvipadawat, Swit; Murata, Tsuyoshi
  • Journal: Journal of Emerging Technologies in Web Intelligence
  • Print ISSN: 1798-0461
  • Year of publication: 2011
  • Volume: 3
  • Issue: 1
  • Pages: 11-19
  • DOI: 10.4304/jetwi.3.1.11-19
  • Language: English
  • Publisher: Academy Publisher
  • Abstract: This paper presents a method for finding content similarity in microblogs. In particular, we process data from Twitter for a breaking news detection and tracking application. The goal is to find collections of similar messages. The method produces two levels of collections. In the first level, similarity is defined by TF-IDF. Because microblog messages are short, we emphasize specific terms called named entities. This level yields message groups. In the second level, we construct a network from the message groups and named entities and perform community detection. We evaluate and visualize the community results obtained with several community detection algorithms. We demonstrate that this method can be used to explore similar messages, with results that are both tightly and loosely coupled.
  • Keywords: Twitter; Topic Detection and Tracking; Information Retrieval; Network Analysis
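
The first level described in the abstract (TF-IDF similarity with extra emphasis on named entities) can be illustrated with a minimal sketch. This is not the authors' implementation: the boost factor, the similarity threshold, the greedy grouping rule, the whitespace tokenization, and the hand-listed entity set are all assumptions made for illustration, and the second-level community detection is not covered.

```python
import math
from collections import Counter


def tfidf_vectors(docs, entities, boost=2.0):
    """Return one sparse TF-IDF vector (a dict) per tokenized message.

    Terms found in `entities` have their weight multiplied by `boost`.
    Both the boost value and the smoothed IDF formula are assumptions,
    not values taken from the paper.
    """
    n = len(docs)
    # Document frequency: number of messages containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vec = {}
        for term, count in tf.items():
            idf = math.log((1 + n) / (1 + df[term])) + 1.0
            weight = (count / len(doc)) * idf
            if term in entities:
                weight *= boost  # emphasize named entities
            vec[term] = weight
        vectors.append(vec)
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def group_messages(vectors, threshold=0.3):
    """Greedy grouping (an assumed rule): a message joins the first group
    whose seed message is at least `threshold` similar, else starts a new one.
    """
    groups = []  # each group is a list of message indices; index 0 is the seed
    for i, vec in enumerate(vectors):
        for group in groups:
            if cosine(vec, vectors[group[0]]) >= threshold:
                group.append(i)
                break
        else:
            groups.append([i])
    return groups


# Hypothetical messages and a hand-picked entity set, for illustration only.
messages = [
    "earthquake hits tokyo today",
    "strong earthquake in tokyo",
    "new phone released by apple",
    "apple unveils new phone",
]
docs = [m.split() for m in messages]
entities = {"tokyo", "apple"}

groups = group_messages(tfidf_vectors(docs, entities))
print(groups)
```

Raising `boost` makes messages that share a named entity more likely to fall into the same group even when their remaining words differ, which is the effect the abstract attributes to emphasizing named entities in short texts.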