Publisher: The Japanese Society for Artificial Intelligence
Abstract: The link structure of the Web is generally represented by the webgraph, which is often used for web structure mining, a task that mainly aims to find hidden communities on the Web. In this paper, we identify a common frequent substructure, give it a formal graph definition under the name isolated star (i-star), and propose an efficient algorithm for enumerating i-stars. We then investigate the structure of the Web by enumerating i-stars in real web data. We observed that most i-stars correspond to index structures within single domains, while some were verified to be candidates for communities, which suggests that i-stars are a useful substructure for web structure mining and link spam detection. We also observed that the distributions of i-star sizes follow a power law, which provides new evidence of the scale-freeness of the webgraph.
Keywords: isolated star; link analysis; scale-freeness; webgraph; web structure mining