Publisher: The Japanese Society for Artificial Intelligence
Abstract: When users look for information about a person in the results of a Web people search, they often have to browse many of the retrieved Web pages and sift through a great deal of unnecessary information. This task is time-consuming and makes it harder to understand the person in question. We investigate a method that integrates the useful information obtained from Web pages and displays it in a form that helps users understand a person. We focus on the curriculum vitae, a format widely used for this purpose. We propose a method that extracts event sentences from Web pages and displays them in the style of a curriculum vitae. An event sentence contains both a time expression and an event related to a person. Our method consists of four steps: (1) extracting event sentences using heuristics and filtering them, (2) judging whether each event sentence is related to the designated person, mainly by using patterns of HTML tags, (3) classifying the sentences into categories with an SVM, and (4) clustering event sentences that share the same time and event. Experimental results demonstrated the usefulness of the proposed method.
Keywords: curriculum vitae; web people search; information extraction; classification; clustering
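
The abstract does not spell out the heuristics it uses, so the following is only a minimal sketch of the general idea behind steps (1) and (4): pick out sentences that contain a time expression, then group those that share the same time into a CV-like listing. The year regex, the sentence splitter, and the function names `extract_event_sentences`, `group_by_year`, and `format_cv` are all assumptions made for illustration, not the paper's implementation. The person-relatedness check based on HTML tag patterns and the SVM classification are omitted.

```python
import re
from collections import defaultdict

# Assumption: a four-digit year (1900-2099) stands in for a "time expression".
YEAR_RE = re.compile(r"\b(19\d{2}|20\d{2})\b")


def extract_event_sentences(text):
    """Return (year, sentence) pairs for sentences that mention a year."""
    # Naive sentence split on terminal punctuation followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    events = []
    for sentence in sentences:
        match = YEAR_RE.search(sentence)
        if match:
            events.append((int(match.group(1)), sentence))
    return events


def group_by_year(events):
    """Group sentences by year, dropping exact duplicate sentences."""
    grouped = defaultdict(list)
    for year, sentence in events:
        if sentence not in grouped[year]:
            grouped[year].append(sentence)
    # Sort by year so the output reads chronologically, like a CV.
    return dict(sorted(grouped.items()))


def format_cv(grouped):
    """Render the grouped events as 'YEAR: sentence' lines."""
    return [f"{year}: {s}" for year, sentences in grouped.items() for s in sentences]
```

For example, `format_cv(group_by_year(extract_event_sentences(page_text)))` yields one chronological line per distinct event sentence. Grouping on the year alone is a simplification: the abstract describes clustering on both the time and the event, which would require a sentence-similarity measure in addition to the time key.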