出版社:The Japanese Society for Artificial Intelligence
摘要:Recently, web pages for mobile devices are widely spread on the Internet and a lot of people can access web pages through search engines by mobile devices as well as personal computers. A summary of a retrieved web page is important because the people judge whether or not the page would be relevant to their information need according to the summary. In particular, the summary must be not only compact but also grammatical and meaningful when the users retrieve information using a mobile phone with a small screen. Most search engines seem to produce a snippet based on the keyword-in-context (KWIC) method. However, this simple method could not generate a refined summary suitable for mobile phones because of low grammaticality and content overlap with the page title. We propose a more suitable method to generate a snippet for mobile devices using sentence extraction and sentence compression methods. First, sentences are biased based on whether they include the query terms from the users or words that are relevant to the queries, as well as whether they do not overlap with the page title based on maximal marginal relevance (MMR). Second, the selected sentences are compressed based on their phrase coverage, which is measured by the scores of words, and their phrase connection probability measured based on the language model, according to the dependency structure converted from the sentence. The experimental results reveal the proposed method outperformed the KWIC method in terms of relevance judgment, grammaticality, non-redundancy and content coverage.