Abstract: Automatic text summarization reduces a text document or a
larger corpus of multiple documents to a short set of sentences or paragraphs that
convey the main meaning of the text. In this paper, we discuss
multi-document summarization, which differs from single-document summarization in that the
issues of compression, speed, redundancy and passage selection are critical in
the formation of useful summaries. Because the number and variety of online
medical news articles make it difficult for experts in the medical field to
read them all, automatic multi-document summarization can make
information on the web easier to study. Hence we propose a new approach
based on AdaBoost, a machine learning meta-learning algorithm, which we apply
to summarization. We treat a document as a set of sentences, and the
learning algorithm must learn to classify sentences as positive or negative
examples based on their scores. For this learning task, we
apply the AdaBoost meta-learning algorithm with a C4.5 decision tree as the
base learner. In our experiments, we use 450 news articles downloaded from
different medical websites, and we compare our results with some
existing approaches.
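
The sentence-classification setup described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the implementation used in the paper: one-level decision stumps stand in for the C4.5 base learner to keep the example short, and the two sentence features (position and term overlap), the score cutoff of 0.5, the toy data and all function names are hypothetical.

```python
import math

def train_stump(X, y, w):
    """Return (weighted_error, feature, threshold, sign) of the best threshold rule."""
    best = None
    for f in range(len(X[0])):
        for thr in sorted({row[f] for row in X}):
            for sign in (1, -1):
                preds = [sign if row[f] >= thr else -sign for row in X]
                err = sum(wi for wi, p, yi in zip(w, preds, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, f, thr, sign)
    return best

def stump_predict(stump, row):
    _, f, thr, sign = stump
    return sign if row[f] >= thr else -sign

def adaboost(X, y, rounds=5):
    """Discrete AdaBoost: reweight examples so later learners focus on mistakes."""
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        stump = train_stump(X, y, w)
        # Clamp the error so the logarithm below stays finite.
        err = min(max(stump[0], 1e-10), 1 - 1e-10)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, stump))
        # Misclassified sentences gain weight, correct ones lose weight.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for wi, row, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def predict(ensemble, row):
    """Weighted vote of all weak learners: +1 = summary sentence, -1 = not."""
    vote = sum(alpha * stump_predict(stump, row) for alpha, stump in ensemble)
    return 1 if vote >= 0 else -1

# Hypothetical features per sentence: [position_score, term_overlap].
X = [[0.9, 0.8], [0.8, 0.3], [0.2, 0.9], [0.1, 0.1], [0.7, 0.7], [0.3, 0.2]]
# Hypothetical sentence scores; sentences at or above the assumed cutoff are positive.
scores = [0.9, 0.2, 0.8, 0.1, 0.7, 0.3]
y = [1 if s >= 0.5 else -1 for s in scores]

ensemble = adaboost(X, y)
summary_ids = [i for i, row in enumerate(X) if predict(ensemble, row) == 1]
print(summary_ids)
```

Sentences predicted as positive would then be the candidates for the summary. Replacing the stump with a full decision tree changes only `train_stump` and `stump_predict`; the reweighting loop stays the same.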