Journal: Journal of Theoretical and Applied Computer Science
Print ISSN: 2299-2634
Electronic ISSN: 2300-5653
Publication year: 2012
Volume: 6
Issue: 4
Pages: 7-23
Publisher: Polska Akademia Nauk, Oddzial w Gdansku, Komisja Informatyki (Polish Academy of Sciences, Gdansk Branch, Computer Science Commission)
Abstract: An investigation into the extraction of useful information from the free text element of questionnaires,
using a semi-automated summarisation extraction technique, is described. The summarisation
technique utilises the concept of classification but with the support of domain/human experts
during classifier construction. A realisation of the proposed technique, SARSET (Semi-Automated
Rule Summarisation Extraction Tool), is presented and evaluated using real questionnaire data.
The results of this evaluation are compared against the results obtained using two alternative techniques
to build text summarisation classifiers. The first of these uses standard rule-based classifier
generators, and the second is founded on the concept of building classifiers using secondary data.
The results demonstrate that the proposed semi-automated approach outperforms the other two
approaches considered.
Keywords: questionnaire data mining; text summarisation; text classification